Re: Vector Task Group minutes 2020/5/15
that may be
On 2020-05-19 7:14 p.m., Bill Huffman wrote:
I believe it is provably not possible for our vectors to have more than
For me, the definitions contained in 1, 2 and 3 need to be more rigorously defined before I can agree that the constraints/behaviours described are provably inconsistent in aggregate.
This one is particularly difficult for me to formalize.
When vl = vlen * lmul (for lmul 2, 4 or 8), cache lines can be requested in an order such that, when they arrive, the corresponding segments can be filled.
So, I'm not sure if the focus here is an efficiency concern?
3. The operation corresponding to storing a register group at one
What we can definitely prove is that a specific design has specific characteristics and eliminates other characteristics.
I agree that the current design has the characteristics you describe.
However, for #3, it appears to me that a facility that clusters elements smaller than a certain size still allows behaviours 1 and 2.
Further, for element lengths up to that cluster size, in-register order matches the in-memory order.
I also agree reconsidering #2
It would cost additional logic in wide, sliced datapaths to keep up
Here, I believe, is where you introduce efficacy in implementation.
Once implementation design considerations are introduced, the proof becomes much more complex: it is compounded by objectives and technical tradeoffs, and is less an issue of mathematical rigor.
But the damage might be less than requiring casts and
I also agree with you that reconsidering #2 can lead to conceptually simpler designs that perhaps will result in less ecosystem fragmentation.
However, anticipating a community's response to even the smallest of changes is crystal ball material.
There are many variations and approaches still open to us to address in-register and in-memory order agreement, and to address widening approaches (in particular, interleaving or striping with generalized SLEN parameters).
I'm still waiting on the proposed casting details. If that resolves all our concerns, great.
In the interim, I believe it may be a worthwhile exercise to consider equivalences of functionality.
Specifically, vertical striping vs horizontal interleave for widening ops, and in-register vs in-memory order for element width alignment.
I hope that the more equivalences we identify, the easier it will be to compare them and evaluate trade-offs.
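As a rough illustration of the kind of equivalence check meant here, the sketch below compares a plain packed (in-memory order) layout of a register group against one hypothetical SLEN-striped layout. The striping rule used (SLEN-wide chunks assigned round-robin across the registers of the group), the function names, and the parameter values are all assumptions made for illustration; they are not taken from the draft spec or from any proposal in this thread.

```python
def packed_location(i, sew, vlen):
    """(register in group, bit offset) when elements are packed in memory order."""
    per_reg = vlen // sew
    return i // per_reg, (i % per_reg) * sew


def striped_location(i, sew, vlen, slen, lmul):
    """(register in group, bit offset) under a HYPOTHETICAL striping rule:
    consecutive SLEN-wide chunks of elements rotate round-robin across
    the lmul registers of the group."""
    per_chunk = slen // sew           # elements held by one SLEN-wide chunk
    chunk = i // per_chunk            # chunk index in element order
    reg = chunk % lmul                # round-robin across the group
    chunk_in_reg = chunk // lmul      # position of that chunk inside its register
    if chunk_in_reg >= vlen // slen:
        raise ValueError("element index exceeds register group capacity")
    return reg, chunk_in_reg * slen + (i % per_chunk) * sew


def layouts_agree(sew, vlen, slen, lmul):
    """True when the striped layout coincides with the packed layout."""
    n = vlen * lmul // sew            # elements in the whole group
    return all(
        striped_location(i, sew, vlen, slen, lmul) == packed_location(i, sew, vlen)
        for i in range(n)
    )
```

Under this assumed rule, `layouts_agree(32, 256, 256, 2)` holds because SLEN equals VLEN, while `layouts_agree(32, 256, 128, 2)` does not, since element 4 lands at the start of the second register instead of the upper half of the first.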
I also think it would be constructive to consider big-endian vs little-endian, with attention to granularity (inherent in big-endian and obscured in little-endian, though aligned vs unaligned is still relevant).
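One way to read the granularity point is sketched below. It assumes (for illustration only) that element i of width w bytes occupies register bits [8*w*i, 8*w*(i+1)), and asks where a given memory byte ends up in the register. The function name is hypothetical.

```python
def register_bit_of_byte(addr, width_bytes, big_endian):
    """Lowest register bit occupied by the memory byte at byte offset `addr`,
    assuming element i sits at register bits [8*w*i, 8*w*(i+1))."""
    elem, within = divmod(addr, width_bytes)
    # Little-endian: byte significance grows with address.
    # Big-endian: byte significance shrinks with address inside each element.
    significance = (width_bytes - 1 - within) if big_endian else within
    return 8 * (elem * width_bytes + significance)
```

With little-endian, the result is `8 * addr` for every element width, so the element granularity does not show up in the byte placement. With big-endian, the placement of the same memory byte changes with the element width, so the granularity is explicit.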