Re: make SEW be the largest element width
First thoughts below
- as a response to #424 and
- we need only consider SEW = LEW (new) and 1/2LEW (POR)
On 2020-04-19 1:00 a.m., Krste Asanovic wrote:
When completed is it LEWd?

I added my proposal to github:
https://github.com/riscv/riscv-v-spec/issues/425
appended below for those not following github

Krste

This proposal is a modification of earlier idea to add effective element width to load/store instructions to mitigate dropping fixed-width load/stores and to provide greater efficiency for mixed-width floating-point codes.

This proposal redefines SEW to be the largest element width (LEW?), and correspondingly the definition of widening/narrowing operations:

Previously a double-widening add was defined as:
    2*SEW = SEW + SEW
the new proposal is to specify
    SEW = SEW/2 + SEW/2
With the corresponding lmul = lmul/2 + lmul/2
Which biases towards fractional lmul and thus interleaved vs striped groupings, a direction I endorse.
(I expect it is no surprise that I lean towards interleave and enhancing fractional LMUL,
hopefully to the point that striped LMUL becomes a secondary mechanism,
if not obsoleted.)
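To make the two conventions concrete, here is a minimal sketch (mine, not part of Krste's proposal) of the element widths and LMUL values a double-widening add would see under each. The function name `widening_add_shapes` and the convention labels "POR" and "LEW" are hypothetical, chosen for illustration only. It assumes LMUL scales with element width in both cases, as in the lmul = lmul/2 + lmul/2 relation above.

```python
from fractions import Fraction

def widening_add_shapes(sew, lmul, convention):
    """Return ((dest_width, dest_lmul), (src_width, src_lmul)) for a
    double-widening add. Illustrative only; names are hypothetical."""
    lmul = Fraction(lmul)
    if convention == "POR":
        # SEW names the source width: 2*SEW = SEW + SEW
        return (2 * sew, 2 * lmul), (sew, lmul)
    if convention == "LEW":
        # SEW names the destination (largest) width: SEW = SEW/2 + SEW/2
        return (sew, lmul), (sew // 2, lmul / 2)
    raise ValueError(f"unknown convention: {convention}")

# Widening 16-bit sources to a 32-bit result under each convention.
por_dest, por_src = widening_add_shapes(16, 1, "POR")
lew_dest, lew_src = widening_add_shapes(32, 1, "LEW")
print("POR:", por_dest, "<-", por_src)
print("LEW:", lew_dest, "<-", lew_src)
```

In the LEW case the sources land at a fractional LMUL (1/2 here), while in the POR case the destination grows to LMUL = 2. That is the bias towards fractional LMUL I mention above.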
Thus SEW as LEW was one of my preferred options.
SEW at 1/2LEW is the current POR, and has the advantage that the majority of operations are defined and performed at this SEW level.
Conditioning data for widening ops and setting masks occurs at the 1/2LEW level.
This assumes the dominant widening operations are double and not
quad/octal. I have little reason to suspect otherwise.
Straight conversion to 1/2LEW from 1/4 and 1/8 is also important,
but I have little reason to suspect it is so important as to
justify compromising the efficiency of other operations.
Thus, I believe SEW at 1/4LEW and 1/8LEW are of little value, being much too far removed from where the real work occurs.
Once again I postulate that the best choice is determined by the program's activities, as both SEW = LEW and 1/2LEW have merit.
As a response to #424 this may be a tactical acceptance of my assertion:
However, it may be only addressing the limitations/tradeoffs inherent with POV load/store and the recently proposed packed fixed load/store and subsequent widening mechanisms. All good stuff, but not as general as my KEY POINT.
More comments to come.