Re: make SEW be the largest element width

David Horner

First thoughts below

- as a response to #424 and

- we need only consider SEW = LEW (new) and 1/2LEW (POR)


On 2020-04-19 1:00 a.m., Krste Asanovic wrote:
I added my proposal to github:

appended below for those not following github


This proposal is a modification of earlier idea to add effective
element width to load/store instructions to mitigate dropping
fixed-width load/stores and to provide greater efficiency for
mixed-width floating-point codes.

This proposal redefines SEW to be the largest element width (LEW?),
When completed is it LEWd?
and correspondingly the definition of widening/narrowing operations:

Previously a double-widening add was defined as:
           2*SEW = SEW + SEW
the new proposal is to specify
             SEW = SEW/2 + SEW/2

With the corresponding lmul = lmul/2 + lmul/2

Which biases towards fractional lmul and thus interleaved vs striped groupings, a direction I endorse.
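
To make the two conventions concrete, here is a minimal sketch of how the source and destination shapes of a double-widening add come out under each definition. The function name and tuple layout are hypothetical, for illustration only. The POR destination LMUL of 2*LMUL is assumed from the current widening rules and is not stated in the text above.

```python
from fractions import Fraction


def widening_add_shapes(sew, lmul, sew_is_largest):
    """Return ((src_ew, src_lmul), (dst_ew, dst_lmul)) for a double-widening add.

    sew            -- element width in bits as set by the vector configuration
    lmul           -- register grouping factor, possibly fractional
    sew_is_largest -- True for the proposed SEW = LEW convention,
                      False for the current POR (SEW = 1/2 LEW)
    """
    lmul = Fraction(lmul)
    if sew_is_largest:
        # Proposed: SEW names the wide result; sources are SEW/2 at LMUL/2.
        return (sew // 2, lmul / 2), (sew, lmul)
    # POR (assumed): SEW names the narrow sources; result is 2*SEW at 2*LMUL.
    return (sew, lmul), (2 * sew, 2 * lmul)


# The same 16-bit + 16-bit -> 32-bit add, described both ways.
proposed = widening_add_shapes(32, 1, sew_is_largest=True)
por = widening_add_shapes(16, Fraction(1, 2), sew_is_largest=False)
```

Both calls describe the same operation: 16-bit sources at LMUL = 1/2 and a 32-bit result at LMUL = 1. The only difference is which of the two widths the configuration names as SEW. In both cases the element-width-to-LMUL ratio is the same for sources and destination, so the element count is unchanged.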

(I expect it is no surprise that I lean towards interleave and enhancing fractional LMUL,

hopefully to the point that striped LMUL becomes a secondary mechanism, if not obsolete.)

Thus SEW as LEW was one of my preferred options.

SEW at 1/2LEW is the current POR, and has the advantage that the majority of operations are defined and performed at this SEW level.

Conditioning data for widening ops and setting masks occurs at the 1/2LEW level.

This assumes the dominant widening operations are double and not quad/octal. I have little reason to suspect otherwise.

Straight conversion to 1/2LEW from 1/4LEW and 1/8LEW is also important, but I have little reason to suspect it is so important as to compromise the efficiency of other operations.

Thus, I believe SEW at 1/4LEW and 1/8LEW are of little value, being much too far removed from where the real work occurs.

Once again I postulate that the best choice is determined by the program's activities, as both SEW = LEW and SEW = 1/2LEW have merit.

As a response to #424, this may be a tactical acceptance of my assertion:

KEY POINT: A locally optimal SEW:LMUL with exceptions is, I believe, possible.

        Scaled load/store goes a long way to achieve that by providing such exceptions in a key activity.

        However, I believe we may need another exception mechanism.

REQUEST: What I am hoping is that we can get consensus on the above KEY POINT. Then, move on to an efficient exception mechanism.

However, it may only be addressing the limitations/tradeoffs inherent in POV load/store, the recently proposed packed fixed load/store, and the subsequent widening mechanisms. All good stuff, but not as general as my KEY POINT.

More comments to come.
