Re: Vector TG meeting minutes 2020/4/03
I agree, Nick.
So here is a suggestion, not completely facetiously:
For load byte/half/word, for example when SEW = 64,
an implementation can optimize the sequence
    strided load by 1/2/4
    shift left 56/48/32
    arithmetic shift right 56/48/32
but a sign-extend byte/half/word to SEW instruction would make fusing/chaining simpler.
And these without widening.
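
As a rough scalar illustration of why the shift pair above amounts to a sign extend, here is a minimal C sketch for one SEW = 64 element. The helper names sext_by_shifts and sext_direct are made up for this example and are not from any spec. It assumes the compiler implements >> on negative signed values as an arithmetic shift and wraps on narrowing conversions, which C leaves implementation-defined but mainstream compilers do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits (8, 16 or 32) of a
 * 64-bit element using the shift-left / arithmetic-shift-right pair.
 * The upper bits of `elem` are whatever the strided load brought in. */
static int64_t sext_by_shifts(uint64_t elem, unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32);
    unsigned sh = 64u - bits;        /* 56, 48 or 32 */
    uint64_t left = elem << sh;      /* discard the unwanted upper bits */
    /* Assumption: >> on a negative int64_t is an arithmetic shift. */
    return (int64_t)left >> sh;
}

/* Hypothetical helper: what a single "sign extend byte/half/word to SEW"
 * operation would compute, written with narrowing casts. */
static int64_t sext_direct(uint64_t elem, unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32);
    switch (bits) {
    case 8:  return (int64_t)(int8_t)elem;
    case 16: return (int64_t)(int16_t)elem;
    default: return (int64_t)(int32_t)elem;
    }
}
```

Both helpers give the same result for any input element, so the two-shift sequence and a dedicated sign-extend differ only in instruction count and in how easily they fuse or chain.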
Similarly, a “pack” SEW (of byte/half/word) instruction, packing by SLEN into the appropriate LMUL = 1/8, 1/4 or 1/2, would allow the standard unit-strided store to work.
A fractional LMUL that uses interleave (rather than right-justified
SLEN chunks) would not need this pack instruction.
On 2020-04-04 8:04 p.m., Nick Knight wrote: