Re: vector strided stores when rs1=x0
I think this is a bad idea for both loads and stores. If the intent is a single load or single store, then there should be another way to do it.
Using vector loads/stores with stride=0 is one way to read/write a vector from/to a memory-mapped FIFO. (I think we also discussed a way to do ordered writes for such cases earlier, which is necessary for FIFO-based communication; I don't recall whether this was discussed around strides. If there is a special way to declare ordered writes, then I'm only concerned with using a FIFO with that mode.)
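To make the FIFO concern concrete, here is a minimal Python sketch, not a description of any real implementation. FifoPort and zero_stride_store are hypothetical names invented for illustration, and the choice of which element survives a collapsed store is arbitrary in this model.

```python
from collections import deque


class FifoPort:
    """Hypothetical memory-mapped FIFO: each write to the port pushes one entry."""

    def __init__(self):
        self.entries = deque()

    def write(self, value):
        self.entries.append(value)


def zero_stride_store(port, elements, mask, collapse):
    """Model a zero-stride vector store where every element targets one address.

    collapse=False: one write per active element. This model issues them in
    element order, which is an assumption; real accesses may be unordered.
    collapse=True: a single write of some active element. This model picks
    the last active one, but any active element would be a legal choice.
    """
    active = [v for v, m in zip(elements, mask) if m]
    if collapse:
        if active:
            port.write(active[-1])
    else:
        for v in active:
            port.write(v)


full = FifoPort()
zero_stride_store(full, [10, 20, 30, 40], [1, 1, 0, 1], collapse=False)

collapsed = FifoPort()
zero_stride_store(collapsed, [10, 20, 30, 40], [1, 1, 0, 1], collapse=True)
```

With one write per active element the FIFO receives three entries; with the collapsed store it receives only one, so the consumer on the other side would lose data.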
On Mon, Nov 9, 2020 at 8:38 AM Krste Asanovic <krste@...
Also on github as issue #595
In our earlier TG discussion in the 9/18 meeting, we were in favor of
allowing vector strided load instructions with rs1=x0 to perform fewer
memory accesses than the number of active elements. This allows
higher-performing splats of a scalar memory value into a vector.
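As a rough illustration of the splat case, the following Python sketch counts memory reads for a zero-stride load. CountingMemory and zero_stride_load are hypothetical helpers, and the model assumes ordinary memory whose contents do not change between reads.

```python
class CountingMemory:
    """Hypothetical ordinary memory that counts how many reads it serves."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0

    def read(self, addr):
        self.reads += 1
        return self.data[addr]


def zero_stride_load(mem, addr, vl, collapse):
    """Return vl elements all loaded from the same address.

    collapse=True models an implementation that reads memory once and
    replicates the value; collapse=False reads once per element.
    """
    if collapse:
        value = mem.read(addr)
        return [value] * vl
    return [mem.read(addr) for _ in range(vl)]


per_element = CountingMemory({0x100: 7})
splat_slow = zero_stride_load(per_element, 0x100, 4, collapse=False)

single_read = CountingMemory({0x100: 7})
splat_fast = zero_stride_load(single_read, 0x100, 4, collapse=True)
```

Both calls produce the same vector; only the number of memory accesses differs, which is where the performance benefit would come from.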
In writing this up, I inadvertently made this true for stores too.
But on review, I can't see a reason not to also allow strided stores
(which are now unordered) to perform fewer memory operations (in
effect, picking a random active element to write back). The behavior
is indistinguishable from a possible legal execution of the prior scheme,
and has a potential niche use of storing an element value to memory when
it is known that all elements have the same value.
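The indistinguishability argument can be checked with a small Python sketch for ordinary (non-FIFO) memory. The function names are hypothetical, and the model assumes that only the final value at the address is observable.

```python
from itertools import permutations


def final_values_unordered(active_values):
    """Final memory values reachable when every active element is written
    to the same address in some arbitrary order (the last write wins)."""
    return {order[-1] for order in permutations(active_values) if order}


def final_values_collapsed(active_values):
    """Final memory values reachable when the implementation writes back
    just one active element, chosen arbitrarily."""
    return set(active_values)


values = [3, 5, 7]
unordered_outcomes = final_values_unordered(values)
collapsed_outcomes = final_values_collapsed(values)
```

Under this model the two sets of outcomes coincide, so a single collapsed write cannot be told apart from some legal ordering of the per-element writes. The equivalence would not hold for a location where each write has a side effect.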
I suppose we could also reserve the encoding with strided stores of
rs1=x0, but this would add some asymmetry. Software could then get a
similar effect by setting vl=1 before the store.