Re: Vector Memory Ordering
David Horner
On 2020-09-04 5:53 p.m., Andrew Waterman wrote:
Just a quick note here: if Ztso is active, then "truly in order" has global ordering too. Much of the discussion was driven by the difficulty of runtime overlap detection and the challenge of ensuring "correct" function.
The thinking was that 1) such a restriction would be addressed by the application falling back to vl=1 iterations, and 2) such a restriction could be relaxed later, which defers addressing all permutations such as this. The thinking was that misaligned stride would also be restricted, with the same fallback.
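To make the vl=1 fallback concrete, here is a minimal sketch in plain C. It only models the idea and is not RVV code: `ranges_overlap` and `inc_copy` are hypothetical names, and the "vector" group is simulated by loading all elements of a group before storing any of them. When the source and destination byte ranges overlap, the loop drops to one element per iteration, which reproduces the scalar element-by-element result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if byte ranges [a, a+an) and [b, b+bn) overlap. */
static int ranges_overlap(const void *a, size_t an, const void *b, size_t bn)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb + bn && pb < pa + an;
}

/*
 * Hypothetical model of a vector loop computing dst[i] = src[i] + 1.
 * Each group loads up to vl elements, then stores them, so a group
 * behaves like one vector load followed by one vector store.
 * Returns the number of groups executed.
 */
static size_t inc_copy(int32_t *dst, const int32_t *src, size_t n, size_t vlmax)
{
    assert(vlmax > 0 && vlmax <= 8);

    /* Runtime overlap check: on overlap, fall back to vl=1 iterations. */
    size_t vl_cap = ranges_overlap(dst, n * sizeof *dst, src, n * sizeof *src)
                        ? 1 : vlmax;
    size_t groups = 0;

    for (size_t i = 0; i < n; ) {
        size_t vl = (n - i < vl_cap) ? n - i : vl_cap;
        int32_t tmp[8];

        memcpy(tmp, src + i, vl * sizeof tmp[0]);   /* modelled vector load  */
        for (size_t j = 0; j < vl; j++)
            tmp[j] += 1;
        memcpy(dst + i, tmp, vl * sizeof tmp[0]);   /* modelled vector store */

        i += vl;
        groups++;
    }
    return groups;
}
```

With disjoint buffers the loop runs in full-width groups. With `dst == src + 1` each store feeds the next load, so only the vl=1 path matches what a scalar loop would produce.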
I believe misaligned stride could be very valuable for load, less valuable for store. Would you please elaborate on the good reason(s) for misaligned stride?
I am in favour of arbitrary reordering, but I'd prefer to refer to it as parallel execution; memory ordering is a distinct issue and is often conflated with vector element processing sequences.

Yes. It was noted that this is already allowed by some instructions, and explicitly by the relaxed (point 4) definition of precise vector traps.

Potentially, so as to avoid the situation found in other instructions where vrestart is desired but costly, difficult or impossible to implement without the restriction.