Re: RISC-V Vector Extension post-public review updates

Bruce Hoult

On Thu, Nov 18, 2021 at 10:33 AM Bill Huffman <huffman@...> wrote:



From: Bruce Hoult <bruce@...>
Sent: Wednesday, November 17, 2021 4:24 PM
To: Krste Asanovic <krste@...>
Cc: Bill Huffman <huffman@...>; Grigorios Magklis <grigorios.magklis@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] RISC-V Vector Extension post-public review updates



Don't forget some code may want to use a mask in inverted sense for individual instructions, without explicitly creating a new mask. This was not listed in the "wish list for 64 bits" below, but it was in early RVV drafts.


Yes, that needs to be considered as well.


I'm not sure how common that really is, and non-store uses can usually just use a vmerge.vvm at the end anyway, at the expense of possibly using extra registers.
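
To make the merge workaround concrete, here is a rough scalar model in C of the idea: do the operation unmasked into a temporary, then merge so that only the elements where the mask is clear get the new value. The function names, MAX_VL, and the byte-per-element mask layout are all made up for this sketch; it only models the element-wise behaviour, not real vector code.

```c
#include <stddef.h>
#include <stdint.h>

enum { MAX_VL = 64 };  /* hypothetical upper bound on vl for this sketch */

/* Models vmerge.vvm vd, vs2, vs1, v0: vd[i] = mask[i] ? vs1[i] : vs2[i]. */
void merge_vvm(uint32_t *vd, const uint32_t *vs2, const uint32_t *vs1,
               const uint8_t *mask, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        vd[i] = mask[i] ? vs1[i] : vs2[i];
}

/* vd[i] = a[i] + b[i] wherever mask[i] is CLEAR; other elements keep
   their old value. tmp plays the role of the extra vector register. */
void add_under_inverted_mask(uint32_t *vd, const uint32_t *a,
                             const uint32_t *b, const uint8_t *mask,
                             size_t vl)
{
    uint32_t tmp[MAX_VL];
    if (vl > MAX_VL)
        vl = MAX_VL;                   /* sketch only: clamp to the bound */
    for (size_t i = 0; i < vl; i++)
        tmp[i] = a[i] + b[i];          /* unmasked add into the temporary */
    merge_vvm(vd, tmp, vd, mask, vl);  /* mask set: old vd; clear: tmp */
}
```

Note this only works for operations whose unmasked form has no side effects, which is why stores are the awkward case.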


While on the subject of future features, and somewhat related ... the one big thing I've noticed RVV lacking that SVE has is a non-faulting version of indexed loads ("gather") which creates a mask showing which elements were accessible. In SVE this goes into a CSR which can then be moved into a mask register, but of course with sufficient encoding bits you could directly put it into a normal register.


Traditional vector code doesn't really need this, but SVE has an aim to be able to vectorise all loops.


How does this contribute to vectorizing all loops?

Because otherwise you can't safely vectorise loops that do indirect array accesses (e.g. a[b[i]]) with data-dependent control flow.
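
As an illustration of the kind of loop meant here, a minimal C sketch (the function name and arguments are invented for the example): the exit condition depends on the gathered value, so entries of b past the exit point may hold indices the scalar loop never dereferences.

```c
#include <stddef.h>

/* Returns the first i with a[b[i]] == key, or -1 if there is none.
   Once a match is found, later entries of b are never used as indices,
   so they are allowed to be out of range or otherwise invalid. */
ptrdiff_t find_indirect(const int *a, const size_t *b, size_t n, int key)
{
    for (size_t i = 0; i < n; i++) {
        if (a[b[i]] == key)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

A vectorised version that gathers several a[b[i]] at once would touch elements past the exit point, and an ordinary gather could then fault on an access the scalar program never makes.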

I think this was not included for security reasons rather than ignored.

I don't think there is any additional security implication.

I could be wrong, as I'm not an expert on SVE, but I believe that even if the gather operation is done (somewhat) in parallel or in random order, the instruction doesn't actually return a mask indicating all the failed accesses. All mask bits after the first element that was inaccessible are also set to false. The code that follows the gather will process all the initial elements, then invert the mask and loop back to process the tail, starting from the first inaccessible element. That element will then actually fault, unless program logic has already exited the loop or skipped it.
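
Here is a rough scalar model in C of the behaviour as I understand it, not a statement of what SVE actually specifies. Everything in it is an assumption for the sake of the sketch: "accessible" is modelled as index < a_len, a fault on element 0 is modelled by returning false, and VLEN and the function names are made up.

```c
#include <stdbool.h>
#include <stddef.h>

enum { VLEN = 4 };  /* hypothetical number of elements per vector */

/* Model of a first-fault gather. Loads a[idx[i]] in element order.
   If element 0 is inaccessible, that is a real fault (returns false).
   Otherwise ok[] is cleared from the first inaccessible element onward. */
bool gather_first_fault(int *dst, bool *ok, const int *a, size_t a_len,
                        const size_t *idx, size_t vl)
{
    bool good = true;
    for (size_t i = 0; i < vl; i++) {
        if (good && idx[i] >= a_len) {
            if (i == 0)
                return false;     /* first active element: genuine fault */
            good = false;         /* this and all later bits are cleared */
        }
        ok[i] = good;
        if (good)
            dst[i] = a[idx[i]];
    }
    return true;
}

/* Strip-mined search for the first i with a[b[i]] == key.
   *faulted is set if the scalar loop would also have faulted. */
ptrdiff_t find_indirect_ff(const int *a, size_t a_len, const size_t *b,
                           size_t n, int key, bool *faulted)
{
    size_t i = 0;
    *faulted = false;
    while (i < n) {
        size_t vl = (n - i < VLEN) ? n - i : VLEN;
        int v[VLEN];
        bool ok[VLEN];
        if (!gather_first_fault(v, ok, a, a_len, b + i, vl)) {
            *faulted = true;      /* tail restarted on a bad element */
            return -1;
        }
        size_t k = 0;
        for (; k < vl && ok[k]; k++) {
            if (v[k] == key)
                return (ptrdiff_t)(i + k);
        }
        i += k;  /* k >= 1 here; restart at the first inaccessible element */
    }
    return -1;
}
```

The point of the restart is that a bad index only faults if the loop logic really reaches it as the first element, which matches what the scalar loop would have done.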
