Vector TG Meeting tomorrow


David Horner
 

I apparently missed the meeting, which I thought was at noon Eastern.

There are, of course, the remaining open issues for v1.0.

I gather the discussion was about whether we could reasonably move to public review without finalizing all of them.

To the extent that addenda could be added, such as the table of "reserved" equivalent instructions, the usage section, and the prior art section, I agree that these could be added in parallel with the review.
The other items I mentioned, the imprecise intent wording and the explanation of the encoding decisions, both warrant a delay, I think, so that they are in place for public consumption.

On Fri, Jul 9, 2021, 11:10 mark, <markhimelstein@...> wrote:
Just to qualify, I think we are talking about RVA22 (application target) and not RVM22 (microcontroller target).

On Fri, Jul 9, 2021 at 8:07 AM Jan Wassenberg via lists.riscv.org <janwas=google.com@...> wrote:
Mentioned by Krste in the meeting: processor profile already requires VLEN >= 128.

On Fri, Jul 9, 2021 at 2:51 PM Jan Wassenberg via lists.riscv.org <janwas=google.com@...> wrote:
A topic to discuss: lower bound on VLEN.

The upper bound is helpful but even VL-agnostic code sometimes wants at least 128 bits.
Example: N parallel instances of AES (16 bytes each), or N 128-bit results from 64x64 normal or carryless multiplication.

We can get this already (assuming SEW_LMUL1MAX = 64) by setting LMUL=2, but it seems like a poor tradeoff that
software should halve the number of registers/groups, just so that hardware could theoretically have single-element vectors.

Can we mandate VLEN >= 2*SEW_LMUL1MAX, perhaps in a profile? That would help software :)

BTW, are we intending to have the same binaries work on different implementations? It seems the only way to discover SEW_LMUL1MAX
is to try various SEW/LMUL and check for vill. Because LMUL is baked into the intrinsic function name,
software that wants portable binaries would have to recompile all vector code for LMUL=1,2,4,8, and then
pick the first one that works.

That's very burdensome, a profile guaranteeing SEW_LMUL1MAX = 64 or at least LMUL2MAX = 64 would also help a lot.

On Fri, Jul 9, 2021 at 6:58 AM Krste Asanovic <krste@...> wrote:
We’ll meet tomorrow to see if there are any remaining concerns before going into public review,
Krste







mark
 

Just to qualify, I think we are talking about RVA22 (application target) and not RVM22 (microcontroller target).








Jan Wassenberg
 

Mentioned by Krste in the meeting: processor profile already requires VLEN >= 128.









Jan Wassenberg
 

A topic to discuss: lower bound on VLEN.

The upper bound is helpful, but even VL-agnostic code sometimes wants at least 128 bits.
Example: N parallel instances of AES (16 bytes each), or N 128-bit results from 64x64 normal or carryless multiplication.

We can get this already (assuming SEW_LMUL1MAX = 64) by setting LMUL=2, but it seems like a poor tradeoff that
software should halve the number of registers/groups, just so that hardware could theoretically have single-element vectors.
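
To make the tradeoff concrete, here is a minimal C sketch of the arithmetic. It assumes the 32 architectural vector registers of the V extension and integer LMUL values of 1, 2, 4 and 8. The helper names `min_lmul_for_bits` and `groups_at_lmul` are illustrative only and do not come from the spec or any intrinsics header.

```c
#include <assert.h>

/* Number of architectural vector registers in the V extension. */
enum { NUM_VREGS = 32 };

/* Smallest integer LMUL in {1, 2, 4, 8} whose register group holds at
 * least `bits` bits when each register is `vlen` bits wide.
 * Returns 0 if even LMUL=8 is too small.
 * Hypothetical helper, for illustration only. */
static unsigned min_lmul_for_bits(unsigned vlen, unsigned bits)
{
    assert(vlen > 0);
    for (unsigned lmul = 1; lmul <= 8; lmul *= 2) {
        if (vlen * lmul >= bits)
            return lmul;
    }
    return 0;
}

/* Number of usable register groups at a given integer LMUL. */
static unsigned groups_at_lmul(unsigned lmul)
{
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    return NUM_VREGS / lmul;
}
```

Under these assumptions, a 16-byte AES block on a VLEN = 64 machine needs `min_lmul_for_bits(64, 128)`, which is 2, leaving `groups_at_lmul(2)`, which is 16, register groups. With VLEN >= 128 the same block fits at LMUL=1 and all 32 registers remain available.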

Can we mandate VLEN >= 2*SEW_LMUL1MAX, perhaps in a profile? That would help software :)

BTW, do we intend the same binaries to work on different implementations? It seems the only way to discover SEW_LMUL1MAX
is to try various SEW/LMUL settings and check for vill. Because LMUL is baked into the intrinsic function name,
software that wants portable binaries would have to compile all vector code once for each of LMUL=1,2,4,8, and then
pick the first one that works.
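
The probing approach can be sketched roughly as follows. This is a toy model in plain C, not code that runs against real hardware. `vtype_probe_fn` is a hypothetical callback standing in for executing `vsetvli` with the requested SEW/LMUL and checking whether `vill` is set in `vtype`. `fake_probe` simulates one implementation under the assumption that the largest legal SEW scales with LMUL; that rule belongs to the model and is not taken from the spec.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical probe: returns true if the requested configuration is
 * legal, i.e. vill would be clear afterwards. On real hardware this
 * would be a vsetvli followed by a read of vtype. */
typedef bool (*vtype_probe_fn)(unsigned sew, unsigned lmul, void *ctx);

/* Largest SEW accepted at LMUL=1 (SEW_LMUL1MAX in the thread's terms),
 * found by trying 64, 32, 16, 8 in turn. Returns 0 if none works. */
static unsigned probe_sew_lmul1max(vtype_probe_fn probe, void *ctx)
{
    for (unsigned sew = 64; sew >= 8; sew /= 2) {
        if (probe(sew, 1, ctx))
            return sew;
    }
    return 0;
}

/* Smallest integer LMUL at which `sew` is accepted, i.e. which of the
 * separately compiled LMUL=1,2,4,8 code variants to dispatch to.
 * Returns 0 if no LMUL works. */
static unsigned first_working_lmul(vtype_probe_fn probe, void *ctx,
                                   unsigned sew)
{
    for (unsigned lmul = 1; lmul <= 8; lmul *= 2) {
        if (probe(sew, lmul, ctx))
            return lmul;
    }
    return 0;
}

/* Simulated implementation (an assumption of this sketch): accepts
 * SEW up to sew_lmul1max * LMUL, capped at 64 bits. */
struct fake_hw { unsigned sew_lmul1max; };

static bool fake_probe(unsigned sew, unsigned lmul, void *ctx)
{
    const struct fake_hw *hw = ctx;
    return sew <= 64 && sew <= hw->sew_lmul1max * lmul;
}
```

With a simulated `struct fake_hw` whose `sew_lmul1max` is 32, `probe_sew_lmul1max` reports 32 and `first_working_lmul` for SEW=64 reports 2, so the LMUL=2 build of the code would be selected. A profile that fixes SEW_LMUL1MAX = 64 would make this runtime search, and the extra builds, unnecessary.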

That's very burdensome; a profile guaranteeing SEW_LMUL1MAX = 64 or at least LMUL2MAX = 64 would also help a lot.








Krste Asanovic
 

We’ll meet tomorrow to see if there are any remaining concerns before going into public review,
Krste