Minutes of 2020/3/6 vector task group meeting


Krste Asanovic
 

Date: 2020/3/6
Task Group: Vector Extension
Chair: Krste Asanovic
Number of Attendees: ~21
Current issues on github: https://github.com/riscv/riscv-v-spec

Note, the Zoom meeting details have changed. Please view the member
calendar entry for updated details.

Issues discussed: #362, #354, #382, #367

The following issues were discussed.

#362,#354,#382 Fractional LMUL additional registers

There was considerable discussion trying to understand the design of
the scheme to pack multiple fractional LMUL registers into the base
vector registers. The conclusion was that we first had to understand
how fractional LMUL would map into the base vector registers,
including interaction with SLEN, before attempting to add fractional
LMUL register packing as another optimization with another level of
complexity.

Waiting on receiving the fractional LMUL mapping.

#367 Tail Agnostic

The discussion reviewed the proposal that long temporal vector
registers with renaming could be handled using vector length trimming.

A further proposal was that masking should also be given the option
of being agnostic, giving three options:
1) tail-undisturbed + masking-undisturbed
2) tail-agnostic + masking-undisturbed
3) tail-agnostic + masking-agnostic

All implementations would have to support all options.

Simple implementations can ignore the setting and treat all options as 1).

Option 2) could be implemented as option 1) even if option 3) was
supported.

The recommendation was that agnostic destination elements must be
either zero or undisturbed.
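
As a rough illustration of the three options, the destination update can
be modelled in a few lines of Python. This is only a sketch under stated
assumptions, not spec text: the names Policy, write_dest and
agnostic_writes_zero are hypothetical, and the model assumes the
recommendation above, that an agnostic element ends up either zero or
undisturbed.

```python
from enum import Enum


class Policy(Enum):
    # Hypothetical names for the three options in the minutes.
    TU_MU = 1  # 1) tail-undisturbed + masking-undisturbed
    TA_MU = 2  # 2) tail-agnostic + masking-undisturbed
    TA_MA = 3  # 3) tail-agnostic + masking-agnostic


def write_dest(old, result, mask, vl, policy, agnostic_writes_zero):
    """Model the destination register contents after one masked vector op.

    agnostic_writes_zero is an assumed implementation choice: True means
    agnostic elements are zeroed, False means they are left undisturbed.
    """
    out = []
    for i, (old_elem, new_elem) in enumerate(zip(old, result)):
        if i >= vl:
            # Tail element: agnostic under options 2) and 3).
            agnostic = policy in (Policy.TA_MU, Policy.TA_MA)
        elif not mask[i]:
            # Masked-off body element: agnostic only under option 3).
            agnostic = policy is Policy.TA_MA
        else:
            # Active element: always receives the computed result.
            out.append(new_elem)
            continue
        out.append(0 if (agnostic and agnostic_writes_zero) else old_elem)
    return out


old = [9] * 8
result = [1, 2, 3, 4, 5, 6, 7, 8]
mask = [True, False, True, False, True, True, True, True]

print(write_dest(old, result, mask, 4, Policy.TU_MU, True))   # [1, 9, 3, 9, 9, 9, 9, 9]
print(write_dest(old, result, mask, 4, Policy.TA_MU, True))   # [1, 9, 3, 9, 0, 0, 0, 0]
print(write_dest(old, result, mask, 4, Policy.TA_MA, True))   # [1, 0, 3, 0, 0, 0, 0, 0]
print(write_dest(old, result, mask, 4, Policy.TA_MA, False))  # [1, 9, 3, 9, 9, 9, 9, 9]
```

The last two calls request the same option but model two different
implementations. Software that later reads element 1 sees 0 on one and 9
on the other, so it is only portable if it does not depend on the
contents of agnostic elements.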

Discussion to continue with this suggestion.


Bill Huffman
 

On 3/6/20 6:22 PM, Krste Asanovic wrote:
| Date: 2020/3/6
| Task Group: Vector Extension
| Chair: Krste Asanovic
| Number of Attendees: ~21
| Current issues on github: https://github.com/riscv/riscv-v-spec
| ...
|
| #367 Tail Agnostic
|
| The discussion reviewed the proposal that long temporal vector
| registers with renaming could be handled using vector length trimming.
|
| The proposal was then added that masking should also be given option
| of being agnostic giving three options:
| 1) tail-undisturbed + masking-undisturbed
| 2) tail-agnostic + masking-undisturbed
| 3) tail-agnostic + masking-agnostic

We've discussed this before, but allowing the *-agnostic options means
code can be written and tested on an implementation that supports them
and then fail on an implementation that maps them to option 1), and
vice versa.

Bill





Krste Asanovic
 

On Sat, 7 Mar 2020 02:32:24 +0000, Bill Huffman <huffman@...> said:
| On 3/6/20 6:22 PM, Krste Asanovic wrote:
[...]
|| #367 Tail Agnostic
|| The discussion reviewed the proposal that long temporal vector
|| registers with renaming could be handled using vector length trimming.
||
|| The proposal was then added that masking should also be given option
|| of being agnostic giving three options:
|| 1) tail-undisturbed + masking-undisturbed
|| 2) tail-agnostic + masking-undisturbed
|| 3) tail-agnostic + masking-agnostic

| We've discussed before, but allowing the *-agnostic options means code
| can be written and tested on an implementation that supports them and
| then fail on an implementation that maps them to #1. And vice-versa.

Yes, this is the disadvantage of having implementation-dependent
options, along with making portable
functional/verification/compliance models more complex (though this
case is not as bad as some others).

This has to be balanced against the advantage of greater efficiency on
a wider range of microarchitectures.

Unordered vector reductions and vector stores already cause issues of
this kind, as does the memory model for multithreaded code, but
obviously we want to reduce the number of cases where this shows up.
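
To make the unordered-reduction case concrete, here is a small Python
sketch. It assumes a floating-point sum; the two orderings (in-order and
a pairwise tree) and the function names are hypothetical stand-ins for
whatever order a given implementation might choose.

```python
def sequential_sum(xs):
    """Add elements strictly in element order."""
    total = 0.0
    for x in xs:
        total += x
    return total


def tree_sum(xs):
    """Add adjacent pairs repeatedly, as a reduction tree might."""
    xs = list(xs)
    while len(xs) > 1:
        nxt = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:
            nxt.append(xs[-1])  # carry an odd leftover element forward
        xs = nxt
    return xs[0]


# 1e16 + 1.0 rounds back to 1e16 in double precision, so the small
# terms survive or vanish depending on where they meet the large ones.
data = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(data))  # 1.0
print(tree_sum(data))        # 0.0
```

Both results are legitimate sums of the same four elements, so code
tested against one ordering can see a different answer on another
implementation.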

Krste