RISC-V Vector Extension post-public review updates


Krste Asanovic
 

Apart from requests for more instructions, which can be handled with
later extensions, there were no real substantive updates.

I did notice one issue at the end of public review, however.

The current specification allows some instructions to have two vector
source operands read from the same vector register but with different
EEW. For example, a vector indexed store with the index vector and
data vector overlapping, but different EEW. Or a widening vector add
(vwadd.wv) where the two vector sources overlap but have different
EEW. This complicates implementations that internally restripe the
vector data (e.g., internal SLEN), and does not have a valid software
use (cue folks furiously trying to construct one...).
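To make the overlap condition concrete, here is a minimal sketch of a check for it. It is an illustration only, not spec text: the function name, the tuple layout, and the example register numbers are all assumptions chosen for this note. Each vector source operand is described by its base register, its EEW in bits, and the number of registers its group occupies.

```python
from collections import defaultdict


def conflicting_source_registers(sources):
    """Return the registers that are read with more than one EEW.

    `sources` is a list of (base_register, eew_bits, num_registers)
    tuples, one per vector source operand (hypothetical layout).
    """
    eews_by_reg = defaultdict(set)
    for base, eew, nregs in sources:
        # Record the EEW for every register in this operand's group.
        for reg in range(base, base + nregs):
            eews_by_reg[reg].add(eew)
    return sorted(reg for reg, eews in eews_by_reg.items() if len(eews) > 1)


# vwadd.wv v8, v2, v3 with SEW=32 and LMUL=1 (example operands):
# vs2 is the wide source (EEW=64, occupying v2-v3), vs1 is narrow (EEW=32, v3).
print(conflicting_source_registers([(2, 64, 2), (3, 32, 1)]))  # [3]

# The same instruction with vs1 moved to v4: no register is read at two EEWs.
print(conflicting_source_registers([(2, 64, 2), (4, 32, 1)]))  # []
```

A non-empty result corresponds to the case under discussion, where one register supplies data at two different element widths to a single instruction.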

The proposal is to allow implementations to raise an illegal
instruction exception in this case. I believe this is an important
and necessary change to accommodate internal striping. In practice,
this change has no impact on software.

We do have a choice of:

1) Mandate all implementations raise an illegal exception in this
case. This is my preferred route, as this would be a minor errata for
existing implementations (doesn't affect software), and we would not reuse
this state/encoding for other purposes.

2) Allow either correct execution or illegal exception (as with
misaligned).

3) Consider "reserved", implying implementations that support it are
non-conforming unless we later go with 2).

I'm assuming we're going to push to ratify 1) unless I hear strong objections.

Krste


Guy Lemieux
 

That’s great news, thanks Krste!

The current specification allows some instructions to have two vector
source operands read from the same vector register but with different
EEW. 

(cue folks furiously trying to construct one...)

haha i tried but can’t think of a use case !!

We do have a choice of:

1) Mandate all implementations raise an illegal exception in this
case.  This is my preferred route, as this would be a minor errata for
existing implementations (doesn't affect software), and we would not reuse
this state/encoding for other purposes.

2) Allow either correct execution or illegal exception (as with
misaligned). 

3) Consider "reserved", implying implementations that support it are
non-conforming unless we later go with 2).

I don’t have a strong opinion, but I prefer a route that allows us to recover those instruction encodings; they seem to be getting scarce and hence represent value. You said there were already requests for extra instructions. Would this space be useful for any of them (or other as-yet-unforeseen instructions)?

Does (3) give us the best route to reuse the encodings in the future? I’m a bit confused about the permanence of (1), and I don’t like the possibility of software fragmentation that will arise from (2).

Guy


Krste Asanovic
 

On Mon, 15 Nov 2021 13:29:58 -0800, Guy Lemieux <guy.lemieux@...> said:
| We do have a choice of:

| 1) Mandate all implementations raise an illegal exception in this
| case.  This is my preferred route, as this would be a minor errata for
| existing implementations (doesn't affect software), and we would not reuse
| this state/encoding for other purposes.

| 2) Allow either correct execution or illegal exception (as with
| misaligned). 

| 3) Consider "reserved", implying implementations that support it are
| non-conforming unless we later go with 2).

| I don’t have a strong opinion, but I prefer a route that allows us to recover those instruction encodings; they seem to be getting scarce and hence represent
| value. You said there were already requests for extra instructions. Would this space be useful for any of them (or other as-yet-unforeseen instructions)?

| Does (3) give us the best route to reuse the encodings in the future? I’m a bit confused about the permanence of (1), and I don’t like the possibility
| of software fragmentation that will arise from (2).

| Guy

The problem with this encoding space is that in some cases (i.e.,
vector indexed stores) the reserved encoding exists only for some
combinations of vtype control state + instruction bits, and in the
others it is low-quality space, e.g., where the source vector register
specifiers overlap.

So, I would assume 1) would be permanent.

As a general comment, folks seem to overweight the long-term value of
awkward corners of the encoding space, versus the short/medium-term
benefits of having clean and simple encoding. More mature
architectures tend to add cleaner extended encodings (prefix
bytes/words) rather than ramming substantial additions into awkward
unused corners.

Krste


Bill Huffman
 

I'm glad this came up. I certainly wouldn't want to try to make an implementation work for these cases. 😊

I lean a bit toward #3, not so much because we might use the space as because I think we've called all the other similar corners of opcode space that don't make sense to implement "reserved." Possibly that's because they might make sense someday and this won't.

But I'm OK with #1 or #3.

Bill



Greg Favor
 

On Mon, Nov 15, 2021 at 1:05 PM Krste Asanovic <krste@...> wrote:
1) Mandate all implementations raise an illegal exception in this
case.  This is my preferred route, as this would be a minor errata for
existing implementations (doesn't affect software), and we would not reuse
this state/encoding for other purposes.

2) Allow either correct execution or illegal exception (as with
misaligned). 

3) Consider "reserved", implying implementations that support it are
non-conforming unless we later go with 2).

I'm assuming we're going to push to ratify 1) unless I hear strong objections.

 If I understand correctly, #3 is like #1 near-term (for conforming implementations), but keeps the possibility open to later changing from "always illegal exception" to something else.

I'd vote for #1 or #3, and would be perfectly fine with #1.

Greg


andrew@...
 



On Mon, Nov 15, 2021 at 2:17 PM Bill Huffman <huffman@...> wrote:
| I'm glad this came up.  I certainly wouldn't want to try to make an implementation work for these cases.  😊

| I lean a bit toward #3, not so much because we might use the space as because I think we've called all the other similar corners of opcode space that don't make sense to implement "reserved."  Possibly that's because they might make sense someday and this won't.

I think these encodings are qualitatively different from other nooks and crannies, since their availability is a function of the dynamic vtype setting.  So we can rationalize the departure from the normal practice of marking the state reserved.

| But I'm OK with #1 or #3.

I'd personally prefer #3 due to laziness, but I don't have a technical objection to #1.

|       Bill



Krste Asanovic
 

I'm OK with going with #3. It provides the most flexibility for the
future, even if we just go with an illegal instruction exception, and
avoids adding tests to the compatibility suite.

Krste



Bill Huffman
 

 

 

From: Andrew Waterman <andrew@...>
Sent: Monday, November 15, 2021 5:32 PM
To: Bill Huffman <huffman@...>
Cc: Krste Asanovic <krste@...>; Guy Lemieux <guy.lemieux@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] RISC-V Vector Extension post-public review updates

 


I think these encodings are qualitatively different from other nooks and crannies, since their availability is a function of the dynamic vtype setting.  So we can rationalize the departure from the normal practice of marking the state reserved.

 

Other nooks and crannies are also dependent on vtype.  For example, widening instructions are not valid for LMUL large enough to make the wider operands cover more than 8 registers.
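As a rough illustration of this vtype-dependent case, the sketch below checks whether the wide operand of a widening instruction would need more than 8 registers. It assumes the wide operand's EMUL is 2*LMUL; the function name is made up for this note.

```python
from fractions import Fraction


def widening_is_reserved(lmul):
    """True if the wide operand (assumed EMUL = 2 * LMUL) exceeds 8 registers."""
    return 2 * Fraction(lmul) > 8


# Walk the LMUL settings, including a fractional one.
for lmul in (Fraction(1, 2), 1, 2, 4, 8):
    print(f"LMUL={lmul}: reserved={widening_is_reserved(lmul)}")
```

Under that assumption only LMUL=8 is flagged, which is a condition that depends on the dynamic vtype setting rather than on the instruction bits alone.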

 

      Bill

 




andrew@...
 



On Mon, Nov 15, 2021 at 2:56 PM Bill Huffman <huffman@...> wrote:

 

 


Other nooks and crannies are also dependent on vtype.  For example, widening instructions are not valid for LMUL large enough to make the wider operands cover more than 8 registers.


Valid point, I was thinking of the scalar encoding spaces.

 



Guy Lemieux
 

On Mon, Nov 15, 2021 at 2:31 PM Andrew Waterman <andrew@...> wrote:
| I think these encodings are qualitatively different from other nooks and crannies, since their availability is a function of the dynamic vtype setting.  So we can rationalize the departure from the normal practice of marking the state reserved.

Ok, this makes the opcodes virtually useless for other instructions.

Instead, shouldn't we be setting a bit similar to vill?  I realize vill is only set on illegal vset* instructions; this would be a new bit that is set only when executing an instruction that is incompatible with the current (but otherwise valid) vtype.

Guy





Krste Asanovic
 



On Nov 15, 2021, at 3:13 PM, Guy Lemieux <guy.lemieux@...> wrote:

Ok, this makes the opcodes virtually useless for other instructions.

Instead, shouldn't we be setting a bit similar to vill?  I realize vill is only set on illegal vset* instructions; in this case it would be a new bit which is only set on executing instructions that are incompatible with the current (but otherwise valid) vtype ?

Guy

There’s no benefit to setting vill versus just taking a trap in this case.

Vill is there so we don’t have to add the first trap on a write of a particular data value, and also to provide a discovery mechanism.

Krste



Craig Topper
 

On Nov 15, 2021, at 3:24 PM, Krste Asanovic <krste@...> wrote:



There’s no benefit to setting vill versus just taking a trap in this case.

Vill is there so we don’t have to add the first trap on a write of a particular data value, and also to provide a discovery mechanism.

Krste

Is it possible to generate one of these cases from C with crazy uses of vreinterpret and vget/vset intrinsics? What should the compiler do for such code?

Craig





andrew@...
 



On Mon, Nov 15, 2021 at 3:28 PM Craig Topper <craig.topper@...> wrote:
Is it possible to generate one of these cases from C with crazy uses of vreinterpret and vget/vset intrinsics? What should the compiler do for such code?

Surely possible; the fix would be to explicitly move the smaller operand to a non-overlapping register.







Krste Asanovic
 

I went with #3 in the updated text - this is the smallest delta from the frozen spec.

Note that the worst case was something like:

vfwmacc.vv v0, v0, v0, v0.t

With SEW=32b, v0 is read and written as a 64b float, read as a 32b float, and read as a vector of single-bit masks, all in the same instruction.
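The worst case can be tallied with a small sketch. Everything here is illustrative: the role labels and helper name are invented, LMUL=1 is assumed (so the 64b accumulator group spans v0-v1), and the mask is counted as EEW=1, following the "single-bit masks" description above.

```python
def eews_by_register(accesses):
    """Map each register number to a sorted list of (eew_bits, role) pairs.

    `accesses` is a list of (role, base_register, eew_bits, num_registers)
    tuples (hypothetical layout for this note).
    """
    table = {}
    for role, base, eew, nregs in accesses:
        for reg in range(base, base + nregs):
            table.setdefault(reg, []).append((eew, role))
    return {reg: sorted(entries) for reg, entries in table.items()}


SEW = 32  # bits, as in the example above; LMUL=1 is an assumption

# vfwmacc.vv v0, v0, v0, v0.t
accesses = [
    ("vd (accumulator, read/written)", 0, 2 * SEW, 2),
    ("vs1", 0, SEW, 1),
    ("vs2", 0, SEW, 1),
    ("v0.t (mask)", 0, 1, 1),
]

for reg, entries in sorted(eews_by_register(accesses).items()):
    widths = sorted({eew for eew, _ in entries})
    print(f"v{reg}: EEWs {widths}")
```

In this tally v0 shows up with three distinct widths (1, 32, and 64 bits), while v1 is touched only as the upper half of the 64b accumulator group.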

Krste






Krste Asanovic
 

I'm not sure if C intrinsics can generate this case, but there are
other cases where dynamic value settings can result in illegal
instruction traps, so the result would be the same: implementations
will either trap or do something non-conforming.

Krste



Nick Knight
 

On Mon, Nov 15, 2021 at 3:40 PM Krste Asanovic <krste@...> wrote:

I'm not sure if C intrinsics can generate this case,

https://godbolt.org/z/qj6WzYc76

but there are
other cases where dynamic value settings can result in illegal
instruction traps, so the result would be the same that
implementations will either trap or do something non-conforming.

Krste






Krste Asanovic
 

I guess simpler examples are anytime you use v0 as mask and a data
source.

These aren't useful use-cases, so existing software shouldn't have
been doing this (except test code).

Krste

On Mon, 15 Nov 2021 15:45:14 -0800, Nick Knight <nick.knight@...> said:
| On Mon, Nov 15, 2021 at 3:40 PM Krste Asanovic <krste@...> wrote:
| I'm not sure if C intrinsics can generate this case,

| https://godbolt.org/z/qj6WzYc76
|  

| but there are
| other cases where dynamic value settings can result in illegal
| instruction traps, so the result would be the same that
| implementations will either trap or do something non-conforming.

| Krste



Guy Lemieux
 


To determine the trap cause without such a bit, software will have to examine many possible vtype settings that are unique to each particular instruction. The trap handler will be highly customized for each CPU implementation.

This could be done more easily in a handful of logic gates, without a vastly different flow in the trap handler (which will already know to check vill).

Guy


On Mon, Nov 15, 2021 at 3:24 PM Krste Asanovic <krste@...> wrote:


There’s no benefit to setting vill versus just taking a trap in this case.

Vill is there so we don’t have to add the first trap on a write of a particular data value, and also to provide a discovery mechanism.

Krste



Krste Asanovic
 

In this case, the trap cause can be determined by looking at the vtype
value and the instruction encoding (most only need to look at
instruction encoding), independent of implementation. No vtype
probing is needed.

(assuming there isn't some non-conforming use of the encoding, which
is out-of-scope for any discussion of standard trap handlers)

Krste
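
The check described in the message above can be sketched in a few lines. This is an illustrative sketch only, not text from the specification or from anyone in this thread: the names `SourceOperand`, `eew_conflicts`, and `vwadd_wv_sources` are hypothetical, LMUL is assumed to be an integer of at least 1, and treating a mask operand as EEW=1 is an assumption made for the example. Given each source operand's base register, register-group size, and EEW (all derivable from the instruction encoding plus vtype), it reports every register that would be read with two different EEWs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SourceOperand:
    name: str        # label used in the report, e.g. "vs2" (hypothetical)
    base: int        # first vector register number of the group (0-31)
    eew: int         # effective element width in bits (assumed 1 for a mask)
    nregs: int = 1   # registers in the group, i.e. max(1, EMUL)

    def registers(self) -> range:
        return range(self.base, self.base + self.nregs)


def eew_conflicts(sources: List[SourceOperand]) -> List[Tuple[str, str, int]]:
    """Return (name_a, name_b, reg) for each register read with two different EEWs."""
    conflicts = []
    for i, a in enumerate(sources):
        for b in sources[i + 1:]:
            if a.eew == b.eew:
                continue  # same EEW: overlapping reads are not the case at issue
            shared = set(a.registers()) & set(b.registers())
            for reg in sorted(shared):
                conflicts.append((a.name, b.name, reg))
    return conflicts


def vwadd_wv_sources(vs2: int, vs1: int, sew: int, lmul: int) -> List[SourceOperand]:
    """Sources of vwadd.wv: vs2 is the wide operand (EEW=2*SEW, EMUL=2*LMUL)."""
    return [
        SourceOperand("vs2", vs2, 2 * sew, 2 * lmul),
        SourceOperand("vs1", vs1, sew, lmul),
    ]


if __name__ == "__main__":
    # vwadd.wv v8, v4, v5 with SEW=16, LMUL=1:
    # vs2 covers v4-v5 at EEW=32, vs1 is v5 at EEW=16, so v5 is read twice.
    print(eew_conflicts(vwadd_wv_sources(vs2=4, vs1=5, sew=16, lmul=1)))
    # vwadd.wv v8, v4, v6: the source groups do not overlap.
    print(eew_conflicts(vwadd_wv_sources(vs2=4, vs1=6, sew=16, lmul=1)))
    # v0 used both as the mask (assumed EEW=1) and as a 32-bit data source.
    print(eew_conflicts([SourceOperand("v0.t", 0, 1), SourceOperand("vs2", 0, 32)]))
```

The same inputs are available to a trap handler and to a decoder, which is why no probing of vtype values is needed in this sketch: the operand EEWs and group sizes follow from the opcode and the current SEW/LMUL.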

On Mon, 15 Nov 2021 15:49:03 -0800, Guy Lemieux <guy.lemieux@...> said:
| To determine the trap cause, without such a bit, software will have to examine many possible vtype settings that are unique for each particular
| instruction. The trap handler will be highly customized for each cpu implementation.

| This could be done more easily in a handful of logic gates, without a vastly different flow in the trap handler (which will already know to
| check vill).

| Guy



Victor Moya
 


From an ISA definition point of view it doesn't make sense to forbid properly formed operations to benefit a specific hardware implementation. It's an ugly hack.

If a given hardware implementation can't handle it in an optimal way and it really has no real software use (i.e., performance is irrelevant), it can just trigger a slow path (a microcode sequence or a trap to software emulation).

But given that it isn't the first such case in the spec, it isn't really much of a problem. If we are choosing among halfway hacks, I think it's better to make it completely illegal (option #1) than to add additional fragmentation that may affect compilers with #2 or #3.

If the vector specification is required to be optimal for a specific hardware implementation, better to make it explicitly so and not go about it in roundabout ways.

Victor


On Mon, Nov 15, 2021 at 10:05 PM Krste Asanovic <krste@...> wrote:

Apart from requests for more instructions, which can be handled with
later extensions, there were no real substantive updates.

I did notice one issue at end of public review, however.

The current specification allows some instructions to have two vector
source operands read from the same vector register but with different
EEW.  For example, a vector indexed store with the index vector and
data vector overlapping, but different EEW.  Or a widening vector add
(vwadd.wv) where the two vector sources overlap but have different
EEW.  This complicates implementations that internally restripe the
vector data (e.g., internal SLEN), and does not have a valid software
use (cue folks furiously trying to construct one...).

The proposal is to allow implementations to raise an illegal
instruction exception in this case.  I believe this is an important
and necessary change to accommodate internal striping.  In practice,
this change has no impact on software.

We do have a choice of:

1) Mandate all implementations raise an illegal exception in this
case.  This is my preferred route, as this would be a minor erratum for
existing implementations (doesn't affect software), and we would not reuse
this state/encoding for other purposes.

2) Allow either correct execution or illegal exception (as with
misaligned). 

3) Consider "reserved", implying implementations that support it are
non-conforming unless we later go with 2).

I'm assuming we're going to push to ratify 1) unless I hear strong objections.

Krste