Hi all,
For common AI workloads such as DNNs, data communication between network layers puts heavy pressure on the capacity and bandwidth of the memory hierarchy. For instance, large, dynamically generated activation or feature-map data needs to be buffered and communicated across multiple layers, and this data is often sparse (e.g., after ReLU). People use bit vectors to "compress" the buffered data and "decompress" it for the following layer's computations.
Here we can see from the spec that "vcompress" has already been included; how about a "vdecompress"?
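For context, the spec does not define a vdecompress instruction, but it notes the operation can be synthesized with viota.m plus a masked vrgather. A minimal scalar C sketch of the intended semantics (function and variable names are mine, not from the spec):

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a hypothetical "vdecompress" at SEW=32: scatter the
 * packed elements of src back to the positions where mask[i] is set,
 * leaving unmasked destination elements undisturbed. This is the
 * inverse of what vcompress.vm does. */
static void decompress_u32(uint32_t *dst, const uint32_t *src,
                           const uint8_t *mask, size_t vl)
{
    size_t j = 0; /* index into the packed source vector */
    for (size_t i = 0; i < vl; i++)
        if (mask[i])
            dst[i] = src[j++];
}
```

With mask {1,0,1,1,0} and packed data {10,20,30}, the masked-off lanes keep their old values and the set lanes receive 10, 20, 30 in order.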
Thanks, Dawei
Re: EEW and non-indexed loads/stores

Krste Asanovic
Correct, Krste
On Sep 2, 2020, at 11:10 PM, Roger Ferrer Ibanez <roger.ferrer@...> wrote:
Hi all,
I understand that the EEW, as explicitly encoded in the load/store instructions, applies to the vector of indices for indexed loads and stores. For instance, we can load a vector with "SEW=8,LMUL=1" using a vector of indices with "SEW=64,LMUL=8" by making sure vtype has "SEW=8,LMUL=1" and using v{l,s}xei64.
I'd like to confirm I'm understanding correctly the EEW for unit-stride and strided loads and stores.
Say that vtype is such that SEW=16,LMUL=1 and we execute a v{l,s}{,s}e32.v. Now the EEW of the data and address operands is EEW=32 (as encoded in the instruction) so EMUL=(EEW/SEW)*LMUL=(32/16)*1=2. So in this case we're loading/storing a vector SEW=32,LMUL=2.
Is my interpretation correct?
If it is, I assume this is useful in sequences such as the following one:
# SEW=16,LMUL=1
vle16.v v1, (t0)     # Load a vector of sew=16,lmul=1
vle32.v v2, (t1)     # Load a vector of sew=32,lmul=2; no need to change vtype
vwadd.wv v4, v2, v1  # v4_v5(32)[:] ← v2_v3(32)[:] + sign-extend(v1(16)[:])
vse32.v v4, (t1)     # Store a vector of sew=32,lmul=2; no need to change vtype either
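The EMUL arithmetic in the question can be made concrete with a tiny helper (plain C, the function name is mine):

```c
/* For non-indexed loads/stores the instruction encodes EEW, and the
 * effective LMUL is scaled so the element count stays the same:
 * EMUL = (EEW / SEW) * LMUL. */
static unsigned emul(unsigned eew, unsigned sew, unsigned lmul)
{
    return (eew * lmul) / sew;
}
```

With vtype SEW=16,LMUL=1 and a v{l,s}e32.v, emul(32, 16, 1) gives 2, matching the reading above.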
Thank you,
-- Roger Ferrer Ibáñez - roger.ferrer@... Barcelona Supercomputing Center - Centro Nacional de Supercomputación
http://bsc.es/disclaimer
EEW and non-indexed loads/stores

Roger Ferrer Ibanez
Hi all,
I understand that the EEW, as explicitly encoded in the load/store instructions, applies to the vector of indices for indexed loads and stores. For instance, we can load a vector with "SEW=8,LMUL=1" using a vector of indices with "SEW=64,LMUL=8" by making sure vtype has "SEW=8,LMUL=1" and using v{l,s}xei64.
I'd like to confirm I'm understanding correctly the EEW for unit-stride and strided loads and stores.
Say that vtype is such that SEW=16,LMUL=1 and we execute a v{l,s}{,s}e32.v. Now the EEW of the data and address operands is EEW=32 (as encoded in the instruction) so EMUL=(EEW/SEW)*LMUL=(32/16)*1=2. So in this case we're loading/storing a vector SEW=32,LMUL=2.
Is my interpretation correct?
If it is, I assume this is useful in sequences such as the following one:
# SEW=16,LMUL=1
vle16.v v1, (t0)     # Load a vector of sew=16,lmul=1
vle32.v v2, (t1)     # Load a vector of sew=32,lmul=2; no need to change vtype
vwadd.wv v4, v2, v1  # v4_v5(32)[:] ← v2_v3(32)[:] + sign-extend(v1(16)[:])
vse32.v v4, (t1)     # Store a vector of sew=32,lmul=2; no need to change vtype either
Thank you,
-- Roger Ferrer Ibáñez - roger.ferrer@... Barcelona Supercomputing Center - Centro Nacional de Supercomputación
http://bsc.es/disclaimer
Re: Signed v Unsigned Immediate: vsaddu.vi

Nick Knight
Hi Cohen,
Thanks for your careful reading.
Best, Nick Knight
On Wed, Sep 2, 2020 at 2:44 PM Andrew Waterman < andrew@...> wrote: The non-normative text you quoted should be edited to delete the words “it is signed”.
The immediate is sign-extended, but then is treated as an unsigned value. So the operation doesn’t differ based on the argument type.
(This sign-extended-but-unsigned-immediate pattern also exists for e.g. sltiu in the base ISA and vmsgtu.vi in the vector extension.)
From chapter 11, section 1 (#3):
The 5-bit immediate is unsigned when either providing a register index in vrgather or a count for shift, clip, or slide. In all other cases
it is signed and sign extended to SEW bits, even for bitwise and unsigned instructions, notably compare and add.
From chapter 13, section 1: Saturating forms of integer add and subtract are provided, for both signed and unsigned integers. If the result would overflow the destination, the result is replaced with the closest representable value, and the vxsat bit is set.
This results in a conundrum:
operation   SEW  RS1    RS2
vsaddu.vv   8    0x0ff  0x01
vsaddu.vi   8    0x01f  0x01
These two operations now produce different results. Taking the maximum unsigned integer value and adding one causes saturation. The result of the vector-vector operation would be 0xff and the VXSAT bit would be set. This shouldn't be a surprise. However, the immediate form is more difficult. The immediate value is sign-extended to SEW size and treated as a signed value. This means the arithmetic is now (-1) + 1 = 0. This does not create a saturation (a value outside expected return parameters). The result of the vector-immediate operation would be 0x00 and the VXSAT bit would be clear.
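Under the resolution stated above — sign-extend the 5-bit immediate, then treat the result as unsigned — the two forms agree. A plain-C model at SEW=8 (helper names are mine):

```c
#include <stdint.h>

/* Sign-extend the low 5 bits of imm5 to 8 bits. */
static uint8_t sext5_to_e8(uint8_t imm5)
{
    imm5 &= 0x1f;
    return (uint8_t)((imm5 & 0x10) ? (imm5 | 0xe0) : imm5);
}

/* Unsigned saturating add at SEW=8; sets *vxsat on overflow. */
static uint8_t vsaddu_e8(uint8_t a, uint8_t b, int *vxsat)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    if (sum > 0xffu) {
        *vxsat = 1;
        return 0xff;
    }
    return (uint8_t)sum;
}
```

With imm5 = 0x1f, sext5_to_e8 yields 0xff, so the immediate form behaves exactly like the vsaddu.vv case: the add saturates to 0xff and vxsat is set.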
This is from the specification, as written, in a strict sense.
From a use-case sense, what is trying to be accomplished here? Two counter perspectives: 1 - from a use-case perspective, why would a programmer or compiler specifically pick an unsigned operation, only to operate on values using a signed immediate in a signed format? I'm curious what this case is. 2 - from an architecture/implementation perspective, this is the first time that an engine will have to operate on an instruction differently based on the *source* of the operand. That is, more narrowly, the arithmetic engines are given an operation encoding (usually an "onto" mapping from the opcode space) and operands, but do not care where the operands came from. In other words, the vector engine itself would receive a full bit set in RS1 for both cases above, for a saturating unsigned (sorta) add. However, the outcome is required to be different?
I would imagine others have run into this situation, and I'd like to know both the intent of having a signed-immediate value for this unsigned operation, as well as the applicability of section 11.1 to this instruction.
Re: Signed v Unsigned Immediate: vsaddu.vi
The non-normative text you quoted should be edited to delete the words “it is signed”.
The immediate is sign-extended, but then is treated as an unsigned value. So the operation doesn’t differ based on the argument type.
(This sign-extended-but-unsigned-immediate pattern also exists for e.g. sltiu in the base ISA and vmsgtu.vi in the vector extension.)
From chapter 11, section 1 (#3):
The 5-bit immediate is unsigned when either providing a register index in vrgather or a count for shift, clip, or slide. In all other cases
it is signed and sign extended to SEW bits, even for bitwise and unsigned instructions, notably compare and add.
From chapter 13, section 1: Saturating forms of integer add and subtract are provided, for both signed and unsigned integers. If the result would overflow the destination, the result is replaced with the closest representable value, and the vxsat bit is set.
This results in a conundrum:
operation   SEW  RS1    RS2
vsaddu.vv   8    0x0ff  0x01
vsaddu.vi   8    0x01f  0x01
These two operations now produce different results. Taking the maximum unsigned integer value and adding one causes saturation. The result of the vector-vector operation would be 0xff and the VXSAT bit would be set. This shouldn't be a surprise. However, the immediate form is more difficult. The immediate value is sign-extended to SEW size and treated as a signed value. This means the arithmetic is now (-1) + 1 = 0. This does not create a saturation (a value outside expected return parameters). The result of the vector-immediate operation would be 0x00 and the VXSAT bit would be clear.
This is from the specification, as written, in a strict sense.
From a use-case sense, what is trying to be accomplished here? Two counter perspectives: 1 - from a use-case perspective, why would a programmer or compiler specifically pick an unsigned operation, only to operate on values using a signed immediate in a signed format? I'm curious what this case is. 2 - from an architecture/implementation perspective, this is the first time that an engine will have to operate on an instruction differently based on the *source* of the operand. That is, more narrowly, the arithmetic engines are given an operation encoding (usually an "onto" mapping from the opcode space) and operands, but do not care where the operands came from. In other words, the vector engine itself would receive a full bit set in RS1 for both cases above, for a saturating unsigned (sorta) add. However, the outcome is required to be different?
I would imagine others have run into this situation, and I'd like to know both the intent of having a signed-immediate value for this unsigned operation, as well as the applicability of section 11.1 to this instruction.
Signed v Unsigned Immediate: vsaddu.vi
From chapter 11, section 1 (#3):
The 5-bit immediate is unsigned when either providing a register index in vrgather or a count for shift, clip, or slide. In all other cases
it is signed and sign extended to SEW bits, even for bitwise and unsigned instructions, notably compare and add.
From chapter 13, section 1: Saturating forms of integer add and subtract are provided, for both signed and unsigned integers. If the result would overflow the destination, the result is replaced with the closest representable value, and the vxsat bit is set.
This results in a conundrum:
operation   SEW  RS1    RS2
vsaddu.vv   8    0x0ff  0x01
vsaddu.vi   8    0x01f  0x01
These two operations now produce different results. Taking the maximum unsigned integer value and adding one causes saturation. The result of the vector-vector operation would be 0xff and the VXSAT bit would be set. This shouldn't be a surprise. However, the immediate form is more difficult. The immediate value is sign-extended to SEW size and treated as a signed value. This means the arithmetic is now (-1) + 1 = 0. This does not create a saturation (a value outside expected return parameters). The result of the vector-immediate operation would be 0x00 and the VXSAT bit would be clear.
This is from the specification, as written, in a strict sense.
From a use-case sense, what is trying to be accomplished here? Two counter perspectives: 1 - from a use-case perspective, why would a programmer or compiler specifically pick an unsigned operation, only to operate on values using a signed immediate in a signed format? I'm curious what this case is. 2 - from an architecture/implementation perspective, this is the first time that an engine will have to operate on an instruction differently based on the *source* of the operand. That is, more narrowly, the arithmetic engines are given an operation encoding (usually an "onto" mapping from the opcode space) and operands, but do not care where the operands came from. In other words, the vector engine itself would receive a full bit set in RS1 for both cases above, for a saturating unsigned (sorta) add. However, the outcome is required to be different?
I would imagine others have run into this situation, and I'd like to know both the intent of having a signed-immediate value for this unsigned operation, as well as the applicability of section 11.1 to this instruction.
Cancelling Vector TG meeting today

Krste Asanovic
Sorry for late notice, but I have to cancel the vector tech meeting today,
Krste
Re: GNU toolchain with RVV intrinsic support
Thank you for the clarification. Excellent.
On Mon, Aug 24, 2020, 17:35 Bruce Hoult, < bruce@...> wrote: On Tue, Aug 25, 2020 at 5:34 AM David Horner <ds2horner@...> wrote:
Thank you very much for this advancement.
I have two concerns; a response to each is in the body.
On 2020-08-21 9:34 a.m., Kito Cheng wrote:
I am pleased to announce that our/SiFive's RVV intrinsic-enabled GCC is open-sourced now. We put the sources on RISC-V's GitHub, and the RVV intrinsics have been integrated in the riscv-gnu-toolchain, so you can build the RVV intrinsic-enabled GNU toolchain as usual.
$ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
$ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
$ make newlib build-qemu
$ cat rvv_vadd.c
>
> #include <riscv_vector.h>
> #include <stdio.h>
>
> void vec_add_rvv
Shouldn't this be vec_add32_rvv? It is not a generalized vector add.
The user can call functions anything they want. The example might be better if this was clear by calling it foo() or demo_vector_add() or something.
(int *a, int *b, int *c, size_t n) {
> size_t vl;
> vint32m2_t va, vb, vc;
> for (;vl = vsetvl_e32m2 (n);n -= vl) {
> vb = vle32_v_i32m2 (b);
> vc = vle32_v_i32m2 (c);
> va = vadd_vv_i32m2 (vb, vc);
> vse32_v_i32m2 (a, va);
> a += vl;
The vector pointer should be advanced by vl * 32.
The variable "a" is an "int *" pointer. When you add an integer to it, C automatically scales the integer (vl) by sizeof(int).
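To make that scaling concrete, a small C check (the function name is mine):

```c
#include <stddef.h>

/* Advancing an int pointer by vl elements moves it vl * sizeof(int)
 * bytes, so "a += vl" in the example is already element-scaled. */
static ptrdiff_t bytes_advanced(const int *base, size_t vl)
{
    const int *p = base + vl;
    return (const char *)p - (const char *)base;
}
```

So no explicit "vl * 32" (or "vl * 4") factor is needed in the loop.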
Re: GNU toolchain with RVV intrinsic support

Bruce Hoult
On Tue, Aug 25, 2020 at 5:34 AM David Horner <ds2horner@...> wrote:
Thank you very much for this advancement.
I have two concerns; a response to each is in the body.
On 2020-08-21 9:34 a.m., Kito Cheng wrote:
I am pleased to announce that our/SiFive's RVV intrinsic-enabled GCC is open-sourced now. We put the sources on RISC-V's GitHub, and the RVV intrinsics have been integrated in the riscv-gnu-toolchain, so you can build the RVV intrinsic-enabled GNU toolchain as usual.
$ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
$ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
$ make newlib build-qemu
$ cat rvv_vadd.c
>
> #include <riscv_vector.h>
> #include <stdio.h>
>
> void vec_add_rvv
Shouldn't this be vec_add32_rvv? It is not a generalized vector add.
The user can call functions anything they want. The example might be better if this was clear by calling it foo() or demo_vector_add() or something.
(int *a, int *b, int *c, size_t n) {
> size_t vl;
> vint32m2_t va, vb, vc;
> for (;vl = vsetvl_e32m2 (n);n -= vl) {
> vb = vle32_v_i32m2 (b);
> vc = vle32_v_i32m2 (c);
> va = vadd_vv_i32m2 (vb, vc);
> vse32_v_i32m2 (a, va);
> a += vl;
The vector pointer should be advanced by vl * 32.
The variable "a" is an "int *" pointer. When you add an integer to it, C automatically scales the integer (vl) by sizeof(int).
Re: GNU toolchain with RVV intrinsic support
Thank you very much for this advancement.
I have two concerns; a response to each is in the body.
On 2020-08-21 9:34 a.m., Kito Cheng wrote:
I am pleased to announce that our/SiFive's RVV intrinsic-enabled GCC is open-sourced now. We put the sources on RISC-V's GitHub, and the RVV intrinsics have been integrated in the riscv-gnu-toolchain, so you can build the RVV intrinsic-enabled GNU toolchain as usual.
$ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
$ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
$ make newlib build-qemu
$ cat rvv_vadd.c
>
> #include <riscv_vector.h>
> #include <stdio.h>
>
> void vec_add_rvv
Shouldn't this be vec_add32_rvv? It is not a generalized vector add.
(int *a, int *b, int *c, size_t n) {
> size_t vl;
> vint32m2_t va, vb, vc;
> for (;vl = vsetvl_e32m2 (n);n -= vl) {
> vb = vle32_v_i32m2 (b);
> vc = vle32_v_i32m2 (c);
> va = vadd_vv_i32m2 (vb, vc);
> vse32_v_i32m2 (a, va);
> a += vl;
The vector pointer should be advanced by vl * 32.
(I originally thought the vl = vsetvl may have done the by-32 scaling and that n was in bytes, but I have now convinced myself that the problem is likely the pointer advance, and the VLEN is at least 256 so only one pass of the loop for the below test case.)
> b += vl;
> c += vl;
> }
> }
>
> int x[10] = {1,2,3,4,5,6,7,8,9,0};
> int y[10] = {0,9,8,7,6,5,4,3,2,1};
> int z[10];
>
> int main()
> {
> int i;
> vec_add_rvv(z, x, y, 10);
> for (i=0; i<10; i++)
> printf ("%d ", z[i]);
> printf("\n");
> return 0;
> }
$ riscv64-unknown-elf-gcc rvv_vadd.c -O2
$ qemu-riscv64 -cpu rv64,x-v=true,vlen=256,elen=64,vext_spec=v1.0 a.out
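For anyone without the toolchain handy, a portable scalar reference of the same kernel (my rewrite, not from the posted sources) shows what the test program should compute:

```c
#include <stddef.h>

/* Scalar equivalent of vec_add_rvv above: a[i] = b[i] + c[i]. */
static void vec_add_ref(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

With the x/y arrays from the test program, the expected output is "1 11 11 11 11 11 11 11 11 1".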
It is verified with our internal testsuite and several internal projects; however, this project is still a work in progress, and we intend to improve it continually. Feedback and bug reports are welcome, as well as contributions and pull requests.
Current status:
- Implements ~95% of the RVV intrinsic functions listed in the intrinsic spec (https://github.com/riscv/rvv-intrinsic-doc)
- FP16 supported for both vector and scalar.
- fp16 uses __fp16 temporarily; this might change in the future.
- Fractional LMUL is not implemented yet.
- RV32 is not well supported for scalar-vector operations with SEW=64.
- Function calls with vector types are not well supported yet; arguments will be passed/returned in memory in the current implementation.
- *NO* auto-vectorization support.
Re: V extension groups analogue to the standard groups

mark
Just a reminder that we will differentiate between branding (i.e. what we trademark and what members can advertise) and internal use (like uname in linux vs. splash screen, etc.).
The proposed policy is under review in the policies/proposed folder.
On Sun, Aug 23, 2020 at 3:26 PM Simon Davidmann Imperas < simond@...> wrote: thanks - I am OK with whichever you choose.
On Sat, Aug 22, 2020 at 12:30 AM Andrew Waterman < andrew@...> wrote:
On Fri, Aug 21, 2020 at 2:43 PM Simon Davidmann < simond@...> wrote: A question to clarify. You state: RV32IV doesn’t mandate any FP hardware in the vector unit, whereas RV32IFV means both scalar and vector support single-precision, etc.
This means, if I understand you correctly, that we need to add F to get F hardware in the vector unit - so RV32IV means V with no F hardware, and RV32IFV includes F hardware.
So for consistency...
What does RV32IV mean for M hardware multiply - do I need RV32IMV to get scalar and vector hardware multiply?
I don’t believe the spec explicitly addresses this question, but I agree it makes sense. Alternatively, V could require M, since it doesn’t make much sense to pay for a vector unit but be too stingy to pay for a multiplier. But that might be less consistent. (My recommendation is that RV32IV continue to mean “no multiplier”, even though it’s a silly configuration.)
RV32IV means no F and no M hardware? - so I need to explicitly include the extensions I need as V assumes nothing but I?
My recommendation is to clarify in the spec that RV32IV is a valid config with no FPU in the vector unit, and RV32IFV is also a valid config with an FPU in both scalar and vector.
Or is something assumed for M?
If we choose to define that V implies M, RV32IV and RV32IMV would be synonyms.
thanks
On Thu, Aug 20, 2020 at 8:48 PM Andrew Waterman < andrew@...> wrote: Quad-widening ops have been moved to a separate extension, Zvqmac.
I believe the intent is that the capital-V V extension supports the same FP datatypes as the scalar ISA, so e.g., RV32IV doesn’t mandate any FP hardware in the vector unit, whereas RV32IFV means both scalar and vector support single-precision, etc.
I’m surprised all those hashtags made it past the spam filter! On Thu, Aug 20, 2020 at 11:42 AM Strauch, Tobias (HENSOLDT Cyber GmbH) < tobias.strauch@...> wrote:
Apologies if this is old stuff already dismissed. But I give it a try anyway.
Wouldn't it make sense to separate more complex vector instructions from more trivial ones, already with the very first base release? Vector instructions can also be helpful in small devices #IOT #Edge #GAP8 #RISCY without the need to fully support floating-point instructions or a quad multiply.
The suggestion would be to group vector extensions analogously to the standard instruction groups (I, M, F, D, Q, …), instead of having an already complex base and then subtracting or re-defining subsets of instructions again.
Wouldn't that be in line with the RISC-V philosophy of modularity and simplicity? The beauty would be that you have a non-vector and a vector version of each group.
Possible nomenclature based on order:
M: Standard Multiply Divide Instructions (MUL, ...)
V: Very Basic Vector Instructions (VSETVL, ...)
MV: Standard Multiply Divide Instructions and Very Basic Vector Instructions (MUL, VSETVL, ...)
VM: Standard Multiply Divide Instructions, Very Basic Vector Instructions and Vector Integer Multiply/Divide Instructions (MUL, VSETVL, VMUL, ...)
F, D, Q analogue to M as suggested.
The V version will not be a 1:1 match with the standard version and will cover additional aspects. But it can be argued that when you implement the V version (of M, F, D, Q, ...), you will most likely have the relevant standard counterparts implemented as well anyway.
Kind Regards, Tobias
-- ====================================================================
The information contained in this electronic mail message and any attachments hereto is privileged and confidential information intended only for the use of the individual or entity named above or their designee. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error please immediately notify us by return message or by telephone and delete the original message from your mail system. Thank you. ====================================================================
Re: V extension groups analogue to the standard groups

Simon Davidmann Imperas
thanks - I am OK with whichever you choose.
On Sat, Aug 22, 2020 at 12:30 AM Andrew Waterman < andrew@...> wrote:
On Fri, Aug 21, 2020 at 2:43 PM Simon Davidmann < simond@...> wrote: A question to clarify. You state: RV32IV doesn’t mandate any FP hardware in the vector unit, whereas RV32IFV means both scalar and vector support single-precision, etc.
This means, if I understand you correctly, that we need to add F to get F hardware in the vector unit - so RV32IV means V with no F hardware, and RV32IFV includes F hardware.
So for consistency...
What does RV32IV mean for M hardware multiply - do I need RV32IMV to get scalar and vector hardware multiply?
I don’t believe the spec explicitly addresses this question, but I agree it makes sense. Alternatively, V could require M, since it doesn’t make much sense to pay for a vector unit but be too stingy to pay for a multiplier. But that might be less consistent. (My recommendation is that RV32IV continue to mean “no multiplier”, even though it’s a silly configuration.)
RV32IV means no F and no M hardware? - so I need to explicitly include the extensions I need as V assumes nothing but I?
My recommendation is to clarify in the spec that RV32IV is a valid config with no FPU in the vector unit, and RV32IFV is also a valid config with an FPU in both scalar and vector.
Or is something assumed for M?
If we choose to define that V implies M, RV32IV and RV32IMV would be synonyms.
thanks
On Thu, Aug 20, 2020 at 8:48 PM Andrew Waterman < andrew@...> wrote: Quad-widening ops have been moved to a separate extension, Zvqmac.
I believe the intent is that the capital-V V extension supports the same FP datatypes as the scalar ISA, so e.g., RV32IV doesn’t mandate any FP hardware in the vector unit, whereas RV32IFV means both scalar and vector support single-precision, etc.
I’m surprised all those hashtags made it past the spam filter! On Thu, Aug 20, 2020 at 11:42 AM Strauch, Tobias (HENSOLDT Cyber GmbH) < tobias.strauch@...> wrote:
Apologies if this is old stuff already dismissed. But I give it a try anyway.
Wouldn't it make sense to separate more complex vector instructions from more trivial ones, already with the very first base release? Vector instructions can also be helpful in small devices #IOT #Edge #GAP8 #RISCY without the need to fully support floating-point instructions or a quad multiply.
The suggestion would be to group vector extensions analogously to the standard instruction groups (I, M, F, D, Q, …), instead of having an already complex base and then subtracting or re-defining subsets of instructions again.
Wouldn't that be in line with the RISC-V philosophy of modularity and simplicity? The beauty would be that you have a non-vector and a vector version of each group.
Possible nomenclature based on order:
M: Standard Multiply Divide Instructions (MUL, ...)
V: Very Basic Vector Instructions (VSETVL, ...)
MV: Standard Multiply Divide Instructions and Very Basic Vector Instructions (MUL, VSETVL, ...)
VM: Standard Multiply Divide Instructions, Very Basic Vector Instructions and Vector Integer Multiply/Divide Instructions (MUL, VSETVL, VMUL, ...)
F, D, Q analogue to M as suggested.
The V version will not be a 1:1 match with the standard version and will cover additional aspects. But it can be argued that when you implement the V version (of M, F, D, Q, ...), you will most likely have the relevant standard counterparts implemented as well anyway.
Kind Regards, Tobias
Re: V extension groups analogue to the standard groups

Krste Asanovic
Anybody is free to use any subset of supported instructions and element widths/types. The Z names can be extended down to individual instructions/widths if necessary. However, we have to guide the software ecosystem where to spend the available finite effort. So we choose and name some common combinations to inform software/tool providers what to support, and to enable compliance testing of those combinations. We can always add new Z names later for subsets that prove popular. This can happen after the instruction spec itself is ratified, in a much lighter-weight process.
Krste
On Sat, 22 Aug 2020 00:32:56 -0700, "Allen Baum" <allen.baum@...> said:
| Works for me.
| -Allen
| On Aug 21, 2020, at 11:41 PM, Andrew Waterman <andrew@...> wrote:
| It's OK for esoteric combinations to require long ISA strings, I think.
Re: V extension groups analogue to the standard groups

Allen Baum
On Aug 21, 2020, at 11:41 PM, Andrew Waterman < andrew@...> wrote: It's OK for esoteric combinations to require long ISA strings, I think.
Re: V extension groups analogue to the standard groups
For layout reasons, I can easily imagine a vector unit that has multiply HW for the vector registers but can't easily use it to implement scalar multiply/divide. Whether someone would ever want to implement a system that implements vector multiply/divide but not scalar multiply/divide is, and should be, a completely separate issue; I see no reason why they need to be tied together. If there are no profiles that have vector mul without scalar mul, then no one will be implementing that configuration and this is a moot point - but there is no harm in allowing it. There are probably hundreds of configurations that won't be covered by profiles, and we can't obsess about them either. (Hundreds is likely a very, very conservative estimate.)
Under the scheme I'm promulgating, it's true that you couldn't describe your hypothetical machine as implementing capital-letter "V". Perhaps it could be an RV32I_Zvbase_Zvm machine or something? It's OK for esoteric combinations to require long ISA strings, I think.
On Fri, Aug 21, 2020 at 4:30 PM Andrew Waterman < andrew@...> wrote:
On Fri, Aug 21, 2020 at 2:43 PM Simon Davidmann < simond@...> wrote: A question to clarify. You state: RV32IV doesn’t mandate any FP hardware in the vector unit, whereas RV32IFV means both scalar and vector support single-precision, etc.
This means, if I understand you correctly, that we need to add F to get F hardware in the vector unit - so RV32IV means V with no F hardware, and RV32IFV includes F hardware.
So for consistency...
What does RV32IV mean for M hardware multiply - do I need RV32IMV to get scalar and vector hardware multiply?
I don’t believe the spec explicitly addresses this question, but I agree it makes sense. Alternatively, V could require M, since it doesn’t make much sense to pay for a vector unit but be too stingy to pay for a multiplier. But that might be less consistent. (My recommendation is that RV32IV continue to mean “no multiplier”, even though it’s a silly configuration.)
RV32IV means no F and no M hardware? - so I need to explicitly include the extensions I need as V assumes nothing but I?
My recommendation is to clarify in the spec that RV32IV is a valid config with no FPU in the vector unit, and RV32IFV is also a valid config with an FPU in both scalar and vector.
Or is something assumed for M?
If we choose to define that V implies M, RV32IV and RV32IMV would be synonyms.
thanks
On Thu, Aug 20, 2020 at 8:48 PM Andrew Waterman < andrew@...> wrote: Quad-widening ops have been moved to a separate extension, Zvqmac.
I believe the intent is that the capital-V V extension supports the same FP datatypes as the scalar ISA, so e.g., RV32IV doesn’t mandate any FP hardware in the vector unit, whereas RV32IFV means both scalar and vector support single-precision, etc.
I’m surprised all those hashtags made it past the spam filter! On Thu, Aug 20, 2020 at 11:42 AM Strauch, Tobias (HENSOLDT Cyber GmbH) < tobias.strauch@...> wrote:
Apologies if this is old stuff already dismissed. But I give it a try anyway.
Wouldn't it make sense to separate more complex vector instructions from more trivial ones, already with the very first base release? Vector instructions can also be helpful in small devices #IOT #Edge #GAP8 #RISCY without the need to fully support floating-point instructions or a quad multiply.
The suggestion would be to basically group vector extensions analogue to the standard instructions (I, M, F, D, Q, …), instead of having an already complex base and then subtract or re-define subsets of instructions
again ?
Wouldn't that be in-line with the RISC-V philosophy of modularity and simplicity ? The beauty would be that you have a non-vector and a vector group version.
Possible nomenclature based on order:
M: Standard Multiply/Divide Instructions (MUL, ...)
V: Very Basic Vector Instructions (VSETVL, ...)
MV: Standard Multiply/Divide Instructions and Very Basic Vector Instructions (MUL, VSETVL, ...)
VM: Standard Multiply/Divide Instructions, Very Basic Vector Instructions and Vector Integer Multiply/Divide Instructions (MUL, VSETVL, VMUL, ...)
F, D, Q analogue to M as suggested.
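The proposed nomenclature above can be sketched as a lookup from configuration name to instruction groups. Everything here - the group names and the sample instruction lists - is illustrative of the mailing-list proposal only, not the ratified spec.

```python
# Hypothetical model of the proposed grouping: each configuration
# name selects a set of instruction groups, mirroring the
# M / V / MV / VM list above. Instruction names are samples only.

GROUP_INSTRUCTIONS = {
    "scalar_muldiv": ["MUL", "DIV"],      # the standard M extension
    "vector_base":   ["VSETVL", "VADD"],  # "very basic" vector ops
    "vector_muldiv": ["VMUL", "VDIV"],    # vector integer mul/div
}

CONFIGS = {
    "M":  {"scalar_muldiv"},
    "V":  {"vector_base"},
    "MV": {"scalar_muldiv", "vector_base"},
    "VM": {"scalar_muldiv", "vector_base", "vector_muldiv"},
}

def instructions(config):
    """All instructions provided by a named configuration."""
    return sorted(i for g in CONFIGS[config] for i in GROUP_INSTRUCTIONS[g])

# "VM" is a strict superset of "MV": in this proposal the vector
# multiply/divide group comes bundled with its scalar counterpart.
assert set(instructions("MV")) <= set(instructions("VM"))
print(instructions("VM"))  # ['DIV', 'MUL', 'VADD', 'VDIV', 'VMUL', 'VSETVL']
```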
The V version will not be a 1:1 match with the standard version and will cover additional aspects. But it can be argued that when you implement the V version (of M, F, D, Q, ...), you will most likely have the relevant standard counterpart implemented as well anyway.
Kind Regards, Tobias
-- ====================================================================
The information contained in this electronic mail message and any attachments hereto is privileged and confidential information intended only for the use of the individual or entity named above or their designee. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error please immediately notify us by return message or by telephone and delete the original message from your mail system. Thank you. ====================================================================
|
|
Re: V extension groups analogue to the standard groups
Allen Baum
For layout reasons, I can easily imagine a vector unit that has multiply hardware for the vector registers but can't easily reuse it to implement scalar multiply/divide. Whether someone would ever want to build a system that implements vector multiply/divide but not scalar multiply/divide is, and should be, a completely separate issue; I see no reason why the two need to be tied together. If no profile has vector multiply without scalar multiply, then no one will implement that configuration and this is a moot point - but there is no harm in allowing it. There are probably hundreds of configurations that won't be covered by profiles, and we can't obsess about them either (and hundreds is likely a very, very conservative estimate).
|
|
Re: V extension groups analogue to the standard groups
On Fri, Aug 21, 2020 at 5:49 PM Andrew Waterman < andrew@...> wrote:
On Fri, Aug 21, 2020 at 4:51 PM Guy Lemieux < glemieux@...> wrote: I think a common embedded and FPGA scenario will be F on the scalar side but no F on the vector side. Adding F to V is nontrivial in area, particularly for FPGAs that lack FPUs, yet an integer-only V makes a lot of sense for pixel processing etc. F on the scalar side is a nice-to-have for code size and for calculating scalar parameters, e.g. in OpenCV and OpenVX.
The current nomenclature assumptions don't allow this, but I think they should.
We definitely want to sanction configurations that have different datatype support on scalar and vector. The current thinking is that the letter V means "whatever apps-profile processors want", just like what "G" means on the scalar side. Perhaps the "vector-with-fewer-datatypes-than-scalar" case can be described as Zvbase instead of V?
Guy
|
|
Re: V extension groups analogue to the standard groups
|
|
Re: V extension groups analogue to the standard groups
|
|
Re: V extension groups analogue to the standard groups
|
|