
Re: Smaller embedded version of the Vector extension
This is the requirement for app processors, which are not generally small cores.
Most competing SIMD extensions are at least 128b per vector register.
Lower performance on codes that work well on

By Krste Asanovic · #633


Re: Smaller embedded version of the Vector extension
What is the advantage to RVV requiring VLEN >= 128?
I think this should be changed to VLEN >= 64 because:
1) VLEN = 64 is more likely for small implementations; creating a
mandatory expectation to

By Guy Lemieux · #632


Re: Smaller embedded version of the Vector extension
See GitHub issue #550.
Krste

By Krste Asanovic · #631


Re: Smaller embedded version of the Vector extension
This is a good question.
So if the RVM22 profile requires VLEN=32, ELEN=64, LMUL=8 then the vector registers will have the same amount of state as ARM MVE.
Tariq

By Tariq Kurd · #630
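
The state comparison in #630 can be checked with simple arithmetic. The following is a minimal sketch, assuming RVV's 32 architectural vector registers and Arm MVE's eight 128-bit vector registers; the function and variable names are illustrative and do not come from the thread.

```python
def vector_state_bits(num_regs: int, bits_per_reg: int) -> int:
    """Total architectural vector register state, in bits."""
    return num_regs * bits_per_reg

# RVV has 32 vector registers; VLEN=32 is the embedded minimum discussed here.
rvv_bits = vector_state_bits(num_regs=32, bits_per_reg=32)

# Assumption: Arm MVE (Helium) provides eight 128-bit vector registers.
mve_bits = vector_state_bits(num_regs=8, bits_per_reg=128)

# With LMUL=8, the 32 registers form 4 groups of 8 * VLEN bits each.
groups_at_lmul8 = 32 // 8
bits_per_group = 8 * 32

print(rvv_bits, mve_bits, groups_at_lmul8, bits_per_group)
```

Under these assumptions both register files hold the same number of bits, and an LMUL=8 group is wide enough to hold several ELEN=64 elements even though a single register is not.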


Re: Reply: [RISC-V] [tech-vector-ext] Smaller embedded version of the Vector extension
Hi, Krste:
Does the RISC-V V TG have a plan to support a low-cost vector extension in the RVMxx profile?
Best Regards
Shaofei
2021.6.3
Original message
From: krste@...

By Shaofei (B) · #629


Re: Smaller embedded version of the Vector extension
Hi Tony,
All of the vector permutation instructions can be simulated using the memory system. For example, vslide can be simulated by storing the vector register and loading it at an offset; vrgather

By Nick Knight · #628
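
The store-then-offset-load idea in #628 can be modelled in a few lines. This is only a sketch of the vslidedown case: a Python list stands in for a vector register, another list stands in for the scratch memory, and the zero padding after the stored data is an assumption made so that source indices at or beyond VLMAX read as 0. The function name is hypothetical.

```python
def vslidedown_via_memory(vreg, offset, vl):
    """Model vslidedown as a unit-stride store followed by an offset load.

    vreg   -- list of element values (its length plays the role of VLMAX)
    offset -- slide amount, in elements
    vl     -- number of body elements to produce
    """
    vlmax = len(vreg)
    # "Store" the register to scratch memory, then pad with zeros so the
    # offset load never reads past the end of the buffer.
    padding = max(offset + vl - vlmax, 0)
    scratch = list(vreg) + [0] * padding
    # "Load" vl elements starting `offset` elements into the buffer.
    return scratch[offset:offset + vl]

print(vslidedown_via_memory([10, 11, 12, 13], offset=1, vl=4))
```

A real implementation would pay for the round trip through memory, which is the cost being weighed against the hardware needed for dedicated permutation instructions.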


Re: Smaller embedded version of the Vector extension
Hi Bruce,
Do you mean vrgather instead of vslide?
I use vrgather_vx_* and vslidedown to perform a vector element rotate (and other things), see:

By Tony Cole · #627


Re: Smaller embedded version of the Vector extension
 could an extension just change state like the number of vector registers?

I don't understand this question; please elaborate.
Krste

By Krste Asanovic · #626


Re: Smaller embedded version of the Vector extension
Thanks, I must have missed this bit:
"4.5. Mapping with LMUL > 1 and ELEN > VLEN
If vector registers are grouped to support larger SEW, with ELEN > VLEN, the vector registers in the group are

By Tony Cole · #625


Re: Smaller embedded version of the Vector extension
Section 4.5,
Krste
 On Wed, Jun 2, 2021 at 8:38 AM Andrew Waterman <andrew@...> wrote:
 It’s actually not fundamental to the ISA design that VLEN >= ELEN. An
 implementation with

By Krste Asanovic · #624


Re: Smaller embedded version of the Vector extension
Could an extension just change state like the number of vector registers?

By mark · #623


Re: Smaller embedded version of the Vector extension
We do allow supported SEW to vary with LMUL, so an implementation can
support single-width operations on SEW=64. See section 4.5,
Krste
 So, (on a 32 x 32-bit vector register machine) the widening and

By Krste Asanovic · #622


Re: Smaller embedded version of the Vector extension
The VLEN>=128 constraint is only for the application processor "V"
extension for the app profile, not for embedded vectors, which can
have VLEN=32.
From spec Introduction:
'
The term base vector

By Krste Asanovic · #621


Re: Smaller embedded version of the Vector extension
It seems that restricting the minimum LMUL to 2 would halve the number of usable vector registers, and a minimum LMUL of 4 would leave 8 vector registers.
Thang

By Thang Tran · #620
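
The register counts in #620 follow from dividing the 32 architectural vector registers into groups of LMUL registers each. A minimal sketch of that arithmetic, with a hypothetical helper name and integer LMUL values only:

```python
NUM_VREGS = 32  # architectural vector registers in RVV

def usable_register_groups(min_lmul: int) -> int:
    """Number of register groups available when every operand uses at least min_lmul registers."""
    return NUM_VREGS // min_lmul

for lmul in (1, 2, 4, 8):
    print(f"LMUL={lmul}: {usable_register_groups(lmul)} register groups")
```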


Re: Smaller embedded version of the Vector extension
Hi Bruce,
“I am not a fan of the vslide instructions. It seems they expose the size of the vector registers in a very unfortunate way. In particular they break down if VLEN=1. Most code would be

By Tony Cole · #619


Re: Smaller embedded version of the Vector extension
I think the concern here is lack of a clearly defined data layout pattern for such cases.
E.g., should the LSBs be in the odd or even register half, or should it be implementation-defined?
Guy

By Guy Lemieux · #618


Re: Smaller embedded version of the Vector extension
For widening and narrowing instructions to work, the V spec depends upon changing SEW (to EEW) and LMUL (to EMUL), such that EEW/EMUL == SEW/LMUL. That is, to change the element size (widen or

By Guy Lemieux · #617
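
The invariant stated in #617, EEW/EMUL == SEW/LMUL, determines EMUL once the new element width is chosen. The sketch below solves for EMUL using exact fractions so that fractional register groupings are representable; the helper name is illustrative and not taken from the spec or the thread.

```python
from fractions import Fraction

def emul_for(sew: int, lmul, eew: int) -> Fraction:
    """Return the EMUL that keeps EEW/EMUL equal to SEW/LMUL."""
    return Fraction(eew) * Fraction(lmul) / Fraction(sew)

# Widening: 32-bit source elements at LMUL=1 produce 64-bit results,
# so the destination occupies a group of two registers.
print(emul_for(sew=32, lmul=1, eew=64))

# Narrowing: 32-bit elements at LMUL=1 narrowed to 16 bits need half a register.
print(emul_for(sew=32, lmul=1, eew=16))
```

Keeping the ratio fixed means the number of elements per register group is unchanged, which is why a widening operation on a machine with narrow registers has to spill into a larger group.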


Re: Smaller embedded version of the Vector extension
It’s actually not fundamental to the ISA design that VLEN >= ELEN. An implementation with VLEN=32 could support SEW=64 whenever LMUL >= 2. This approach starts to pose code-generation headaches, but

By Andrew Waterman · #616


Re: Smaller embedded version of the Vector extension
OK, so it seems that to run our software (which Tony Cole referred to) we need VLEN>=64 for our embedded application.
Is there any scope for reducing the number of V registers? Could RV32E_Vmin have

By Tariq Kurd · #615


Re: Smaller embedded version of the Vector extension
I am not a fan of the vslide instructions. It seems they expose the size of the vector registers in a very unfortunate way. In particular they break down if VLEN=1. Most code would be better off

By Bruce Hoult · #614
