GNU toolchain with RVV intrinsic support


David Horner
 

Thank you for the clarification. 
Excellent.


Bruce Hoult
 

On Tue, Aug 25, 2020 at 5:34 AM David Horner <ds2horner@...> wrote:
Thank you very much for this advancement.
I have two concerns; my comments are inline in the body below.

On 2020-08-21 9:34 a.m., Kito Cheng wrote:
I am pleased to announce that our (SiFive's) RVV intrinsic-enabled GCC is now open source.

We have put the sources on riscv's GitHub, and the RVV intrinsics have been integrated into riscv-gnu-toolchain, so you can build the RVV intrinsic-enabled GNU toolchain as usual.

 $ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
 $ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
 $ make newlib build-qemu
 $ cat rvv_vadd.c
>
> #include <riscv_vector.h>
> #include <stdio.h>
>
> void vec_add_rvv
Shouldn't this be vec_add32_rvv ? It is not a generalized vector add.

Users can name their functions anything they want. The example might make that clearer by calling it foo() or demo_vector_add() or something similar.
 
(int *a, int *b, int *c, size_t n) {
>   size_t vl;
>   vint32m2_t va, vb, vc;
>   for (;vl = vsetvl_e32m2 (n);n -= vl) {
>     vb = vle32_v_i32m2 (b);
>     vc = vle32_v_i32m2 (c);
>     va = vadd_vv_i32m2 (vb, vc);
>     vse32_v_i32m2 (a, va);
>     a += vl;
The vector pointer should be advanced by vl * 32.

The variable "a" is an "int *" pointer. When you add an integer to it, C automatically scales the integer (vl) by sizeof(int).
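
The scaling described above can be checked with a small portable C sketch. It needs no RVV toolchain. The helper names (advance_elems, byte_distance, check_pointer_scaling) are hypothetical and are not part of the original example, and the value 8 only stands in for whatever vsetvl would return.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors `a += vl` from the loop: advance an int pointer by vl elements. */
static int *advance_elems(int *a, size_t vl) {
    return a + vl;
}

/* Distance between two pointers into the same array, measured in bytes. */
static size_t byte_distance(const int *from, const int *to) {
    return (size_t)((const char *)to - (const char *)from);
}

/* Hypothetical self-check, not part of the original example. */
static void check_pointer_scaling(void) {
    int buf[16] = {0};
    size_t vl = 8;                       /* stand-in for a vsetvl result */
    int *next = advance_elems(buf, vl);

    assert((size_t)(next - buf) == vl);                    /* vl elements */
    assert(byte_distance(buf, next) == vl * sizeof(int));  /* vl * sizeof(int) bytes */
}
```

On a target where sizeof(int) is 4, advancing by vl = 8 therefore moves the pointer 32 bytes, with no explicit multiplication in the source.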


David Horner
 

Thank you very much for this advancement.
I have two concerns; my comments are inline in the body below.

On 2020-08-21 9:34 a.m., Kito Cheng wrote:
I am pleased to announce that our (SiFive's) RVV intrinsic-enabled GCC is now open source.

We have put the sources on riscv's GitHub, and the RVV intrinsics have been integrated into riscv-gnu-toolchain, so you can build the RVV intrinsic-enabled GNU toolchain as usual.

 $ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
 $ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
 $ make newlib build-qemu
 $ cat rvv_vadd.c
>
> #include <riscv_vector.h>
> #include <stdio.h>
>
> void vec_add_rvv
Shouldn't this be vec_add32_rvv ? It is not a generalized vector add.
(int *a, int *b, int *c, size_t n) {
>   size_t vl;
>   vint32m2_t va, vb, vc;
>   for (;vl = vsetvl_e32m2 (n);n -= vl) {
>     vb = vle32_v_i32m2 (b);
>     vc = vle32_v_i32m2 (c);
>     va = vadd_vv_i32m2 (vb, vc);
>     vse32_v_i32m2 (a, va);
>     a += vl;
The vector pointer should be advanced by vl * 32.
(I originally thought that vl = vsetvl might have done the scaling by 32 and that n was in bytes,
but I have now convinced myself that the problem is likely the pointer advance,
and that VLEN is at least 256, so the loop makes only one pass for the test case below.)
>     b += vl;
>     c += vl;
>   }
> }
>
> int x[10] = {1,2,3,4,5,6,7,8,9,0};
> int y[10] = {0,9,8,7,6,5,4,3,2,1};
> int z[10];
>
> int main()
> {
>   int i;
>   vec_add_rvv(z, x, y, 10);

>   for (i=0; i<10; i++)
>     printf ("%d ", z[i]);
>   printf("\n");
>   return 0;
> }



Kito Cheng
 

I am pleased to announce that our (SiFive's) RVV intrinsic-enabled GCC is now open source.

We have put the sources on riscv's GitHub, and the RVV intrinsics have been integrated into riscv-gnu-toolchain, so you can build the RVV intrinsic-enabled GNU toolchain as usual.

 $ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
 $ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
 $ make newlib build-qemu
 $ cat rvv_vadd.c

#include <riscv_vector.h>
#include <stdio.h>

/* a[i] = b[i] + c[i] for n 32-bit integers. */
void vec_add_rvv(int *a, int *b, int *c, size_t n) {
  size_t vl;
  vint32m2_t va, vb, vc;
  /* vsetvl returns the element count for this pass; 0 ends the loop. */
  for (; (vl = vsetvl_e32m2 (n)); n -= vl) {
    vb = vle32_v_i32m2 (b);
    vc = vle32_v_i32m2 (c);
    va = vadd_vv_i32m2 (vb, vc);
    vse32_v_i32m2 (a, va);
    /* int * arithmetic: each pointer advances by vl elements, not bytes. */
    a += vl;
    b += vl;
    c += vl;
  }
}

int x[10] = {1,2,3,4,5,6,7,8,9,0};
int y[10] = {0,9,8,7,6,5,4,3,2,1};
int z[10];

int main()
{
  int i;
  vec_add_rvv(z, x, y, 10);
  for (i=0; i<10; i++)
    printf ("%d ", z[i]);
  printf("\n");
  return 0;
}

 $ riscv64-unknown-elf-gcc rvv_vadd.c -O2
 $ qemu-riscv64 -cpu rv64,x-v=true,vlen=256,elen=64,vext_spec=v1.0 a.out
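
For readers without the toolchain, the strip-mined loop can be modelled in plain C. This is only a sketch under stated assumptions: model_vsetvl and vec_add_model are hypothetical names, and the model grants vl = min(n, VLMAX) with VLMAX = VLEN * LMUL / SEW, which is a simplification because real hardware is allowed to pick a smaller vl when n lies between VLMAX and 2 * VLMAX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of vsetvl: grant min(n, VLMAX) elements,
   where VLMAX = VLEN * LMUL / SEW. Simplified; see the note above. */
static size_t model_vsetvl(size_t n, size_t vlen_bits, size_t lmul, size_t sew_bits) {
    assert(sew_bits != 0);
    size_t vlmax = vlen_bits * lmul / sew_bits;
    return n < vlmax ? n : vlmax;
}

/* Scalar stand-in for vec_add_rvv with the same loop shape (SEW=32, LMUL=2).
   Returns the number of passes through the strip-mined loop. */
static size_t vec_add_model(int *a, const int *b, const int *c,
                            size_t n, size_t vlen_bits) {
    size_t passes = 0;
    size_t vl;
    while ((vl = model_vsetvl(n, vlen_bits, 2, 32)) != 0) {
        for (size_t i = 0; i < vl; i++)   /* one "vector" add of vl elements */
            a[i] = b[i] + c[i];
        a += vl;                          /* advance by vl elements */
        b += vl;
        c += vl;
        n -= vl;
        passes++;
    }
    return passes;
}
```

Under this model, vlen=256 gives VLMAX = 256 * 2 / 32 = 16, so the 10-element test case completes in a single pass; with a VLEN of 128, VLMAX would be 8 and the same call would take two passes (8 elements, then 2).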

It has been verified with our internal test suite and several internal projects. However, this project is still a work in progress, and we intend to keep improving it. Feedback and bug reports are welcome, as are contributions and pull requests.

Current status:
- Implements ~95% of the RVV intrinsic functions listed in the intrinsic spec (https://github.com/riscv/rvv-intrinsic-doc)
- FP16 is supported for both vector and scalar.
  - FP16 uses __fp16 temporarily; this might change in the future.
- Fractional LMUL is not implemented yet.
- RV32 is not well supported for scalar-vector operations with SEW=64.
- Function calls with vector types are not well supported yet; in the current implementation, arguments are passed and returned in memory.
- *NO* auto-vectorization support.