Thank you very much for this advancement. I have two concerns; my responses are inline in the quoted body below.
On 2020-08-21 9:34 a.m., Kito Cheng wrote:
> I am pleased to announce that our/SiFive's RVV intrinsic-enabled GCC
> is open-sourced now. We put the sources on RISC-V's GitHub, and the
> RVV intrinsics have been integrated into the riscv-gnu-toolchain, so
> you can build the RVV intrinsic-enabled GNU toolchain as usual.
>
> $ git clone git@...:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
> $ <path-to-riscv-gnu-toolchain>/configure --with-arch=rv64gcv_zfh --prefix=<INSTALL-PATH>
> $ make newlib build-qemu
>
> $ cat rvv_vadd.c
>
> #include <riscv_vector.h>
> #include <stdio.h>
>
> void vec_add_rvv
Shouldn't this be vec_add32_rvv? It is not a generalized vector add; it
only handles 32-bit integer elements.
> (int *a, int *b, int *c, size_t n) {
>   size_t vl;
>   vint32m2_t va, vb, vc;
>   for (; vl = vsetvl_e32m2 (n); n -= vl) {
>     vb = vle32_v_i32m2 (b);
>     vc = vle32_v_i32m2 (c);
>     va = vadd_vv_i32m2 (vb, vc);
>     vse32_v_i32m2 (a, va);
>     a += vl;
The vector pointer should be advanced by vl * 32. (I originally thought
that vl = vsetvl might have done the by-32 scaling and that n was in
bytes, but I have now convinced myself that the problem is likely the
pointer advance, and with VLEN at least 256 the loop only makes one pass
for the test case below, so that run would not expose it; see the sketch
after the quoted example.)
>     b += vl;
>     c += vl;
>   }
> }
>
> int x[10] = {1,2,3,4,5,6,7,8,9,0};
> int y[10] = {0,9,8,7,6,5,4,3,2,1};
> int z[10];
>
> int main()
> {
>   int i;
>   vec_add_rvv(z, x, y, 10);
>   for (i = 0; i < 10; i++)
>     printf ("%d ", z[i]);
>   printf ("\n");
>   return 0;
> }
> $ riscv64-unknown-elf-gcc rvv_vadd.c -O2
> $ qemu-riscv64 -cpu rv64,x-v=true,vlen=256,elen=64,vext_spec=v1.0 a.out
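Since a single e32/m2 pass at VLEN=256 covers at most 16 elements, the
10-element test above finishes in one trip through the loop and would not
show a bad pointer advance. Here is a minimal sketch of a multi-pass
check, meant to replace the main() in rvv_vadd.c above (the array names
and the size of 40 are my own choices):

#include <stddef.h>
#include <stdio.h>

void vec_add_rvv (int *a, int *b, int *c, size_t n);

/* 40 elements is more than the 16 a single e32/m2 pass covers at
   VLEN=256, so vec_add_rvv has to loop and the pointer advance is
   exercised past the first 16 elements. */
#define N 40
int xa[N], ya[N], za[N];

int main (void)
{
  int i, errors = 0;
  for (i = 0; i < N; i++) {
    xa[i] = i;
    ya[i] = 2 * i;
  }
  vec_add_rvv (za, xa, ya, N);
  for (i = 0; i < N; i++)
    if (za[i] != 3 * i)
      errors++;
  printf ("%d mismatches\n", errors);
  return errors != 0;
}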
> It is verified with our internal testsuite and several internal
> projects; however, this project is still a work in progress, and we
> intend to improve it continually. Feedback and bug reports are
> welcome, as well as contributions and pull requests.
>
> Current status:
> - ~95% of the RVV intrinsic functions listed in the intrinsic spec
>   (https://github.com/riscv/rvv-intrinsic-doc) are implemented.
> - FP16 is supported for both vector and scalar.
>   - fp16 temporarily uses __fp16; this might change in the future.
> - Fractional LMUL is not implemented yet.
> - RV32 is not well supported for scalar-vector operations with SEW=64.
> - Function calls with vector types are not well supported yet;
>   arguments are passed/returned in memory in the current
>   implementation (see the sketch after this status list).
> - *NO* auto-vectorization support.
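As an illustration of the function-call limitation, a minimal sketch of
the kind of signature affected, where the vector-typed argument and
return value currently go through memory (the name double_i32m2 is made
up, and it uses the implicit-vl intrinsic form from the example above):

#include <riscv_vector.h>

/* Hypothetical helper with a vector-typed parameter and return value;
   per the status note above, such values currently go through memory
   rather than staying in vector registers across the call. */
vint32m2_t double_i32m2 (vint32m2_t v)
{
  return vadd_vv_i32m2 (v, v);
}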