Re: RISC-V Vector Task Group: fractional LMUL
I left the encoding unspecified in the proposal.
That was intentional as I saw various tradeoffs.
However, I now recommend that the codes be in order of increasing VLMAX value, as follows:
vlmul | mnemonic | LMUL | VLMAX | #groups | Grouped registers
---|---|---|---|---|---
0 0 0 | lf16 | 1/16 | VLEN/SEW/16 | 32 | vn (single register, lower 1/16th)
0 0 1 | lf8 | 1/8 | VLEN/SEW/8 | 32 | vn (single register, lower 1/8th)
0 1 0 | lf4 | 1/4 | VLEN/SEW/4 | 32 | vn (single register, lower 1/4th)
0 1 1 | lf2 | 1/2 | VLEN/SEW/2 | 32 | vn (single register, lower 1/2)
1 0 0 | l1 | 1 | VLEN/SEW | 32 | vn (single register in group)
1 0 1 | l2 | 2 | 2*VLEN/SEW | 16 | vn, vn+1
1 1 0 | l4 | 4 | 4*VLEN/SEW | 8 | vn, …, vn+3
1 1 1 | l8 | 8 | 8*VLEN/SEW | 4 | vn, …, vn+7
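The ordering in the table can be sketched in a few lines of Python. This is only an illustration of the proposed encoding, not part of any specification: the function names (`decode_vlmul`, `vlmax`, `num_groups`) are hypothetical, and VLMAX is returned as an exact fraction so that a value below 1 shows a setting where no whole element fits.

```python
from fractions import Fraction

# Mnemonics indexed by the 3-bit vlmul code, in the order given in the table.
MNEMONICS = ["lf16", "lf8", "lf4", "lf2", "l1", "l2", "l4", "l8"]


def decode_vlmul(code: int) -> tuple[str, Fraction]:
    """Return (mnemonic, LMUL) for a 3-bit vlmul code (hypothetical helper)."""
    if not 0 <= code <= 7:
        raise ValueError("vlmul is a 3-bit field")
    # Codes 0..7 map to LMUL = 2**(code - 4): 1/16, 1/8, ..., 8.
    return MNEMONICS[code], Fraction(2) ** (code - 4)


def vlmax(vlen: int, sew: int, code: int) -> Fraction:
    """VLMAX = (VLEN / SEW) * LMUL, kept exact rather than rounded."""
    _, lmul = decode_vlmul(code)
    return Fraction(vlen, sew) * lmul


def num_groups(code: int) -> int:
    """Number of register groups: 32 for LMUL <= 1, otherwise 32 / LMUL."""
    _, lmul = decode_vlmul(code)
    return int(32 // max(lmul, Fraction(1)))


if __name__ == "__main__":
    # Example values only: VLEN = 128 bits, SEW = 8 bits.
    for code in range(8):
        name, lmul = decode_vlmul(code)
        print(f"{code:03b} {name:5} LMUL={lmul} VLMAX={vlmax(128, 8, code)} "
              f"groups={num_groups(code)}")
```

Because LMUL doubles with each code, VLMAX for a fixed VLEN and SEW increases monotonically with the numeric value of vlmul, which is the property the ordering is meant to provide.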
Special lmul code in vsetvl{i} to derive LMUL from existing SEW/LMUL and provided sew code.
As mentioned in the TG, we suggested widening LMUL to 3 bits with 7 (explicit) states.
I suggest we name them l1, l2, l4, l8 as before, and lf2, lf4, lf8 for the fractional values 1/2, 1/4, 1/8 respectively.
lf16 may be useful in larger machines, but already lf8 cannot be used on minimal RV32 machines.
(I recommend we use lf2, lf4 and lf8 rather than f2, f4, f8 to avoid confusion with float and to associate them with LMUL as a fractional value.)
I propose the remaining value be used to derive LMUL from the existing SEW/LMUL and provided sew code.
Specifically, new LMUL = (new sew / old SEW) * old LMUL, which retains the initial SEW/LMUL ratio.
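As a worked illustration of that formula, here is a minimal Python sketch. The function name `derive_lmul` and the choice to raise an error when the result is not one of the seven explicit values are assumptions made for the example; the proposal text does not say what should happen in that case.

```python
from fractions import Fraction

# The seven explicit LMUL states named in the text.
EXPLICIT_LMULS = {
    Fraction(1, 8): "lf8",
    Fraction(1, 4): "lf4",
    Fraction(1, 2): "lf2",
    Fraction(1): "l1",
    Fraction(2): "l2",
    Fraction(4): "l4",
    Fraction(8): "l8",
}


def derive_lmul(old_sew: int, old_lmul, new_sew: int) -> Fraction:
    """new LMUL = (new sew / old SEW) * old LMUL (hypothetical helper)."""
    new_lmul = Fraction(new_sew, old_sew) * Fraction(old_lmul)
    # Assumption for this sketch: reject results outside the explicit states.
    if new_lmul not in EXPLICIT_LMULS:
        raise ValueError(f"derived LMUL {new_lmul} is not representable")
    return new_lmul


if __name__ == "__main__":
    # Example: old SEW = 32 with LMUL = 2, switching to SEW = 8.
    new_lmul = derive_lmul(32, 2, 8)
    print(EXPLICIT_LMULS[new_lmul], new_lmul)  # prints "lf2 1/2"
    # The SEW/LMUL ratio is unchanged: 32/2 == 8/(1/2) == 16.
    assert Fraction(32) / 2 == Fraction(8) / new_lmul
```

Keeping SEW/LMUL constant also keeps VLMAX = VLEN/SEW*LMUL constant, so the same number of elements is covered before and after the change of element width.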