For those not following on GitHub:
On 2020-04-19 1:00 a.m., Krste Asanovic wrote:
....
I think for this reason
[re-positioned below]
we should stick with fixed vle8, vle16, vle32, vle, in base encoding
{8,16,32,SEW}. These are more readable and can be used to interleave
load/stores of larger than SEW values without changing vtype (e.g.,
moving 32b values when SEW=16).
[re-positioned here]
For load/store instructions, we could modify these to have relative
element widths that are fractions of SEW {SEW, SEW/2, SEW/4, SEW/8}.
Assembly syntax could be something like vle, vlef2, vlef4, vlef8.
There is a challenge in readability that these are relative to last
vtype setting,
SEW and LMUL values are essential to correct code execution
regardless of load/store width encoding.
They should be assembler directive variables set automatically
by vsetvli (and vsetvl when its xs2 argument is statically
defined).
For dynamic xs2 and vsetvl, a manual assembler directive should
be available.
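
As a rough illustration of the tracking described above, here is a
minimal Python sketch of how an assembler might carry SEW and LMUL from
line to line. Everything in it is an assumption for illustration only:
the class name VtypeTracker, the manual directive spelled ".vtype", the
restriction to integer LMUL, and the requirement that both the eNN and
mN fields are written explicitly. It is not how any existing assembler
works.

```python
import re


class VtypeTracker:
    """Tracks the SEW and LMUL an assembler could assume at each line.

    Hypothetical sketch: '.vtype' is an invented manual directive, and
    only integer LMUL written as m1, m2, ... is recognised.
    """

    def __init__(self):
        self.sew = None   # None means "not statically known"
        self.lmul = None

    def feed(self, line):
        tokens = re.split(r"[\s,]+", line.strip())
        op = tokens[0]
        if op in ("vsetvli", ".vtype"):
            # Immediate form or manual directive: values are in the text.
            for tok in tokens[1:]:
                if re.fullmatch(r"e\d+", tok):
                    self.sew = int(tok[1:])
                elif re.fullmatch(r"m\d+", tok):
                    self.lmul = int(tok[1:])
        elif op == "vsetvl":
            # Register form: vtype is dynamic, so forget what we knew
            # until a manual directive restores it.
            self.sew = None
            self.lmul = None


tracker = VtypeTracker()
tracker.feed("vsetvli t0, a0, e16,m2")   # tracker.sew == 16, tracker.lmul == 2
tracker.feed("vsetvl t0, a0, a1")        # both become None
tracker.feed(".vtype e32,m1")            # tracker.sew == 32, tracker.lmul == 1
```
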
This should help in various situations, including validation
that SEW/LMUL ratio is maintained by a given vsetvli, and also
for load/store syntax:
With this in place the assembler code can translate e8 to the
correct SEW * factor value in load/stores eliminating the readability
concern.
and also when SEW is less than 64,
some become useless.
I agree that the base should have as robust an encoding without
over committing available bits.
Thus I would also want to move 32b values when SEW=16, and in
addition:
- move 64b values when SEW=16
- move 64b values when SEW=256, or 128 or 512
- and various more combinations
- and not waste encoding when SEW < 64.
I also believe that load/store are so important, so pivotal
(e.g. matrix transforms) that flexibility and efficiency are
both mandated.
The compress load/store format seeks to address efficiency.
The encoding needs the flexibility of SEW * factors.
I propose the factors be dependent upon the current SEW value:
For SEW=8 the encoding yields factors of 1,2,4 and 8
For SEW=16 the encoding yields factors of 1/2, 1, 2 and 4
For SEW of 32 and above the encoding yields factors of 1/4,
1/2, 1 and 2.
Thus we always support LEW = SEW * 2 operations and support
load/store of SEW/2 and SEW/4 when they exist.
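
The factor table above can be written out as a small Python sketch. The
function names (factors_for, widths_for, encoding_for) are made up for
this example, and the mapping of factors to encoding values 0..3 in
ascending order is an assumption; the proposal only lists the factors,
not their bit patterns.

```python
from fractions import Fraction


def factors_for(sew):
    """Return the four SEW multipliers available at a given SEW."""
    if sew == 8:
        exponents = (0, 1, 2, 3)      # 1, 2, 4, 8
    elif sew == 16:
        exponents = (-1, 0, 1, 2)     # 1/2, 1, 2, 4
    else:
        exponents = (-2, -1, 0, 1)    # 1/4, 1/2, 1, 2 for SEW >= 32
    return tuple(Fraction(2) ** e for e in exponents)


def widths_for(sew):
    """Return the load/store element widths (in bits) reachable at this SEW."""
    return [int(sew * f) for f in factors_for(sew)]


def encoding_for(sew, width):
    """Return the assumed 0..3 encoding for an absolute width, or None.

    This is the translation an assembler could do from an absolute
    width such as e8 to the relative factor, once it knows SEW.
    """
    widths = widths_for(sew)
    return widths.index(width) if width in widths else None


for sew in (8, 16, 32, 64):
    print(sew, widths_for(sew))
```

With these definitions, widths_for gives 8, 16, 32 and 64 bits for each
of SEW=8, SEW=16 and SEW=32, and 16, 32, 64 and 128 bits for SEW=64, so
no encoding point is spent on a width below 8 bits.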