I think I understand how I confused the situation.
Issue #458 introduced the idea of using rd and rs1 values to encode more bits for vsetvli.
I proposed that this become the only vsetvli format.
Krste countered that the current format could be extended later to adopt the new format, if needed, as long as a field encoding remained otherwise unused.
I agreed that this was technically possible, but I did not raise a concern that this could have negative consequences.
In the meantime, I opened #460, which in addition to the rd and rs1 encoding, avoided using a bit within vtype to allow for vl calculation based on lmul of 3,5,6 or 7.
In my mind, #460 raised all the concerns and considerations present in #458.
Further, it provided additional support for the rd/rs1 format by using the novel encoding in a unique way.
As a result I closed #458 to have all the relevant discussion tracked on #460.
It could, however, easily have been inferred that I closed #458 because the "escape mechanism" was perfect.
The closing comments in #458, however, explicitly recommend that the concern be revisited as V1.0 approaches.
If the “resolve for v1.0” label had been available, then I likely would have suggested it for #458 and definitely for #460.
The intro in #460 also implies the need to give early consideration to this format:
A1.On 2020-06-30 11:12 a.m., Krste Asanovic wrote:
For 1.0, we are just trying to fix vsew, vlmul, vma, and vta (and also vill in vtype, but that’s out of vsetvli immediate range).
If this is indeed true, then this makes the fields candidates for vtype fields that are only set by vsetvl (those in range [XLEN-2:11])
and agnostic provides significant PPA benefit for renamed register machines, especially with long vectors.
I agree they likely have merit; I advocated for their inclusion in vtype, and in vsetvli.
I am proposing that we seriously consider the consequences of providing a vsetvli instruction that has so limited an immediate field.
There are alternatives; #458 and #460 are two that increase functionality (complete lmul range) and immediate bit encoding (by up to 6 bits).
Using vlmul = "100" for vsetvl opcode decoding rather than the immediate sign bit (bit 31) is another low-cost approach that recovers a bit.
And of course there are other alternatives.
Are you saying that vsew, vlmul, vma, vta should not be in the vsetvli immediate space?
As reasoned above, vma and vta are candidates to be removed.
Conversely, vsew and vlmul are prime candidates for inclusion in the vsetvli immediate space because:
they are essential to the "set vl" function, and
they are common modifiers to base operations (as in the expected 64-bit opcode space), and
they are often used in conjunction with one another and
many code examples show sew/lmul variation within typical loops.
This is another aspect that needs to form part of the reasoning about the sufficiency of vsetvli immediate space:
Pressure on the immediate form of the instruction would be drastically diminished if
only those fields that definitively provide an appreciable benefit to code efficiency are included.
(in particular, if the field can be hoisted from the loop it is not a good candidate).
To me, the combination of removing vma and vta from the immediate and
using lmul="100" for vsetvl encoding
removes sufficient pressure that
an immediate bit could be used to expand lmul to 3,5,6 and 7 and
still provide for judicious inclusion of warranted future immediates
for years, without invoking the rd/rs1 encoding.
However, switching to rd/rs1 encoding does provide a substantial margin for error and neatly addresses the lmul=3,5,6 and 7 concern.
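To make the bit-budget argument concrete, here is a rough sketch of the arithmetic. The field widths (3 bits each for vsew and vlmul, 1 bit each for vta and vma) and the 11-bit vsetvli immediate are my assumptions taken from the draft layout under discussion; the helper name `spare_bits` is purely illustrative and not part of any proposal.

```python
# Assumed field widths in bits, per the draft vtype layout under discussion.
FIELD_BITS = {"vsew": 3, "vlmul": 3, "vta": 1, "vma": 1}

# Assumed width of the vsetvli immediate (bit 31 is reserved to tell
# vsetvli apart from vsetvl, so it is not counted here).
ZIMM_BITS = 11


def spare_bits(fields, imm_bits=ZIMM_BITS, extra_bits=0):
    """Return the immediate bits left after packing `fields` plus `extra_bits`."""
    used = sum(FIELD_BITS[name] for name in fields) + extra_bits
    if used > imm_bits:
        raise ValueError(f"{used} bits do not fit in a {imm_bits}-bit immediate")
    return imm_bits - used


# Current plan: vsew, vlmul, vta and vma all live in the immediate.
current = spare_bits(["vsew", "vlmul", "vta", "vma"])

# Alternative argued above: drop vta/vma from the immediate, recover bit 31
# by decoding vsetvl from vlmul = "100", and spend one bit on lmul 3, 5, 6, 7.
alternative = spare_bits(["vsew", "vlmul"], imm_bits=ZIMM_BITS + 1, extra_bits=1)

print(current, alternative)
```

Under these assumptions the current plan leaves 3 unallocated immediate bits, while the alternative leaves 5 even after spending one on the extended lmul range, all without the rd/rs1 encoding.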