Re: Issue #365 vsetvl{i} x0, x0 instruction forms

Krste Asanovic

On Thu, 23 Jul 2020 03:27:03 -0700, Andrew Waterman <andrew@...> said:
| On Wed, Jul 22, 2020 at 11:42 PM <krste@...> wrote:
|||||| On Wed, 22 Jul 2020 23:37:02 -0700, Andrew Waterman <andrew@...> said:
| | On Wed, Jul 22, 2020 at 11:19 PM David Horner <ds2horner@...> wrote:
| |     #3) If vill is set should vl remain unchanged? (I vote for yes).
| | Other vsetvl[i] instructions that set vill=1 also set vl=0.  Deviating from that course would be needlessly
| | painful and not especially beneficial.
| It does add a non-orthogonality, but it is certainly beneficial in
| renamed machines to know that vl is never changed by the instruction.

| Disagreed. It’s fine to treat vsetvl instructions that set vill as pipeline flushes. Uarch can therefore assume vl
| isn’t changed.

There is still a non-trivial complexity/verification cost here versus
never changing vl.

But I think there's another detail we've been overlooking that we need
to consider.

The vsetvl variant with vtype as register operand is used to restore
vector register state after a context swap. It is not currently
clearly specified, but in the case that the restored vtype value has
vill bit set, the current text implies vl should be cleared.

Section 6.1
"If the vtype setting is not supported by the implementation, then the
vill bit is set in vtype, the remaining bits in vtype are set to zero,
and the vl register is also set to zero."

If we are to allow vl to be set to any value when new vtype.vill=1,
then we have to define a rule for how the source vl value affects the vl
CSR. This would simply be "truncate to number of supported vl bits",
though we need to consider the (small) cost of implementing this rule
correctly when the priv architecture supports emulating shorter VLEN
at lower privilege levels.
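
As a rough illustration of the "truncate to number of supported vl bits"
rule, here is a small Python sketch. The function names and the default
limits (largest LMUL of 8, smallest SEW of 8) are assumptions made for
the example, as is the choice to size vl so it can hold the largest
VLMAX itself. Emulating a shorter VLEN is modelled simply by passing a
smaller vlen.

```python
def vl_bits(vlen: int, max_lmul: int = 8, min_sew: int = 8) -> int:
    """Bits needed to hold the largest possible vl (assumed sizing rule)."""
    # Largest VLMAX occurs at the largest LMUL and the smallest SEW.
    vlmax_max = vlen * max_lmul // min_sew
    # vl must be able to hold vlmax_max itself, not just vlmax_max - 1.
    return vlmax_max.bit_length()


def write_vl_truncated(xavl: int, vlen: int) -> int:
    """Value left in vl if only the low vl bits of xavl are kept."""
    mask = (1 << vl_bits(vlen)) - 1
    return xavl & mask
```

With these assumptions a VLEN=128 machine has an 8-bit vl, so a source
value of 300 would land in vl as 300 & 255.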

I think there are two orthogonal decisions to take:

"vill=1 on SEW/LMUL change" or "vill=1 on vl change" during vsetvl{i} x0, x0

a) "vill=1 on vl change"

b) "vill=1 on SEW/LMUL change"

Proposal a) The "vill=1 on vl change" form supports additional
functionality. The implied read of vl is a RAW dependency that
microarchitectures have to either resolve ahead of execution, or
speculate that vl doesn't change and flush on mispredict. It's not
clear to me when this additional functionality is useful, as it
overlaps with fractional LMUL functionality, but possibly when it is
known application vectors would fit into non-power-of-2 vector register

Proposal b) The "vill=1 on SEW/LMUL change" form avoids a read of the
current vl but limits vtype changes to constant SEW/LMUL ratios.

The current plan of record is option z)
z) "vill=1 only on bad new SEW/LMUL", which allows vl to change without
reporting vill.
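
To make the difference between a), b) and z) concrete, here is a rough
Python sketch of when vsetvl{i} x0, x0 would set vill under each rule.
VLEN, the supported SEW/LMUL sets and all function names are
hypothetical, and reading "vl change" as "current vl exceeds the new
VLMAX" is my interpretation for the sketch, not spec text.

```python
from fractions import Fraction

VLEN = 128                       # hypothetical vector register length in bits
SUPPORTED_SEW = {8, 16, 32, 64}  # hypothetical supported element widths
SUPPORTED_LMUL = {Fraction(1, 8), Fraction(1, 4), Fraction(1, 2),
                  Fraction(1), Fraction(2), Fraction(4), Fraction(8)}


def vlmax(sew, lmul):
    """VLMAX = LMUL * VLEN / SEW, truncated to an integer."""
    return int(VLEN * Fraction(lmul) / sew)


def supported(sew, lmul):
    """True if this SEW/LMUL pair is usable on the hypothetical machine."""
    return (sew in SUPPORTED_SEW and Fraction(lmul) in SUPPORTED_LMUL
            and vlmax(sew, lmul) >= 1)


def sets_vill(policy, vl, old, new):
    """Would vsetvl{i} x0, x0 set vill? old and new are (sew, lmul) pairs."""
    new_sew, new_lmul = new
    if not supported(new_sew, new_lmul):
        return True                    # bad SEW/LMUL sets vill under every rule
    if policy == "a":                  # vill=1 on vl change: needs current vl
        return vl > vlmax(new_sew, new_lmul)
    if policy == "b":                  # vill=1 on SEW/LMUL ratio change: no vl read
        old_sew, old_lmul = old
        return Fraction(old_sew) / Fraction(old_lmul) != Fraction(new_sew) / Fraction(new_lmul)
    if policy == "z":                  # plan of record: vl may change silently
        return False
    raise ValueError(f"unknown policy {policy!r}")
```

For example, going from SEW=32, LMUL=1 to SEW=64, LMUL=1 with vl=2 is
accepted by a), since vl still fits in the new VLMAX of 2, but rejected
by b), since the SEW/LMUL ratio changed.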

vl zeroing on vill

c) any time vill is set, vl is zeroed.

d) vl never changes even if vtype.vill is set in vsetvl{i} x0, x0. "vsetvl
rd, xavl, xnewvtype" form writes vl with LSBs of xavl when
xnewvtype.vill=1 (otherwise as before). These instruction forms could
be renamed to "vsetvtype{i}" to make this distinction clearer.

Proposal c) is simpler conceptually. In particular, if a SEW/LMUL
configuration is not supported, then no matter which instruction form
is used to set vtype, vill will be set and vl zeroed. But a) requires
uarchs to also zero vl for requested vl changes on "vsetvl{i} x0, x0".

Proposal d) adds a little complexity to the vsetvl form of the
instruction, but there is already a path to write vl from xavl, so the
vill=1 case would be the same as the VLMAX>=xavl case (I think this
means the "emulate shorter VLEN machine" mechanism drops out of the same
path). The conceptual complexity (admittedly, with not much practical
impact) is that a bad SEW/LMUL setting can set vill but not change vl
depending on form, but changing the assembly instruction name should
make this easier to explain.
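
A rough Python sketch of how c) and d) differ for the "vsetvl rd, xavl,
xnewvtype" form. The VState class, the 8-bit VL_MASK and the function
signature are all hypothetical, and using min(xavl, vlmax) for the
vill=0 case is a simplification of the existing vl-setting rules.

```python
from dataclasses import dataclass

VL_MASK = 0xFF  # hypothetical: this machine implements 8 bits of vl


@dataclass
class VState:
    vl: int = 0
    vill: bool = False
    vtype_bits: int = 0  # remaining vtype fields, treated as opaque here


def vsetvl(state, xavl, vtype_bits, vill, vlmax, policy):
    """Model of 'vsetvl rd, xavl, xnewvtype'; returns the value written to rd.

    vill is True when the new vtype ends up with vill=1, either because
    xnewvtype had it set (context restore) or the setting is unsupported.
    """
    if vill:
        state.vill = True
        state.vtype_bits = 0           # Section 6.1: remaining vtype bits zeroed
        if policy == "c":
            state.vl = 0               # c) vl is zeroed whenever vill is set
        elif policy == "d":
            state.vl = xavl & VL_MASK  # d) vl takes the LSBs of xavl
        else:
            raise ValueError(f"unknown policy {policy!r}")
    else:
        state.vill = False
        state.vtype_bits = vtype_bits
        state.vl = min(xavl, vlmax)    # simplified stand-in for the normal rule
    return state.vl
```

Restoring a saved context with vill=1 and vl=5 gives vl=0 under c) and
vl=5 under d), which is the case d) needs in order to round-trip state
that "vsetvl{i} x0, x0" left behind.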


a) & c) vl read / vl written

a) & d) vl read / vl not written

b) & c) vl not read / vl written

b) & d) vl not read / vl not written

