Re: Vector Task Group minutes 2020/5/15

David Horner

I would agree that, "by definition", this is a sufficient condition to obtain the instructions Krste was envisioning: cast instructions that are also a nop on SLEN=VLEN machines.
It is likewise a sufficient condition to address the byte mismatch of SLEN<VLEN.

However, is it necessary, given that it is a very expensive operation for SLEN<VLEN?

Are there casting instructions that are reasonably low cost on both SLEN=VLEN and SLEN<VLEN machines and that create an intermediate state that works for both?

And if there are such operations, do we provide only them (and NOT the heavy-handed "as if written to memory and back")?
Can two such instructions together do the full transition from the SLEN<VLEN layout to the SLEN=VLEN layout?
If so, is it sufficiently easy to recognize such a pair and fuse it into a nop on SLEN=VLEN systems?
Can applications alternatively rely on a linkage editor to nop them?

I have no good solution (yet), as the range of microarchitectural tricks is not my forte.

But there are others who undoubtedly are mulling over such considerations.

It would not be a lose-win proposition but a limited win-win.

I look forward to Krste's proposals. I have been surprised before!!

On 2020-05-16 2:07 a.m., Bill Huffman wrote:
It seems like the function of a cast instruction is the same as storing to
memory (stride-1) with one SEW and loading back the same number of bytes
with another SEW. Is that a correct understanding?
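
The store-then-reload reading of a cast can be modeled in a few lines of
Python. This is only a toy sketch, not the spec: it assumes, purely for
illustration, a striped register layout in which element i is placed in
section i mod (VLEN/SLEN), and the function names load, store and cast are
hypothetical. All sizes are in bytes, and SEW is assumed to be no larger
than SLEN.

```python
def _base(i, sew, vlen, slen):
    """Byte offset of element i in the register under the ASSUMED striped layout."""
    nsec = vlen // slen              # number of SLEN-wide sections
    sec, slot = i % nsec, i // nsec  # assumption: elements are dealt round-robin to sections
    return sec * slen + slot * sew


def load(mem, sew, vlen, slen):
    """Unit-stride load: memory image -> register image."""
    reg = [None] * vlen
    for i in range(vlen // sew):
        b = _base(i, sew, vlen, slen)
        reg[b:b + sew] = mem[i * sew:(i + 1) * sew]
    return reg


def store(reg, sew, vlen, slen):
    """Unit-stride store: register image -> memory image (inverse of load)."""
    mem = [None] * vlen
    for i in range(vlen // sew):
        b = _base(i, sew, vlen, slen)
        mem[i * sew:(i + 1) * sew] = reg[b:b + sew]
    return mem


def cast(reg, from_sew, to_sew, vlen, slen):
    """Cast modeled as: store with one SEW, load the same bytes back with another."""
    return load(store(reg, from_sew, vlen, slen), to_sew, vlen, slen)


if __name__ == "__main__":
    reg = list(range(16))  # a 16-byte register with distinct byte values
    print("SLEN=VLEN:", cast(reg, 1, 4, vlen=16, slen=16))
    print("SLEN<VLEN:", cast(reg, 1, 4, vlen=16, slen=8))
```

Under this assumed layout the cast leaves the register bytes untouched when
SLEN=VLEN (there is a single section, so load and store are both the
identity) and becomes a byte permutation when SLEN<VLEN, which is the cost
asymmetry being discussed in this thread.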


On 5/15/20 11:55 AM, Krste Asanovic wrote:

Date: 2020/5/15
Task Group: Vector Extension
Chair: Krste Asanovic
Co-Chair: Roger Espasa
Number of Attendees: ~20
Current issues on github:

Issues discussed:

# MLEN=1 change

The new layout of mask registers with fixed MLEN=1 was discussed. The
group was generally in favor of the change, though there is a proposal
in flight to rearrange bits to align with bytes. This might save some
wiring but could increase bits read/written for the mask in a

#434 SLEN=VLEN as optional extension

Most of the time was spent discussing the possible software
fragmentation from having code optimized for SLEN=VLEN versus
SLEN<VLEN, and how to avoid it. The group was keen to prevent possible
fragmentation, so is going to consider several options:

- providing cast instructions that are mandatory, so at least
SLEN<VLEN code runs correctly on SLEN=VLEN machines.

- consider a different data layout that could allow casting up to ELEN
(<=SLEN); however, such layouts appear to result in an even greater
variety of layouts, or in dynamic layouts

- invent a microarchitecture that can appear as SLEN=VLEN but
internally restricts datapath communication to within an SLEN width of
the datapath, or prove this is impossible/expensive

# v0.9

The group agreed to declare the current version of the spec as 0.9,
representing a clear stable step for software and implementors.
