Does the RISC-V V TG plan to support a low-cost vector extension in the RVMxx profile?
Sent: June 3, 2021 2:13
To: Guy Lemieux <guy.lemieux@...>
Cc: Andrew Waterman <andrew@...>; Tariq Kurd <tariq.kurd@...>; Shaofei (B) <shaofei1@...>
Subject: Re: [RISC-V] [tech-vector-ext] Smaller embedded version of the Vector extension
The VLEN>=128 constraint is only for the application processor "V"
extension for the app profile - not for embedded vectors which can have VLEN=32.
From spec Introduction:
The term base vector extension is used informally to describe the standard set of vector ISA components that will be required for the single-letter "V" extension, which is intended for use in standard server and application-processor platform profiles. The set of mandatory instructions and supported element widths will vary with the base ISA (RV32I, RV64I) as described below.
Other profiles, including embedded profiles, may choose to mandate only subsets of these extensions. The exact set of mandatory supported instructions for an implementation to be compliant with a given profile will only be determined when each profile spec is ratified. For convenience in defining subset profiles, vector instruction subsets are given ISA string names beginning with the "Zv" prefix.
There is a set of Zve* names for the embedded subsets (see github issue #550).
A minimal embedded implementation using RV32E+Zfinx+vectors would have the same state size as ARM MVE.
The P extension does not have floating-point, but it makes sense as an alternative for short integer/fixed-point SIMD.
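To make the state-size comparison concrete, here is a rough back-of-the-envelope sketch in Python. It counts vector register file bits only (no CSRs, no scalar registers), and the helper names are made up for illustration. The MVE figure of 8 x 128-bit registers and the LMUL=4 grouping are taken from the messages quoted further down in this thread.

```python
NUM_VREGS = 32  # RVV defines 32 architectural vector registers


def vector_state_bits(num_regs, reg_bits):
    """Total register-file state in bits (registers only)."""
    return num_regs * reg_bits


def grouped_shape(vlen, lmul, num_regs=NUM_VREGS):
    """(number of register groups, bits per group) for an integer LMUL."""
    return num_regs // lmul, vlen * lmul


rvv_min_bits = vector_state_bits(NUM_VREGS, 32)   # VLEN=32 embedded config
mve_bits = vector_state_bits(8, 128)              # ARM MVE: 8 x 128-bit
rvv_app_bits = vector_state_bits(NUM_VREGS, 128)  # VLEN=128 app-profile minimum

print(rvv_min_bits, mve_bits, rvv_app_bits)  # 1024 1024 4096
print(grouped_shape(vlen=32, lmul=4))        # (8, 128)
```

So under these assumptions the VLEN=32 register file is a quarter the size of the VLEN=128 one, and matches MVE bit for bit.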
The software fragmentation issue is that some library routines that expose VLEN might not be portable between app cores and embedded cores, but these are different software ecosystems (e.g. ABI/calling convention might be different) and only a few kinds of routine rely on VLEN.
For app cores that can afford VLEN>=128, the advantage is the removal of stripmining code in cases that operate on fixed-size vectors.
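As an illustration of the stripmining point, here is a small Python model (not real RVV code). It treats vsetvli as simply returning min(AVL, VLMAX), which is one behavior the spec permits but not the only one, and it uses VLMAX = LMUL*VLEN/SEW. The function names are hypothetical.

```python
def vlmax(vlen, sew, lmul=1):
    """Maximum elements per vector operation: LMUL * VLEN / SEW."""
    return (vlen * lmul) // sew


def stripmined_add(a, b, vlen, sew=32, lmul=1):
    """Elementwise add done in chunks of at most VLMAX elements.

    Returns (result, number_of_passes). Simplified model: each pass
    takes vl = min(remaining, VLMAX) elements.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    limit = vlmax(vlen, sew, lmul)
    if limit < 1:
        raise ValueError("SEW does not fit in the configured register group")
    out, passes, i = [], 0, 0
    while i < len(a):
        vl = min(len(a) - i, limit)  # modelled vsetvli result
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        passes += 1
    return out, passes


a, b = [1, 2, 3, 4], [10, 20, 30, 40]  # a fixed-size 4 x 32-bit vector
print(stripmined_add(a, b, vlen=128))         # ([11, 22, 33, 44], 1)
print(stripmined_add(a, b, vlen=32))          # ([11, 22, 33, 44], 4)
print(stripmined_add(a, b, vlen=32, lmul=4))  # ([11, 22, 33, 44], 1)
```

With VLEN>=128 guaranteed, code handling a fixed 4 x 32-bit vector at LMUL=1 always finishes in one pass, so the loop can be dropped. With VLEN=32 the same code needs either the loop or a larger LMUL.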
On Wed, 2 Jun 2021 05:10:32 -0700, "Guy Lemieux" <guy.lemieux@...> said:
| Allowing VLEN<128 would allow for smaller vector register files, but
| it would also result in a profile that is not forward-compatible with
| the V spec. This would produce another fracture in the software ecosystem.
| To avoid such a fracture, there are two choices:
| (1) go with P instead
| (2) relax the V spec to allow smaller implementations
| So the key question for this group is whether to relax the minimum
| VLEN to 32 or 64?
| note: a possible justification for keeping 128 might be to recommend
| (1) instead. I don’t know anything about P, but it seems like it could
| be speced in a way that is competitive/comparable with Helium.
| PS — I have started to design an “RVV-lite” profile which would be
| more amenable to embedded implementations. However, I have adopted a
| stance that it must remain forward compatible with the full V spec, so
| I have not considered VLEN below 128. I am happy to share my work on
| this and involve other contributors — email me if you would like to see a copy.
| On Wed, Jun 2, 2021 at 3:15 AM Andrew Waterman <andrew@...
| The uppercase-V V extension is meant to cater to apps processors, where
| the VLEN >= 128 constraint is not inappropriate and is sometimes
| beneficial. But there's nothing fundamental about the ISA design that
| prohibits VLEN < 128. A minimal configuration is VLEN=ELEN=32, giving the
| same total amount of state as MVE. (And if you set LMUL=4, then you even
| get the same shape: 8 registers of 128 bits apiece.)
| Such a thing wouldn't be called V, but perhaps something like Zvmin.
| Other than agreeing on a feature set and assigning it a name, the
| architecting is already done.
| (If you search the spec for Zfinx, you'll see that a Zfinx variant is
| planned, but only barely sketched out.)
| On Wed, Jun 2, 2021 at 3:04 AM Tariq Kurd via lists.riscv.org
| Hi everyone,
| Are there any plans for a cut-down configuration of the vector
| extension suitable for embedded cores? It seems that the 32x128-bit
| register file is suitable for application-class cores but it is very
| large for embedded cores, especially if the F registers also need to
| be implemented (which I think is the case, unless a Zfinx version is
| specified).
| ARM MVE only has 8x128-bit registers for FP and Vector, so it is much
| more suitable for embedded applications.
| What’s the approach here? Should embedded applications implement the
| P-extension instead?
| Tariq Kurd
| Processor Design, RISC-V Cores, Bristol
| Huawei Technologies R&D (UK) Ltd
| Address: 290 Park Avenue, Aztec West, Almondsbury, Bristol, Avon, BS32 4TR, UK