chapter 7.8. Vector Load/Store Segment Instructions
Alexander Podoplelov
Hello! I have a question about vector segment loads and stores. In Table 14, NFIELDS ranges from 1 to 8. In paragraphs 7.8.1-3 the formats are given as
vlseg<nf>e<eew>.v vd, (rs1), vm
vsseg<nf>e<eew>.v vs3, (rs1), vm
From the specification it is not clear to me whether an instruction like
vlseg1e8.v vd, (rs1), vm
is possible. This question applies to all vector segment loads and stores. Right now the assembler does not recognize opcodes for these instructions. Although there is no sense in using vlseg1e8.v vd, (rs1), vm (please correct me if I am wrong), I suppose the specification should note somewhere whether these opcodes are supported.
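To make the encoding side of this question concrete, here is a minimal Python sketch of the unit-stride vector load encoding. It assumes the field layout I recall from the V 1.0 spec (nf in bits 31:29 holding NFIELDS-1, mew, mop, vm, lumop, rs1, width, vd, and the LOAD-FP major opcode); the function name `encode_unit_stride_load` is hypothetical and not part of any tool.

```python
# Sketch only: field positions are taken from my reading of the V 1.0
# load/store encoding and should be checked against the spec.
WIDTH = {8: 0b000, 16: 0b101, 32: 0b110, 64: 0b111}  # width field per EEW
LOAD_FP = 0b0000111                                   # major opcode


def encode_unit_stride_load(nfields, eew, vd, rs1, vm=1):
    """Encode vlseg<nfields>e<eew>.v vd, (rs1); vm=1 means unmasked."""
    if not 1 <= nfields <= 8:
        raise ValueError("NFIELDS must be in 1..8")
    nf = nfields - 1          # the nf field stores NFIELDS minus one
    mew, mop, lumop = 0, 0b00, 0b00000  # plain unit-stride load
    return ((nf << 29) | (mew << 28) | (mop << 26) | (vm << 25)
            | (lumop << 20) | (rs1 << 15) | (WIDTH[eew] << 12)
            | (vd << 7) | LOAD_FP)


# A hypothetical "vlseg1e8.v v0, (a0)" has nf == 0, which is the same bit
# pattern as the ordinary "vle8.v v0, (a0)".
one_field = encode_unit_stride_load(1, 8, vd=0, rs1=10)
two_fields = encode_unit_stride_load(2, 8, vd=0, rs1=10)
```

Under these assumptions, NFIELDS=1 leaves no separate opcode space for a vlseg1 form: the bits are those of the non-segment load, so the open question is only whether the assembler accepts the extra mnemonic as an alias.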
|
|
about masked-off bits for instructions vmsbf.m, vmsif.m, vmsof.m
lilei2@...
Hi,
I have a question about masked-off bits. I am not sure what the behavior of the destination's inactive (masked-off) bits is for the instructions vmsbf.m, vmsif.m, and vmsof.m. Does the "xxxx" mean we can fill these bits with any value, regardless of vtype.vma? I copied the example below from section 15.4 of the frozen RVV 1.0 spec:
1 1 0 0 0 0 1 1 v0 contents
1 0 0 1 0 1 0 0 v3 contents
vmsbf.m v2, v3, v0.t
0 1 x x x x 1 1 v2 contents
In addition, do all mask-result instructions, such as the vector integer compare instructions, need to fill the masked-off bits according to the vtype.vma policy?
And is an implementation allowed to support only mask-agnostic and tail-agnostic behavior for mask-result instructions? Thanks.
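The quoted example can be reproduced with a small Python model of masked vmsbf.m. This is only a sketch: it assumes the spec's drawing convention that the leftmost column is the highest element number, and it reports inactive elements as `None` (printed as "x") without deciding which value the vma policy would leave there. The names `vmsbf_m` and `show` are made up for illustration.

```python
def vmsbf_m(vs2, v0=None):
    """Set-before-first over the active elements; lists are element 0 first."""
    out = []
    seen_set = False
    for i, bit in enumerate(vs2):
        if v0 is not None and not v0[i]:
            out.append(None)      # inactive: value depends on the mask policy
            continue
        if bit:
            seen_set = True       # first active set bit found
        out.append(0 if seen_set else 1)
    return out


def show(bits):
    """Print highest element first, as in the spec's diagrams."""
    return " ".join("x" if b is None else str(b) for b in reversed(bits))


# The example rows, reversed so that index 0 is element 0.
v0 = [1, 1, 0, 0, 0, 0, 1, 1]
v3 = [0, 0, 1, 0, 1, 0, 0, 1]
v2 = vmsbf_m(v3, v0)
```

Note that the set bits of v3 in elements 2 and 4 are ignored because those elements are inactive; only the active set bit in element 7 ends the run of ones.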
|
|
Re: The Width of vcsr and vstart
Thanks for spotting the oversight.
The spec was updated to indicate these should be treated as XLEN-bit wide registers. There is no effective difference right now given that upper bits are not currently defined, but there may be some use for >32 bits in vcsr in some distant future. Vstart could also acquire extra exception state in some distant future.
Krste
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
Allen Baum
If the trap-on-masked-fflags operation isn't executed often, then a 4-instruction sequence (CSRRD FFLAGS, ANDI, BNE .+4, ECALL) would do that, so there is a workaround. If that has a performance impact, then it argues that you may need actual trapping behavior.
On Fri, Dec 17, 2021 at 5:05 PM ghost <ghost@...> wrote:
> I’d suggest identifying important use cases for this. I’d also be looking at
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
I'll note that ARM appears to detect tininess before rounding, while x86 does so after rounding.
Also, current ARM compilers don't support exception trapping on AArch64.
https://developer.arm.com/documentation/dui0808/a/floating-point-support/exception-types-recognized-by-the-arm-floating-point-environment
These decisions would not seem to match an intent by ARM to emulate x86 FP behavior to ease porting.
Krste

On Fri, 17 Dec 2021 16:56:58 -0800, Zalman Stern <zalman@...> said:
| I’d suggest identifying important use cases for this. I’d also be looking at software techniques where the compiler inserts checks to provide the necessary support for the use cases first.
| Probably the number one use case is a software emulator for x86 binaries on RISC V. (Because one has to provide the exact x86 behavior regardless of whether it is a strong requirement for significant applications.) This alone could have driven things for ARM. The way to investigate would be to look at how Apple’s emulator works.
| Glancing at the large corpus of code one can search at Google, yeah, there's enough stuff claiming a SIGFPE is going to happen in certain circumstances that floating point exceptions can't be written off. But most of it looks like stuff that would far better be handled by having the compiler check a hardware provided flag and raise the exception rather than having hardware do everything. (It is mostly stuff that is providing some fairly widely used, non-HPC, mathematical functionality and trying to ensure a program crashes when numerical invariants are violated.)
| My first thought was to ask why one would want this at all as I've done a fair bit of signal-processing/HPC-ish work in shipping applications and floating-point exceptions are only ever used as a debugging tool. Generally most of my interaction with the feature has been fixing performance and correctness issues when floating-point exceptions inadvertently get enabled.
| -Z-
| On Fri, Dec 17, 2021 at 3:46 PM Bruce Hoult <bruce@...> wrote:
| On Sat, Dec 18, 2021 at 9:09 AM Earl Killian <earl.killian@...> wrote:
| The question I have is whether having this in scalar only would be sufficient? If porting an application were to need exception traps, it seems plausible to disable compiler vectorization.
| The MIPS patent should have expired by now, so it would solve the problem (except for inexact) on a simple in-order core. Does anyone know if x86 code uses inexact traps?
| What's the use-case for trapping on inexact (or even caring about it) in FP? Using doubles as 53 bit integers? I did that myself in accounting software back in the 80s and 90s, but it's a bit pointless on a 64 bit machine.
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
ghost
> I’d suggest identifying important use cases for this. I’d also be looking at
Along with this, I'd suggest considering an extension that consists of just one instruction: trap if (FP flags & mask in instruction) is non-zero. I'm not a hardware designer, but it seems to me that this would allow floating-point computation to run at full speed until a point selected by the programmer or compiler where a precise trap was needed, and the more instructions the compiler can place between the FP computation and the conditional trap, the less likely a pipeline stall.
--
L Peter Deutsch <ghost@...> :: Aladdin Enterprises :: Healdsburg, CA
Was your vote really counted? http://www.verifiedvoting.org
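The semantics of such a conditional trap are easy to model in software. The following Python sketch is only an illustration: it uses the standard fflags bit assignments (NX, UF, OF, DZ, NV in bits 0 to 4), while `FPTrap` and `trap_if_flags` are hypothetical names standing in for the proposed instruction, which does not exist in any ratified extension.

```python
# Accrued exception flag bits as laid out in fflags (bit 0 = NX ... bit 4 = NV).
NX, UF, OF, DZ, NV = 0x01, 0x02, 0x04, 0x08, 0x10


class FPTrap(Exception):
    """Stands in for the precise trap the proposed instruction would raise."""

    def __init__(self, flags):
        super().__init__(f"FP flags hit: {flags:#04x}")
        self.flags = flags


def trap_if_flags(fflags, mask):
    """Model of 'trap if (fflags & mask) != 0'; returns None otherwise."""
    hit = fflags & mask & 0x1F
    if hit:
        raise FPTrap(hit)
    return None
```

For example, with only inexact accrued, `trap_if_flags(NX, DZ | NV)` falls through, while `trap_if_flags(DZ | NX, DZ | NV)` raises with only the DZ bit reported. The same check is what a compiler-inserted read-mask-branch sequence would compute, just in several instructions instead of one.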
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
Zalman Stern
I’d suggest identifying important use cases for this. I’d also be looking at software techniques where the compiler inserts checks to provide the necessary support for the use cases first.
Probably the number one use case is a software emulator for x86 binaries on RISC V. (Because one has to provide the exact x86 behavior regardless of whether it is a strong requirement for significant applications.) This alone could have driven things for ARM. The way to investigate would be to look at how Apple’s emulator works.
Glancing at the large corpus of code one can search at Google, yeah, there's enough stuff claiming a SIGFPE is going to happen in certain circumstances that floating point exceptions can't be written off. But most of it looks like stuff that would far better be handled by having the compiler check a hardware provided flag and raise the exception rather than having hardware do everything. (It is mostly stuff that is providing some fairly widely used, non-HPC, mathematical functionality and trying to ensure a program crashes when numerical invariants are violated.)
My first thought was to ask why one would want this at all as I've done a fair bit of signal-processing/HPC-ish work in shipping applications and floating-point exceptions are only ever used as a debugging tool. Generally most of my interaction with the feature has been fixing performance and correctness issues when floating-point exceptions inadvertently get enabled.
-Z-
On Fri, Dec 17, 2021 at 3:46 PM Bruce Hoult <bruce@...> wrote:
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
On Sat, Dec 18, 2021 at 9:09 AM Earl Killian <earl.killian@...> wrote:
What's the use-case for trapping on inexact (or even caring about it) in FP? Using doubles as 53 bit integers? I did that myself in accounting software back in the 80s and 90s, but it's a bit pointless on a 64 bit machine.
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
Earl Killian
The question I have is whether having this in scalar only would be sufficient? If porting an application were to need exception traps, it seems plausible to disable compiler vectorization.
The MIPS patent should have expired by now, so it would solve the problem (except for inexact) on a simple in-order core. Does anyone know if x86 code uses inexact traps?
|
|
Re: [RISC-V] [tech-unprivileged] [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
Yes, this would be the obvious path to take. Some use cases, including maybe this one, might prefer FP traps to be horizontal into user mode.
But FP trap handling is too non-standard/heavyweight/buggy to get widespread use in portable code (along with other corners of IEEE FP spec), so agree it doesn’t seem to be on path to RVA mandate.
Krste
|
|
Re: [EXT] Re: [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
Jeff Scott
Completely agree. Was very happy RISC-V did not include FPU exceptions.
Jeff
From: tech-vector-ext@... <tech-vector-ext@...> On Behalf Of Andrew Waterman via lists.riscv.org
Sent: Friday, December 17, 2021 1:48 PM
To: Ken Dockser <kad@...>
Cc: tech-alternate-fp@...; tech-unprivileged@...; tech-vector-ext@...
Subject: [EXT] Re: [RISC-V] [tech-vector-ext] FP Trapped exceptions needed for portability
Defining a standard extension that provides precise traps on FP exceptions seems like a reasonable thing to do, if only to facilitate the use case you mention in a standard way. The strategy would presumably be to add another five bits to the fcsr that indicate which exceptions will raise traps.
But I’ll also briefly remark that not requiring traps on FP exceptions has been a godsend for implementing high-performance in-order cores, where data-dependent traps would preclude early retirement and deferred execution of these instructions. So there’s good reason never to make such an extension mandatory, even in the RVA profiles.
On Fri, Dec 17, 2021 at 11:31 AM Ken Dockser <kad@...> wrote:
|
|
Re: FP Trapped exceptions needed for portability
Andrew Waterman
Defining a standard extension that provides precise traps on FP exceptions seems like a reasonable thing to do, if only to facilitate the use case you mention in a standard way. The strategy would presumably be to add another five bits to the fcsr that indicate which exceptions will raise traps. But I’ll also briefly remark that not requiring traps on FP exceptions has been a godsend for implementing high-performance in-order cores, where data-dependent traps would preclude early retirement and deferred execution of these instructions. So there’s good reason never to make such an extension mandatory, even in the RVA profiles.
On Fri, Dec 17, 2021 at 11:31 AM Ken Dockser <kad@...> wrote:
|
|
FP Trapped exceptions needed for portability
While I understand that it was decided long ago (relatively speaking) that RISC-V would not support trapping on floating-point exceptions, I am wondering if we need to revisit this. I have heard that ARM's rationale for adding floating-point exception trap capabilities in ARMv7.8 was not because of an inherent need for new code, but to enable the efficient porting of x86 code to ARM.
Does anyone out there have any experience with porting x86 code to RISC-V? Has the lack of trapped FP exceptions hindered such porting? Likewise, is there interest in proposing a TG to create an extension that adds FP trap capabilities to scalar and vector FP?
Thanks,
Ken
|
|
Re: The Width of vcsr and vstart
Andrew Waterman
For the current V extension, it's correct to treat both vcsr and vstart as 32-bit registers. I agree the spec should clearly indicate whether or not these registers will always be 32 bits (like fcsr).
On Wed, Dec 15, 2021 at 6:31 PM Tianyi Xia via lists.riscv.org <tianshi.xty=alibaba-inc.com@...> wrote:
|
|
The Width of vcsr and vstart
Tianyi Xia <tianshi.xty@...>
Hi all,
The fcsr defined in the RISC-V Unprivileged ISA is a fixed-length 32-bit CSR. The register structure of vcsr is similar to that of fcsr, so maybe vcsr should also be defined as a fixed-length 32-bit register?
Thanks,
Tianyi Xia
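For readers following this thread, a short Python sketch of why the width makes no practical difference today: the only fields currently defined in vcsr are vxsat (bit 0) and vxrm (bits 2:1), so decoding gives the same result whether the register is viewed as 32 or 64 bits wide. The helper name `decode_vcsr` and the returned dictionary are illustrative assumptions, not anything from the spec.

```python
# vxrm encodings as defined for the fixed-point rounding mode.
VXRM_NAMES = ("rnu", "rne", "rdn", "rod")


def decode_vcsr(value, xlen=64):
    """Split a vcsr value into its currently defined fields."""
    value &= (1 << xlen) - 1          # view the CSR as xlen bits wide
    return {
        "vxsat": value & 1,                       # bit 0
        "vxrm": VXRM_NAMES[(value >> 1) & 0b11],  # bits 2:1
        "reserved": value >> 3,                   # everything above bit 2
    }


narrow = decode_vcsr(0b101, xlen=32)
wide = decode_vcsr(0b101, xlen=64)
```

Both calls return vxsat=1 and vxrm="rdn" with an empty reserved field; the width would only matter if upper bits were given a meaning later.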
|
|
Re: Vector Memory Ordering
On Mon, 13 Dec 2021 12:09:54 -0800, "Ken Dockser" <kad@...> said:
| Reviving this old thread with a question and a suggestion:
| Question: What is the use case for supporting non-idempotent memory in a RISC-V Vector implementation as a part of a general-purpose rich-OS application processor? While I can certainly envision embedded applications that use non-idempotent memory, it seems unlikely that non-idempotent memory would be used when running arbitrary application code.

To reduce overhead, some embedded Linux systems allow user-mode code to access devices directly, e.g., for dpdk networking. If anything, there is a trend to support more user-level access to devices/accelerators to reduce overhead, and to provide more isolation between tasks (as opposed to shared device driver in kernel). Such devices might have non-idempotent memory regions.

| Suggestion: The current Vector specification's comments about supporting non-idempotent memory can easily mislead one into thinking that such support is required in all compliant implementations. We need an explicit clarification in the specification along the lines of "Vector extension support for handling non-idempotent memory accesses is not required in implementations that prohibit or otherwise prevent (e.g., by trapping) such accesses." While my suggested sentence would likely benefit from some wordsmithing, I think that what I am trying to convey is essential in defining what is architecturally required.

We could add some non-normative text as a note to implementers, but this allowance just follows from the general RISC-V concept that certain memory address ranges only support certain operations (PMAs). Calling this out as a special case in the spec could then require repeating the statement all throughout all memory instructions for consistency, to avoid questions about why some instructions have optional support for some memory types versus others. We try to factor out these concepts in the spec.
Krste

| Thanks,
| Ken
|
|
Re: RVV assembler and simulation
Jim Wilson
On Mon, Dec 13, 2021 at 9:34 AM Peter Lieber <peteralieber@...> wrote:
> I am working on some experiment and I need to simulate RVV r1.0. Is spike my best bet for this?
Upstream mainline binutils had rvv 1.0 support added recently. This isn't in any release yet, so you have to use the top of the development tree.
Upstream LLVM has had rvv support for a while, tracking the evolving rvv draft. I don't follow llvm so I don't know the current state. rvv support is certainly there, but I don't know what version they are at. If they have rvv 1.0 support, which is likely, then it is probably only in the development tree and not in the last release.
Yes, spike has rvv support. Again, check the rvv version. There have been a lot of different incompatible draft versions and if you accidentally mix tools that support different drafts it won't work. I would guess that spike has rvv 1.0 support but don't track it so don't know for sure.
Jim
|
|
Re: Vector Memory Ordering
Reviving this old thread with a question and a suggestion:
Question: What is the use case for supporting non-idempotent memory in a RISC-V Vector implementation as a part of a general-purpose rich-OS application processor? While I can certainly envision embedded applications that use non-idempotent memory, it seems unlikely that non-idempotent memory would be used when running arbitrary application code.
Suggestion: The current Vector specification's comments about supporting non-idempotent memory can easily mislead one into thinking that such support is required in all compliant implementations. We need an explicit clarification in the specification along the lines of "Vector extension support for handling non-idempotent memory accesses is not required in implementations that prohibit or otherwise prevent (e.g., by trapping) such accesses." While my suggested sentence would likely benefit from some wordsmithing, I think that what I am trying to convey is essential in defining what is architecturally required.
Thanks,
Ken
|
|
RVV assembler and simulation
I am working on an experiment and I need to simulate RVV r1.0. Is spike my best bet for this?
All I want to start with is writing bare-metal assembly and copying some memory buffers between the sim and the host. Is there an assembler available that supports RVV 1.0?
|
|
Re: RISC-V Vector Extension post-public review updates - fault flagging
On Thu, Nov 18, 2021 at 5:07 PM David Horner <ds2horner@...> wrote:
Simple implementations could simply execute it sequentially. Or have the trap handler execute the loads sequentially if any of them fault.
We don't care how slowly malicious code runs.
It plays just as well as any gather does, in the absence of faults. Faulting is very much NOT expected behaviour. You're probably about to terminate the program anyway, or drop into the debugger. The main requirement is that the user can see which iteration of their loop would have failed if the code had been left as scalar instructions instead of auto-vectorised.
One region, but it could be many page table entries, given sufficiently long vector registers -- up to 17 with 65536 bit VLEN and LMUL=8.
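The figure of 17 can be checked with a little arithmetic. The Python sketch below assumes 4 KiB pages and a single contiguous (unit-stride) access covering a full register group; neither assumption is stated in the message above, and `max_pages_touched` is just an illustrative helper.

```python
def max_pages_touched(vlen_bits, lmul, page_bytes=4096):
    """Worst-case number of pages one contiguous register-group access spans."""
    nbytes = vlen_bits * lmul // 8
    # The worst case starts at the last byte of a page, so the region
    # covers offsets page_bytes-1 .. page_bytes-1 + nbytes-1.
    worst_offset = page_bytes - 1
    return (worst_offset + nbytes - 1) // page_bytes + 1


def aligned_pages(vlen_bits, lmul, page_bytes=4096):
    """Pages spanned when the same access starts on a page boundary."""
    nbytes = vlen_bits * lmul // 8
    return (nbytes - 1) // page_bytes + 1


worst = max_pages_touched(65536, 8)
best = aligned_pages(65536, 8)
```

With VLEN=65536 and LMUL=8 the group holds 65536 bytes, which is exactly 16 pages when aligned and 17 when the start address is not page-aligned.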
|
|