|
[RISC-V] [tech-cmo] Fault-on-first should be allowed to return randomly on non-faults (also, running SIMT code on vector ISA)
https://github.com/riscv/riscv-v-spec/issues/587#issuecomment-711087236 To clarify, Andrew's reading of the spec has vstart>= vl behaviour superseding vl=0 implied behaviour. Thus some vector instruct
|
By
David Horner
· #481
·
|
|
[RISC-V] [tech-cmo] Fault-on-first should be allowed to return randomly on non-faults (also, running SIMT code on vector ISA)
You're incorrectly characterizing FoF below. The FoF loads are not intended for software to dynamically probe the microarch state to check for possible faults. That is not what I am advocating. (though
|
By
David Horner
· #477
·
|
|
[RISC-V] [tech-cmo] Fault-on-first should be allowed to return randomly on non-faults (also, running SIMT code on vector ISA)
Krste: I gather your answer is more in the context of lr/sc type forward guarantees, instructions that are designed not to trap when delivering on their primary function. So I agree that determining a
|
By
David Horner
· #473
·
|
|
Sequence to insert an element
The mask could be built with sew=16, as the mask is ordinal based. And there are tricks to set it up, for example a direct load (register move) to v0 to set the correct bit. The mask could be built in
|
By
David Horner
· #461
·
|
|
[RISC-V] [tech-cmo] Fault-on-first should be allowed to return randomly on non-faults (also, running SIMT code on vector ISA)
The forward progress is to advance to another task. In the case of machine mode it can potentially "resolve" the cause of the vl=0 return and re-execute the loop (without the overhead of the trap). Th
|
By
David Horner
· #459
·
|
|
[RISC-V] [tech-cmo] Fault-on-first should be allowed to return randomly on non-faults (also, running SIMT code on vector ISA)
First I am very happy that "arbitrary decisions by the micro-architecture" allow reduction of vl to any [non-zero] value. Even if such appear "random". Exactly. I see this openness/lack of arbitrary c
|
By
David Horner
· #457
·
|
|
[RISC-V] [tech-cmo] Fault-on-first should be allowed to return randomly on non-faults (also, running SIMT code on vector ISA)
I agree, however, it still does not answer the ISA visible behavioural question: "Is the trap allowed to set vl=0 on return?" Can this be compliant behaviour for certain platforms? If so, then it woul
|
By
David Horner
· #456
·
|
|
Vector TG meeting tomorrow
I will add my thoughts related to "embedded" imprecise: Why embedded specifically? Linux handles GPUs as coprocessors. My understanding is that by their nature, the internal state of most GPUs is not
|
By
David Horner
· #443
·
|
|
Vector TG meeting tomorrow
Is there already a doc/issue specific to this imprecise handling that we can reference before and during the meeting?
|
By
David Horner
· #442
·
|
|
Proposing more portable vector cod
Never say never. Appears to be the mantra for V extension. Yes, the intent is that the V specification mandates LMUL of 8, 4 and 1. Even for minimal systems of VLEN=128; not only for interoperability,
|
By
David Horner
· #440
·
|
|
Vector TG meeting minutes 2020/9/25
I am in favour of effectively weakening the scalar/vector vector/scalar load/load order requirement. However, this cannot be performed in isolation without regard to the rest of the RVWMO dependency r
|
By
David Horner
· #434
·
|
|
Mask Register Value Mapping
You got my thumbs up! Definitely "something similar" and better than my more cryptic proposal. Thank you Cohen for raising these concerns and Nick for moving this along so quickly.
|
By
David Horner
· #428
·
|
|
Mask Register Value Mapping
Or, my preference, a similar annotation that explicitly identifies it as a mask bit: vs2[i] + vs1[i] + v0[i].m Or similar.
|
By
David Horner
· #425
·
|
|
poll on vstart management issues #493, #510 and #532
Ahead of the vector meeting I would like to see if we can address or at least get direction on some of the issues flagged for pre-v1.0 resolution. There are 3 related flagged issues that all deal with vstart
|
By
David Horner
· #409
·
|
|
Vector Memory Ordering
These issues discuss the need for order and reasons to have variants: https://github.com/riscv/riscv-v-spec/issues/501 Unordered Indexed Load https://github.com/riscv/riscv-v-spec/issues/502 Unordered
|
By
David Horner
· #396
·
|
|
Vector Memory Ordering
Just a quick note here, if Ztso is active then "truly in order" has global ordering too. Much of the discussion was driven by consideration for the difficulty in runtime overlap detection and challeng
|
By
David Horner
· #395
·
|
|
Vector Memory Ordering
I tentatively agree *if* "truly in order" means "as if writes were the equivalent scalar writes of the register elements processed in order from 0 to vl". However, RVWMO should be assumed and no furt
|
By
David Horner
· #394
·
|
|
GNU toolchain with RVV intrinsic support
Thank you for the clarification. Excellent.
|
By
David Horner
· #378
·
|
|
GNU toolchain with RVV intrinsic support
Thank you very much for this advancement. I have two concerns; in the body is a response. Shouldn't this be vec_add32_rvv? It is not a generalized vector add. > vint32m2_t va, vb, vc; The vector po
|
By
David Horner
· #376
·
|
|
VFRECIP/VFRSQRT instructions
This is a policy decision ensuring reasonable limits on fragmentation in this space. To be clear, this was a tactical decision to move forward with the proposal as presented. The proposal has not been
|
By
David Horner
· #357
·
|