|
RISC-V Vector Task Group: fractional LMUL
Special lmul code in vsetvl{i} to derive LMUL from existing SEW/LMUL and provided sew code. As mentioned in the TG we suggested widening LMUL to 3 bits with 7 (explicit) states. I suggest we name them l1,
|
By
David Horner
· #22
·
|
|
RISC-V Vector Task Group: fractional LMUL
I left the encoding unspecified in the proposal. That was intentional as I saw various tradeoffs. However, I now recommend the codes be in order of increasing VLMAX value as so: vlmul mnemonic LMUL VL
|
By
David Horner
· #23
·
|
|
issue #393 - Towards a simple fractional LMUL design.
I'm sending out to the correct mailing list a copy of the revised issue #393. (link: https://github.com/riscv/riscv-v-spec/issues/393 ) This was requested at the last TG meeting. I believe it is consi
|
By
David Horner
· #33
·
|
|
issue #393 - Towards a simple fractional LMUL design.
My apologies, especially to those who have sent some feedback. I had thought I had already sent this second iteration (It has been on the GitHub issue since Monday. A slightly less simple design to parti
|
By
David Horner
· #52
·
|
|
issue #393.3 - Towards a simple fractional LMUL design - third iteration.
I am sending out the partial description of the next iteration for the Simple Fractional LMUL design. It is incomplete because I only recently clarified in my own mind a means to represent the concep
|
By
David Horner
· #53
·
|
|
issue #393.4 - Towards a simple fractional LMUL design - third iteration.
This is the fourth installment of the Simple Fractional LMUL design. It does not yet address SLEN and a suggestion for a fractional SLEN (fracSLEN). It should however clarify the ramifications of the
|
By
David Horner
· #54
·
|
|
Fractional LMUL Constraint
Intriguing. I especially appreciate the effort to identify fundamental aspects of fractional LMUL. As you mention below this constraint is problematic when VLEN=ELEN, or a low multiple of VLEN. I appr
|
By
David Horner
· #57
·
|
|
A simple fractional LMUL proposal
I attempt to summarize the needs here: 1) to reduce the register pressure that successive levels of LMUL invoke (halving register addresses each level) this arising from the need/desire 2) To replace
|
By
David Horner
· #69
·
|
|
Vector TG meeting minutes 2020/4/03
I agree Nick. So here is a suggestion, not completely facetiously: For load byte/half/word example when SEW = 64 An implementation can optimize the sequence strided load by 1/2/4 shift left 56/48/32 a
|
By
David Horner
· #80
·
|
|
Vector TG meeting minutes 2020/4/03
I agree. It's more important to gather and evaluate actual application kernels. Is there such an effort ongoing? I further agree to the implicit idea that much, even most, of the processing in any gi
|
By
David Horner
· #88
·
|
|
intro to #421 Fractional vtype field vfill and #418 vlmt...
Previous issues I opened on fractional LMUL were exploratory, suggesting various ways to encode and enable the feature. The latest 4 issues opened on GitHub are specific proposals based on the strawma
|
By
David Horner
· #89
·
|
|
Effective element width encoding in vector load/stores
What of a SEW scaling factor instead? 1/4, 1/2, 1 and 2. This allows a much expanded dynamic range and addresses most scaling concerns. It allows 2 * SEW for vwop.wv source load, and store of all widen
|
By
David Horner
· #92
·
|
|
Effective element width encoding in vector load/stores
I believe these widths are the appropriate ones. See the explanation above. The rationale being widening (and narrowing) instructions are already SEW based, so are single SEW instructions an
|
By
David Horner
· #96
·
|
|
RISC-V Vector TG meeting minutes, April 17, 2020
As I had asked that pagan question of opcode space I thought I should try to address the problem: I opened issue https://github.com/riscv/riscv-v-spec/issues/423 Which I paste here for your convenienc
|
By
David Horner
· #98
·
|
|
make SEW be the largest element width
First thoughts below - as a response to #424 and - we need only consider SEW = LEW (new) and 1/2LEW (POR) When completed is it LEWd? With the corresponding lmul = lmul/2 + lmul/2 Which biases towards
|
By
David Horner
· #100
·
|
|
make SEW be the largest element width
For those not following on github:
|
By
David Horner
· #102
·
|
|
[riscv/riscv-v-spec] the differing nature of LMUL > 1 and fractional LMUL (#382)
Thank you very much for this. I started a pull request, but was including as an extension and still debating the best way to incorporate. Would it be possible to generate a pdf for what is now a subst
|
By
David Horner
· #103
·
|
|
Thoughts on Git update (8a9fbce) Added fractional LMUL, including modifying vector data register and vector mask register layouts for SLEN<VLEN implementations.
First some observations from the revised LMUL. *1 The format for a given SLEN and SEW is the same for all LMUL>=1 *2 LMUL=n is equivalent to LMUL=2 * n with vl < 1/2 vlmax at that level, for n=1,2,4.
|
By
David Horner
· #104
·
|
|
[riscv/riscv-v-spec] the differing nature of LMUL > 1 and fractional LMUL (#382)
OK. Again thanks. I will generate a few versions and compare incrementally. Will report the anomalies if they show up in your weekend efforts.
|
By
David Horner
· #106
·
|
|
Thoughts on Git update (8a9fbce) Added fractional LMUL, including modifying vector data register and vector mask register layouts for SLEN<VLEN implementations.
Yes it is. However, it needn't be, and allowing it to not be gives greater flexibility at minimal cost. This was a suggestion to implement #418. [Introduce vlmt (vl multiplicative threshold) / VLMT Ve
|
By
David Horner
· #108
·
|