|
Re: vector strided stores when rs1=x0
These are all supported with ordered scatters/gathers to/from a single address.
We wanted to remove ordering requirements from all other vector load/store types.
Krste
|
By
Krste Asanovic
·
#507
·
|
|
Re: vector strided stores when rs1=x0
I think this is a bad idea for both loads and stores. If the intent is a single load or single store, then there should be another way to do it.
Using vector loads/stores with stride=0 is one way to
|
By
Guy Lemieux
·
#506
·
|
|
vector strided stores when rs1=x0
Also on github as issue #595
In our earlier TG discussion at the 9/18 meeting, we were in favor of
allowing vector strided load instructions with rs1=x0 to perform fewer
memory accesses than the number
|
By
Krste Asanovic
·
#505
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
The key insight from the 21264 design is that the valid bits are cleared periodically. I appreciate that you were willing to meet me more than halfway, but I actually think your original idea would
|
By
andrew@...
·
#504
·
|
|
reminder, Vector task group meeting Friday
We'll meet per the calendar entry.
Agenda is to go over any remaining unsettled open issues,
Krste
|
By
Krste Asanovic
·
#503
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
On 11/5/20 10:51 PM, Bill Huffman wrote:
I see it's called a "stWait table" in the hardware reference manual.
I see how that works for waiting loads. I'm guessing that you're thinking of a PC based
|
By
Bill Huffman
·
#502
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
On 11/5/20 8:33 PM, Andrew Waterman wrote:
Do you know of a reference for how the store-wait predictor works? I can't find any reference to it in the 21264 hardware reference manual, though I found a
|
By
Bill Huffman
·
#501
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
I was thinking of something along the lines of the Alpha 21264's store-wait predictor: every N cycles (with N probably somewhere in the range of [2^10, 2^14]), clear the valid bits.
Gotcha. And while
|
By
andrew@...
·
#500
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
On 11/5/20 5:31 PM, Bill Huffman wrote:
The possibility of a second store (for which there's easy encoding space) would allow the bit to be set or cleared depending on which store instruction was
|
By
Bill Huffman
·
#499
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
On 11/5/20 4:36 PM, Andrew Waterman wrote:
How are you thinking that bit gets set/cleared? The same store instruction is used whether or not the compiler will be able to put in a hint.
I'm hoping
|
By
Bill Huffman
·
#498
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
If your per-vector-register predictor works well to begin with, I would think you could extend it with a valid bit that indicates whether to use the prediction or the encoded hint, and it would
|
By
andrew@...
·
#497
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
On 11/5/20 3:35 PM, Andrew Waterman wrote:
My idea of a predictor is one predicted size per vector register (or maybe two or three sizes operating as a stack). Let's say for simplicity, we have both
|
By
Bill Huffman
·
#496
·
|
|
Re: Vector Byte Arrangement in Wide Implementations
I'm probably being obtuse, because you've surely already thought this through: if you can build an EEW predictor, why can't you build an ignore-the-encoded-EEW predictor? If you have a PC-indexed and
|
By
andrew@...
·
#495
·
|
|
Vector Byte Arrangement in Wide Implementations
I've been thinking through the cases where a wide implementation that wants "slices" could have to introduce a hiccup to rearrange bytes because of an EEW change (since SLEN is gone). The ones I know
|
By
Bill Huffman
·
#494
·
|
|
Re: [RISC-V] [tech] [RISC-V] [tech-*] STRATEGIC FEATURE COEXISTANCE was:([tech-fast-int] usefulness of PUSHINT/POPINT from [tech-code-size])
Thanks Tim, I think that sums it up nicely.
I just wanted to put a pointer out to the original post that I made on isa-dev regarding opcode sharing /
|
By
Guy Lemieux
·
#493
·
|
|
Re: Sparse Matrix-Vector Multiply (again) and Bit-Vector Compression
Hi all,
If I use EDIV to compute SpMV y = A * x as size r * c blocks, I might have to load size r of y and size c of x, these are shorter than VL = r * c, is there an efficient way to do this by
|
By
lidawei14@...
·
#492
·
|
|
Re: Sparse Matrix-Vector Multiply (again) and Bit-Vector Compression
The email list has the archived discussion, which should include discussion of EDIV.
The main reason not to include in v1.0 is that it has many details to work through, and resolving these would delay
|
By
Krste Asanovic
·
#491
·
|
|
Re: Sparse Matrix-Vector Multiply (again) and Bit-Vector Compression
Hi all,
Thank you Nick for the reply.
I saw that EDIV will not be included in v1.0. Are there any issues still to be resolved? Can I have a look at the discussion page on EDIV?
Thanks a lot,
Dawei
|
By
lidawei14@...
·
#490
·
|
|
Re: [RISC-V] [tech-*] STRATEGIC FEATURE COEXISTANCE was:([tech-fast-int] usefulness of PUSHINT/POPINT from [tech-code-size])
On 2020-10-26 12:48 a.m., Allen Baum wrote:
That is one approach. It is a consideration that has recently been mentioned wrt misa.
I remember Luke Kenneth Casson Leighton
|
By
David Horner
·
#489
·
|
|
Re: [RISC-V] [tech-*] STRATEGIC FEATURE COEXISTANCE was:([tech-fast-int] usefulness of PUSHINT/POPINT from [tech-code-size])
My take: This is analogous to ASCII (7-bit) and EBCDIC (8-bit) both competing in the 8-bit byte-addressable character space.
Initial solutions were fragmentation, then code pages
|
By
David Horner
·
#488
·
|