|
Re: Vector Memory Ordering
From most points of view it's a fine method (delay exception reporting until all accesses from logically previous addresses are complete and override). But it seems to require either that stores are
|
By
Bill Huffman
·
#401
·
|
|
Re: Vector Memory Ordering
If the memory ordering (in this case specifically memory access ordering, I think - correct me if I'm wrong) doesn't affect the final processor state (which would not be the case for a vector reduce
|
By
Allen Baum
·
#400
·
|
|
Re: Vector Memory Ordering
Yes, that's what I intended.
Bill
|
By
Bill Huffman
·
#399
·
|
|
ordered vs unordered and overlaps use cases
what are the use cases? do we have examples in mind when they would/could be used?
are there examples of what developers would want from previous efforts on vector machines?
can we write them
|
By
mark
·
#398
·
|
|
Re: Vector Memory Ordering
Bill, great lists.
can we start building a testcase list for these situations and others? maybe a github doc? i am sure these will be documented but I don't want to lose bullet lists like this and
|
By
mark
·
#397
·
|
|
Re: Vector Memory Ordering
These issues discuss the need for order and reasons to have variants:
https://github.com/riscv/riscv-v-spec/issues/501 Unordered Indexed Load
|
By
David Horner
·
#396
·
|
|
Re: Vector Memory Ordering
Just a quick note here, if Ztso is active then "truly in order" has global ordering too.
Much of the discussion was driven by consideration for the difficulty in runtime overlap
|
By
David Horner
·
#395
·
|
|
Re: Vector Memory Ordering
I tentatively agree *if* "truly in order" means "as if writes were the equivalent scalar writes of the register elements processed in order from 0 to vl".
However, RVWMO
|
By
David Horner
·
#394
·
|
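The "equivalent scalar writes" reading of an ordered scatter in the message above can be sketched as a small reference model. This is only an illustration under stated assumptions: the function name `ordered_scatter`, the one-slot-per-element memory list, and the choice to iterate over elements 0 through vl-1 are hypothetical and are not taken from the thread or the spec.

```python
def ordered_scatter(memory, base, indices, values, vl, mask=None):
    """Model an ordered indexed store as scalar writes in element order.

    Assumption: memory is a list with one slot per element, and indices
    are element offsets rather than byte offsets.
    """
    for i in range(vl):                      # elements 0 .. vl-1, in order
        if mask is not None and not mask[i]:
            continue                         # inactive elements write nothing
        memory[base + indices[i]] = values[i]
    return memory


mem = [0] * 8
# Elements 0 and 2 target the same slot; the later element must win.
ordered_scatter(mem, 0, [2, 5, 2, 7], [10, 20, 30, 40], vl=4)
```

Under this model, when two elements target the same location the higher-numbered element's value is the one left in memory, which is the property an unordered scatter would not guarantee.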
|
Re: Vector Memory Ordering
This is not terribly straightforward.
I'll assume that the trap would only be a function of the stride and element/segment width, rather than checking that two active elements actually overlap at
|
By
Andrew Waterman
·
#393
·
|
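One way to read the assumption in the message above, that the trap depends only on the stride and the element/segment width, is the check sketched below. The function name and parameters (`stride_bytes`, `eew_bytes`, `nf`) are hypothetical, and the rule itself is an illustration rather than anything the thread settled on.

```python
def strided_store_may_overlap(stride_bytes, eew_bytes, nf=1):
    """Return True if consecutive elements (or segments) could overlap.

    Assumption: each element writes eew_bytes * nf contiguous bytes, and
    the decision ignores vl, the mask, and the actual addresses.
    """
    footprint = eew_bytes * nf
    # With a constant stride, any two elements are at least |stride| apart,
    # so |stride| >= footprint rules out overlap for every pair.
    return abs(stride_bytes) < footprint


checks = [
    strided_store_may_overlap(4, 4),         # unit-stride words
    strided_store_may_overlap(2, 4),         # stride smaller than element
    strided_store_may_overlap(0, 1),         # zero stride
    strided_store_may_overlap(-8, 4, nf=2),  # negative stride, 2-field segment
]
```

Because the check never looks at which elements are active, it would flag a store with a single active element even though nothing actually overlaps, which is part of why the question is not straightforward.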
|
Re: Vector Memory Ordering
I have not been following this thread in much detail.
Could someone please explain why we need to differentiate between ordered and unordered loads/stores?
in the 6 or vector systems i have been
|
By
swallach
·
#392
·
|
|
Re: Vector Memory Ordering
Guy Lemieux commented:
Maybe what's below could be improved by saying that if the base address (in src1) was non-idempotent or an "ordered channel," the entire instruction would run in order. If
|
By
Bill Huffman
·
#391
·
|
|
Vector Memory Ordering
I think from this morning, we are considering:
Ordered scatters are done truly in order
Strided stores that overlap (including segmented ones) will trap as illegal
All other vector loads and stores do
|
By
Bill Huffman
·
#390
·
|
|
Usual vector TG meeting today
Though I don’t know if we’re affected by calendar changes,
Krste
|
By
Krste Asanovic
·
#389
·
|
|
Re: Signed v Unsigned Immediate: vsaddu.vi
Andrew, Nick,
Thank you for the quick responses. Nick, the text updates look like they directly reflect the intent.
-Cohen
|
By
CDS <cohen.steed@...>
·
#388
·
|
|
Re: Decompress Instruction
Thanks Krste, that makes sense, but the logic is not that straightforward: people usually need "decompress" when they are using "compress". Maybe we can add some comment on this at the "vcompress"
|
By
lidawei14@...
·
#387
·
|
|
Decompress Instruction
If the decompress is the inverse of compress, then there will be a
packed vector holding the non-zero elements and a bit mask indicating
which elements should receive the elements after unpacking
|
By
Krste Asanovic
·
#386
·
|
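The compress/decompress relationship described in the message above (a packed vector of non-zero elements plus a bit mask saying where they go) can be sketched as a scalar reference model. The function names and the `fill` parameter are hypothetical; the running counter plays the role that a prefix count of set mask bits would play in a vector implementation.

```python
def compress(values, mask):
    """Pack the elements whose mask bit is set, preserving their order."""
    return [v for v, m in zip(values, mask) if m]


def decompress(packed, mask, fill=0):
    """Inverse of compress: place packed elements where the mask is set.

    Assumption: positions with a clear mask bit receive `fill`.
    """
    out = []
    next_packed = 0                  # count of set mask bits seen so far
    for m in mask:
        if m:
            out.append(packed[next_packed])
            next_packed += 1
        else:
            out.append(fill)
    return out


dense = [0, 7, 0, 0, 3, 9, 0, 1]
mask = [v != 0 for v in dense]
packed = compress(dense, mask)
restored = decompress(packed, mask)
```

In this sketch `restored` equals `dense` whenever the masked-off positions originally held the fill value, which is the sparse-data case the decompress request is about.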
|
Decompress Instruction
Hi all,
For common AI workloads such as DNNs, data communications between network layers introduce huge pressure on capacity and bandwidth of the memory hierarchy.
For instance, dynamic large
|
By
lidawei14@...
·
#385
·
|
|
Re: EEW and non-indexed loads/stores
Correct,
Krste
By
Krste Asanovic
·
#384
·
|
|
EEW and non-indexed loads/stores
Hi all,
I understand the EEW, as explicitly encoded in the load/store instructions applies to the vector of indices for the indexed loads and stores. For instance we can load a vector "SEW=8,LMUL=1"
|
By
Roger Ferrer Ibanez
·
#383
·
|
|
Re: Signed v Unsigned Immediate: vsaddu.vi
Hi Cohen,
Thanks for your careful reading.
Hopefully this edit clarifies some of the ambiguity: https://github.com/riscv/riscv-v-spec/pull/565
Best,
Nick Knight
|
By
Nick Knight
·
#382
·
|