Slidedown overlapping of dest and source registers
The slideup instruction has this restriction:
The destination vector register group for vslideup cannot overlap the source vector register group or the mask register, otherwise an illegal instruction exception is raised.
The slidedown instruction has a different restriction:
The destination vector register group cannot overlap the mask register if LMUL>1, otherwise an illegal instruction exception is raised.
Allowing the source and destination registers to overlap assumes a particular style of implementation, which is inflexible. I think the slidedown instruction should have the same restriction as slideup: the source and destination registers must not overlap.
Thanks, Thang
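For readers following along, the two rules as quoted above can be written as a small check. This is only a sketch of the quoted draft text, not of any final specification: the names `group` and `slide_is_legal` are hypothetical, LMUL is assumed to be an integer, the mask is assumed to live in v0, and mask overlap is assumed to matter only when the instruction is masked.

```python
def group(base, lmul):
    """Register numbers in the group starting at `base` (integer LMUL assumed)."""
    return set(range(base, base + lmul))


def slide_is_legal(op, vd, vs2, lmul, masked):
    """Apply the two overlap rules exactly as quoted in the message above."""
    dest = group(vd, lmul)
    # Assumption: the mask register is v0 and only matters when masked.
    overlaps_mask = masked and 0 in dest
    if op == "vslideup":
        # Destination may overlap neither the source group nor the mask.
        return not (dest & group(vs2, lmul)) and not overlaps_mask
    if op == "vslidedown":
        # Only the mask overlap is restricted, and only when LMUL > 1.
        return not (lmul > 1 and overlaps_mask)
    raise ValueError(f"unknown op: {op}")
```

Under these assumptions, `slide_is_legal("vslideup", 4, 4, 1, False)` is rejected while the same operands with `"vslidedown"` are accepted, which is the asymmetry being questioned.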
Hi Andrew,
I do not understand your statement. Why is it important? Why the difference from slideup?
Slideup cannot allow the destination to overlap the source because the destination write would clobber the source register before the source operand is read.
The slidedown instruction should be the same, because my implementation would write to the source register before the source operand is read. Allowing the source and destination registers to overlap assumes a certain implementation of slidedown, which is not good for other people.
Thanks, Thang
From: Andrew Waterman [mailto:andrew@...]
Sent: Tuesday, January 28, 2020 11:23 AM
To: Thang Tran <thang@...>
Cc: Krste Asanovic <krste@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] Slidedown overlapping of dest and source registers
It's important that the slidedown instruction can overwrite its source operand. Debuggers will use this feature to populate a vector register in-place without clobbering other architectural state.
Hi Thang,
I think Andrew is suggesting that the vslideup restriction is there to allow some flexibility in implementations. However, one of vslideup/vslidedown needs to allow the same source/dest register (group), because the debugger is going to use this feature to inject new data without clobbering other vector registers.
I believe most implementations iterating over a vector will increment the element index. This allows vslidedown to safely clobber earlier elements: higher index values are being read out while lower index values are being written, so the lower index values will already have been read and those elements are in transit in the pipeline. If your vector implementation decrements the element index, then you couldn't allow src/dst overlap with vslidedown, but you could allow it with vslideup. Hence, there is an implicit assumption here about implementations (i.e., counting up is preferred, or else you have to buffer the whole vector register group).
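The hazard described in this paragraph can be illustrated with a small model. This is a minimal sketch, not a description of any real pipeline: the register is a Python list updated one element at a time with no buffering, `vl` is assumed equal to VLMAX (so vslidedown reads zero past the end), and the function names are made up for the illustration.

```python
def slidedown_ascending(v, offset):
    """In-place vslidedown (vd == vs2), visiting elements from index 0 upward."""
    vl = len(v)
    for i in range(vl):
        # Reads index i + offset, which is at or above i and so not yet overwritten.
        v[i] = v[i + offset] if i + offset < vl else 0
    return v


def slideup_ascending(v, offset):
    """In-place vslideup in the same ascending order.

    Elements below `offset` are left untouched.
    """
    for i in range(offset, len(v)):
        # Reads index i - offset, which may already have been overwritten.
        v[i] = v[i - offset]
    return v


data = [10, 11, 12, 13, 14, 15]
print(slidedown_ascending(data.copy(), 2))  # [12, 13, 14, 15, 0, 0]
print(slideup_ascending(data.copy(), 2))    # [10, 11, 10, 11, 10, 11]
```

The slidedown result is the expected one. The slideup result should have been `[10, 11, 10, 11, 12, 13]`; elements 4 and 5 picked up values that the same instruction had already written.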
I'm not sure how the debugger would be using this feature, but if I had to guess, I think the debugger would actually be using vslide1down (not vslidedown) to inject data into a vector. So perhaps the overlapping src/dst requirement should only apply to vslide1down? As an alternative, there are also various vmv instructions the debugger could use, which move one element at a time and do allow overlapping src/dst. I don't think debugger performance is crucial.
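To make the guessed debugger use concrete, here is a sketch of filling a register in place with repeated vslide1down. It assumes an unmasked instruction with `vl` equal to the register length and at least one element; `vslide1down_in_place` and `inject` are illustrative names, not part of any debugger interface.

```python
def vslide1down_in_place(v, scalar):
    """Model vslide1down with vd == vs2.

    Every element moves down one index and `scalar` lands in the last element.
    """
    for i in range(len(v) - 1):
        # Ascending order, so v[i + 1] has not been overwritten yet.
        v[i] = v[i + 1]
    v[-1] = scalar
    return v


def inject(v, new_values):
    """Replace the whole register, one element per instruction, using no scratch register."""
    assert len(new_values) == len(v)
    for x in new_values:
        vslide1down_in_place(v, x)
    return v


print(inject([1, 2, 3, 4], [7, 8, 9, 10]))  # [7, 8, 9, 10]
```

After `vl` steps the first injected value has travelled down to index 0, so the register holds the new values in order and no other vector register was touched.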
Guy
I have no issue with dest/source registers overlapping for slide1down and slide1up. As you suggested, they can be used for debugging.
Thanks, Thang
> Thanks Guy for the explanation, but my implementation increments the element index for slideup and decrements the element index for slidedown (which is a symmetrical implementation and the simplest from my point of view).
I'm curious why you chose to be symmetrical (no need), and why you decided on incrementing for slideup and decrementing for slidedown (I would do the opposite).
Incrementing for vslidedown and decrementing for vslideup eliminates the race condition in both directions and allows overlapping src/dst for both.
However, by supporting both incrementing and decrementing, you are adding extra hardware that isn't strictly necessary.
Guy
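The claim that picking the iteration direction per instruction removes the race both ways can be checked against a buffered reference. This is a rough sketch under the same simplifications as before (unmasked, `vl` equal to VLMAX, one element written at a time); `slide_reference` and `slide_in_place` are hypothetical names.

```python
def slide_reference(v, offset, op):
    """Buffered result: read a full copy of the source, then write the destination."""
    src, vl = list(v), len(v)
    if op == "down":
        return [src[i + offset] if i + offset < vl else 0 for i in range(vl)]
    # Slideup leaves elements below `offset` unchanged.
    return [src[i] if i < offset else src[i - offset] for i in range(vl)]


def slide_in_place(v, offset, op):
    """Unbuffered slide with vd == vs2: ascending for down, descending for up."""
    vl = len(v)
    order = range(vl) if op == "down" else range(vl - 1, offset - 1, -1)
    for i in order:
        if op == "down":
            v[i] = v[i + offset] if i + offset < vl else 0
        else:
            v[i] = v[i - offset]
    return v


data = [10, 11, 12, 13, 14, 15]
for op in ("down", "up"):
    for offset in range(len(data) + 1):
        assert slide_in_place(data.copy(), offset, op) == slide_reference(data, offset, op)
```

In this model every offset matches the buffered result for both instructions, at the cost of needing an index counter that can run in either direction.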