Slidedown overlapping of dest and source registers


Thang Tran
 

The slideup instruction has this restriction:

The destination vector register group for vslideup cannot overlap the source vector register group or the mask register, otherwise an illegal instruction exception is raised.

The slidedown instruction has a different restriction:

The destination vector register group cannot overlap the mask register if LMUL>1, otherwise an illegal instruction exception is raised.

Allowing the source and destination registers to overlap assumes a particular style of implementation, which is inflexible. I think the slidedown instruction should have the same restriction as slideup: the source and destination registers must not overlap.

Thanks, Thang


Andrew Waterman
 

It's important that the slidedown instruction can overwrite its source operand.  Debuggers will use this feature to populate a vector register in-place without clobbering other architectural state.


Thang Tran
 

Hi Andrew,

I do not understand your statement. Why is it important? Why is it different from slideup?

Slideup cannot allow the destination to overlap the source operand, because the destination would be written to the source register before the source operand is read.

The slidedown instruction should be the same, because my implementation would write to the source register before the source operand is read. Allowing the source and destination registers to overlap assumes a particular implementation of slidedown, which is not good for other implementers.

Thanks, Thang


Guy Lemieux
 

Hi Thang,

I think Andrew is suggesting that the vslideup restriction is there to
allow some flexibility with implementations. However, one of
(vslideup/vslidedown) needs to allow the same source/dest register
(group) because the debugger is going to use this feature to inject
new data without clobbering other vector registers.

I believe most implementations iterating over a vector will be
incrementing the element index -- this allows vslidedown to safely
clobber earlier elements (higher index values are being read out while
lower index values are being written, so the lower index values will
have been previously read and the elements are in-transit in the
pipeline). If your vector implementation is decrementing the element
index, then you couldn't allow src/dst overlap with vslidedown, but
you could allow it with vslideup. Hence, there is an implicit
assumption here about implementations (i.e., counting up is preferred,
or else you have to buffer the whole vector register group).
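
The ordering argument above can be illustrated with a minimal Python sketch. It is only a model, not taken from the spec or from any real implementation: a vector register is a plain list, the source and destination are the same list, and each element is read and then written one at a time with no buffering. The function name `slidedown_in_place`, the `offset` parameter, and the zero fill past the end of the register are assumptions made for illustration.

```python
def slidedown_in_place(reg, offset, ascending):
    """Slide `reg` down by `offset` with vd == vs2 (same list is read and written).

    `ascending` selects whether the element index counts up or down.
    Elements whose source index falls past the end are filled with 0
    (an assumption of this sketch).
    """
    n = len(reg)
    order = range(n) if ascending else range(n - 1, -1, -1)
    for i in order:
        src = i + offset
        reg[i] = reg[src] if src < n else 0
    return reg


expected = [2, 3, 4, 5, 6, 7, 0, 0]

# Counting up: every element is read before it is overwritten.
counting_up = slidedown_in_place(list(range(8)), 2, ascending=True)

# Counting down: high elements are overwritten before lower ones read them.
counting_down = slidedown_in_place(list(range(8)), 2, ascending=False)

print(counting_up == expected)    # True
print(counting_down == expected)  # False
```

In this model the decrementing order destroys the data unless the whole register group is buffered first, which is the implicit assumption described above.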

I'm not sure how the debugger would be using this feature, but if I
had to guess, I think the debugger would actually be using vslide1down
(not vslidedown) to inject data into a vector. So, perhaps the
overlapping src/dst requirement should only be for vslide1down? Also,
as an alternative, there are also various vmv instructions that could
be used by the debugger which move one element at a time and do allow
overlapping src/dst. I don't think debugger performance is crucial.

Guy


Thang Tran
 

Thanks, Guy, for the explanation, but my implementation increments the element index for slideup and decrements it for slidedown (a symmetrical implementation, and the simplest from my point of view).

I have no issue with dest/source registers overlapping for slide1down and slide1up. As you suggested, they can be used for debugging.

Thanks, Thang


Guy Lemieux
 

> Thanks Guy for the explanation, but my implementation is both incrementing element index for slideup and decrementing element index for slidedown (which is symmetrical implementation and simplest from my point of view).

I'm curious why you chose to be symmetrical (there is no need), and why you
decided on incrementing for slideup and decrementing for slidedown (I would do
the opposite).

Incrementing for vslidedown and decrementing for vslideup
eliminates the race condition in both directions and allows
overlapping src/dst for both.
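
As a rough check of this pairing, the Python sketch below tries all four combinations of slide direction and index order on a register that is both source and destination, and compares each result with a buffered reference. It is a simplified model under stated assumptions: one element is read and written at a time, slidedown fills with 0 past the end, and slideup leaves the elements below the offset unchanged. All function names here are hypothetical.

```python
def slide_in_place(reg, offset, direction, ascending):
    """Slide `reg` up or down by `offset`, reading and writing the same list."""
    n = len(reg)
    order = range(n) if ascending else range(n - 1, -1, -1)
    for i in order:
        if direction == "down":
            src = i + offset
            reg[i] = reg[src] if src < n else 0
        elif i >= offset:
            # "up": elements below `offset` are left unchanged.
            reg[i] = reg[i - offset]
    return reg


def reference(src, offset, direction):
    """Buffered result: every source element is read before any write."""
    n = len(src)
    if direction == "down":
        return [src[i + offset] if i + offset < n else 0 for i in range(n)]
    return [src[i] if i < offset else src[i - offset] for i in range(n)]


def safe_pairings(n=8, offset=3):
    """Map (direction, ascending) to whether in-place overlap gives the right result."""
    src = list(range(10, 10 + n))
    result = {}
    for direction in ("down", "up"):
        for ascending in (True, False):
            got = slide_in_place(list(src), offset, direction, ascending)
            result[(direction, ascending)] = got == reference(src, offset, direction)
    return result


for key, ok in safe_pairings().items():
    print(key, "overlap safe" if ok else "overlap clobbers source")
```

In this model only the two pairings named above (slidedown with an incrementing index, slideup with a decrementing index) survive overlap without buffering.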

However, by supporting both incrementing and decrementing, you are
adding extra hardware that isn't strictly necessary.

Guy


Andrew Waterman
 



On Tue, Jan 28, 2020 at 1:40 PM Guy Lemieux <glemieux@...> wrote:
> I'm not sure how the debugger would be using this feature, but if I
> had to guess, I think the debugger would actually be using vslide1down
> (not vslidedown) to inject data into a vector. So, perhaps the
> overlapping src/dst requirement should only be for vslide1down? Also,
> as an alternative, there are also various vmv instructions that could
> be used by the debugger which move one element at a time and do allow
> overlapping src/dst. I don't think debugger performance is crucial.

Oops, yes, I meant vslide1down.

Using vslide1down isn't about performance; it's the only way I know of for the debugger to construct a vector without additional storage.  The alternative would have been to add an instruction to insert an element into an arbitrary element position, which for various reasons was deemed a less-preferable alternative.
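
A minimal Python sketch of how repeated vslide1down with the same source and destination can build a vector in place. This is only a model of the idea, not debugger code: the register is a list, `vslide1down` here is a hypothetical helper that shifts every element down by one and writes a scalar into the last element, and the register length of 4 is an arbitrary assumption.

```python
def vslide1down(reg, scalar):
    """Model of vslide1down with vd == vs2.

    Each element takes the value of its upper neighbour and the scalar
    goes into the last element. The index counts up, so each element is
    read before it is overwritten.
    """
    n = len(reg)
    for i in range(n - 1):
        reg[i] = reg[i + 1]
    reg[n - 1] = scalar
    return reg


def populate(reg, values):
    """Fill `reg` in place by sliding in one value per step, using no other register."""
    for x in values:
        vslide1down(reg, x)
    return reg


reg = ["?"] * 4                       # unknown prior contents
populate(reg, [11, 22, 33, 44])
print(reg)                            # [11, 22, 33, 44]
```

After as many steps as there are elements, the register holds the supplied values in order, and no second vector register was needed as scratch storage.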

