Proposing more portable vector code


Joseph Rahmeh <Joseph.Rahmeh@...>
 

 

In the latest vector proposal (draft of version 1.0), there is the following restriction on widening instructions (section 11.2):

 

For all widening instructions, the destination EEW and EMUL values must be a supported configuration, otherwise an illegal instruction exception is raised.

 

This seems unduly restrictive and will limit software portability. If the destination EEW is supported but EMUL is not, code would be more portable if strip-mining reduced VL accordingly instead of raising an exception.

 

Similarly, code would be more portable if any requested SEW/LMUL combination were replaced by SEW/LMUL2 when SEW is supported and LMUL is not, where LMUL2 is the highest supported group multiplier for the given SEW.
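A minimal sketch of the SEW/LMUL2 fallback suggested above, for illustration only. Nothing here comes from the specification: the function name `resolve_lmul`, the `supported` table, and the example implementation (every LMUL at SEW=64 but only LMUL=1 at SEW=128) are hypothetical.

```python
def resolve_lmul(sew, lmul, supported):
    """Return the LMUL that the proposed fallback rule would select.

    `supported` maps each supported SEW to the set of LMUL values that a
    (hypothetical) implementation accepts for that SEW.
    """
    if sew not in supported:
        # The proposal only covers the case where SEW itself is supported.
        raise ValueError(f"SEW={sew} is not supported")
    if lmul in supported[sew]:
        return lmul
    # Proposed rule: substitute the highest supported LMUL for this SEW.
    return max(supported[sew])


# Hypothetical implementation used only for this example.
supported = {
    64: {1, 2, 4, 8},
    128: {1},
}

print(resolve_lmul(64, 4, supported))   # supported as requested
print(resolve_lmul(128, 2, supported))  # falls back to LMUL2
```

Under this rule a request for SEW=128, LMUL=2 would silently become SEW=128, LMUL=1, so software would have to read back the resulting vl rather than assume the requested group size.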

 

 


Nick Knight
 

Hi Joseph,

Thanks for your comments. I apologize, but I don't fully understand your proposal, or the problem it solves. To help explain my confusion, here are two thoughts.

The supported LMUL (and EMUL) values are 2^k (k = -3:3) on all implementations, so software requesting EMUL > 8 is illegal everywhere.

I agree that we could make widening instructions more flexible by having them decrease VL (and LMUL) so that EMUL becomes valid. The fault-first loads adjust VL automatically, so this is not without some precedent. However, in my opinion, it's too much of a burden to do this manually (using vsetvli), and I don't see any portability issues with that.

Best,
Nick Knight
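The manual approach mentioned above, where software picks a source LMUL with vsetvli so that the widened EMUL stays legal, can be sketched as a small model. This is an illustration under stated assumptions, not spec text: `vsetvli` here is a simplified stand-in that always returns min(AVL, VLMAX), and the helper names and the VLEN=128 example are hypothetical.

```python
from fractions import Fraction


def vsetvli(avl, sew, lmul, vlen):
    """Simplified model of vsetvli: vl = min(AVL, VLMAX)."""
    vlmax = int(vlen * Fraction(lmul) / sew)
    if vlmax == 0:
        raise ValueError("SEW/LMUL combination holds no elements")
    return min(avl, vlmax)


def source_lmul_for_widening(lmul, max_emul=8):
    """Halve the source LMUL until the widened EMUL = 2*LMUL is legal."""
    lmul = Fraction(lmul)
    while 2 * lmul > max_emul:
        lmul /= 2
    return lmul


def strip_mine(n, sew, lmul, vlen):
    """Return the vl used on each pass of a strip-mined widening loop."""
    lmul = source_lmul_for_widening(lmul)
    chunks = []
    while n > 0:
        vl = vsetvli(n, sew, lmul, vlen)
        chunks.append(vl)
        n -= vl
    return chunks


# 40 elements, SEW=32, requested LMUL=8, assumed VLEN=128.
# LMUL is lowered to 4 so that the destination EMUL is 8.
print(strip_mine(40, 32, 8, 128))
```

The loop body itself does not change when LMUL is lowered; only the number of passes does, which is why the adjustment can be made in software.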



Nick Knight
 

Sorry, in case it wasn't clear: typo

On Tue, Sep 29, 2020 at 11:30 AM Nick Knight <nick.knight@...> wrote:
However, in my opinion, it's too much of a burden to do this manually (using vsetvli),

it's not too much of a burden.

Best,
Nick Knight


joseph.rahmeh@...
 

Hi Nick,

 

Thanks for the reply. I was not asking for non-power-of-2 LMUL values. I was asking about LMUL values that some implementations do not support.

 

Let’s say that for SEW=128, an implementation supports LMUL=1 but no other value of LMUL. If software tries to use a widening operation with SEW=64 and LMUL=1, then the EEW for the wide operand will be 128 and the EMUL will be 2. The EEW/EMUL combination (128/2) is not supported on this implementation. However, 128/1 is. If we reduce VL instead of taking an illegal instruction exception, the code will work.

 

Joe
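The example above can be worked through in a short model. This is a sketch under assumptions, not the specified behaviour: the `supported` set of (SEW, LMUL) pairs, the VLEN of 256 bits, the `reduce_vl` switch, and all names are hypothetical. Only the widening relations EEW = 2*SEW and EMUL = 2*LMUL come from the discussion.

```python
class IllegalInstruction(Exception):
    """Stands in for the illegal instruction exception."""


def widening_vl(vl, sew, lmul, supported, vlen, reduce_vl=False):
    """Return the vl a widening op would execute with.

    With reduce_vl=False this models the draft rule (trap when the
    destination EEW/EMUL is unsupported). With reduce_vl=True it models
    the proposal: clamp vl to what fits in the largest supported EMUL.
    """
    eew, emul = 2 * sew, 2 * lmul
    if (eew, emul) in supported:
        return vl
    if not reduce_vl:
        raise IllegalInstruction(f"EEW={eew}, EMUL={emul} not supported")
    # Supported group multipliers for the destination EEW below EMUL.
    fallback = [m for (w, m) in supported if w == eew and m < emul]
    if not fallback:
        raise IllegalInstruction(f"EEW={eew} not supported")
    return min(vl, vlen * max(fallback) // eew)


# Hypothetical implementation: SEW=128 only with LMUL=1, VLEN=256.
supported = {(64, 1), (128, 1)}
VLEN = 256

# SEW=64, LMUL=1 gives VLMAX = 4 source elements; the 128-bit result
# needs EMUL=2, which this implementation lacks.
print(widening_vl(4, 64, 1, supported, VLEN, reduce_vl=True))
```

With these assumed numbers the proposal would run the operation on 2 elements per pass instead of 4, and the strip-mining loop would absorb the difference.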

 


Nick Knight
 

Hi Joseph,

Thanks for the clarification.

The wording in the spec is admittedly vague: "LMUL can have integer values 1,2,4,8.", etc. My understanding of the intent is that all implementations must support the full range of LMUL values.

I'll defer to others to confirm or deny this.

Best,
Nick



David Horner
 

Never say never.
That appears to be the mantra for the V extension.


On Tue, Sep 29, 2020, 15:06 Nick Knight, <nick.knight@...> wrote:
Hi Joseph,

Thanks for the clarification.

The wording in the spec is admittedly vague: "LMUL can have integer values 1,2,4,8.", etc. My understanding of the intent is that all implementations must support the full range of LMUL values.

Yes, the intent is that the V specification mandates LMUL of 8, 4 and 1.
This holds even for minimal systems with VLEN=128, not only for interoperability but because it provides a substantial functional benefit.

A future extension to larger LMUL, commensurate with an expansion of the register set to more than 32 registers, will likely continue to trap if the supported LMUL is exceeded. However, there is an opportunity then to press for auto vl sizing.

Auto vl sizing was previously discussed for all ops, not just for fault-first loads or for widening. It is not currently on the agenda for the v1.0 release.

However, it is expected that components of the V extension will be separately implementable, verifiable, and certifiable.
A reduction of LMUL could be allowed in that context.
Even an expansion of EMUL to 2*max LMUL is still on the table for post v1.0.
Joseph, thank you so much for your insightful input.


I'll defer to others to confirm or deny this.
And thanks to you Nick for replying (before I did).

Best,
Nick
