Proposing more portable vector cod
In the latest vector proposal (draft of version 1.0), there is the following restriction on widening instructions (section 11.2):
"For all widening instructions, the destination EEW and EMUL values must be a supported configuration, otherwise an illegal instruction exception is raised."
This seems unduly restrictive and will limit software portability. If the destination EEW is supported but EMUL is not, it would improve code portability if strip-mining reduces VL accordingly instead of raising an exception.
Similarly, code would be more portable if any requested SEW/LMUL combination were replaced by SEW/LMUL2 when SEW is supported but LMUL is not, where LMUL2 is the highest supported group multiplier for the given SEW.
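As an editorial illustration of the SEW/LMUL2 fallback described above, here is a minimal Python sketch. The `SUPPORTED` table and the function name `resolve_vtype` are hypothetical and describe no real implementation; the sketch only models the selection rule, not hardware behavior.

```python
from fractions import Fraction

# Hypothetical implementation: the LMUL values that are legal for each SEW.
SUPPORTED = {
    32: {Fraction(1, 2), Fraction(1), Fraction(2), Fraction(4), Fraction(8)},
    64: {Fraction(1), Fraction(2), Fraction(4), Fraction(8)},
    128: {Fraction(1)},
}

def resolve_vtype(sew, lmul, supported=SUPPORTED):
    """Model the proposed fallback for a requested SEW/LMUL pair.

    Returns (sew, lmul_used), or None when SEW itself is unsupported
    (the case that would still raise an exception).
    """
    lmuls = supported.get(sew)
    if not lmuls:
        return None
    lmul = Fraction(lmul)
    if lmul in lmuls:
        return sew, lmul          # requested configuration is legal as-is
    return sew, max(lmuls)        # LMUL2: highest supported multiplier for this SEW
```

With this table, `resolve_vtype(128, 2)` falls back to LMUL=1, while `resolve_vtype(64, 2)` is returned unchanged.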
However, in my opinion, it's too much of a burden to do this manually (using vsetvli), and I don't see any portability issues with that.
Best,
Nick Knight
On Tue, Sep 29, 2020 at 9:32 AM Joseph Rahmeh <Joseph.Rahmeh@...> wrote:
In the latest vector proposal (draft of version 1.0), there is the following restriction on widening instructions (section 11.2)
For all widening instructions, the destination EEW and EMUL values must be a supported configuration, otherwise an illegal instruction exception is raised.
This seems unduly restrictive and will limit software portability. If the destination EEW is supported but EMUL is not, it would improve code portability if strip-mining reduces VL accordingly instead of raising an exception.
Similarly, code would be more portable, if any proposed combination of SEW/LMUL is replaced by SEW/LMUL2 if SEW is supported and LMUL is not. LMUL2 would be the highest supported group multiplier for the given SEW.
Hi Nick,
Thanks for the reply. I was not asking for non-power of 2 LMUL. I was asking about LMUL values not supported by some implementation.
Let’s say that for SEW=128, an implementation supports LMUL=1 but no other values of LMUL. If the software tries to use a widening operation with SEW=64 and LMUL=1, then the EEW for the wide operand will be 128 and the EMUL will be 2. The EEW/EMUL combination (128/2) is not supported on this implementation. However, 128/1 is. If we reduce VL instead of taking an illegal instruction exception, the code will work.
Joe
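The arithmetic in the example above can be sketched in Python as follows. This is an editorial illustration only: the value VLEN=256, the `supported` table, and the helper names are assumptions, and `VLMAX = LMUL * VLEN / SEW` is used as the capacity of a register group.

```python
def widened(sew, lmul):
    """Destination EEW/EMUL of a widening instruction: both double."""
    return 2 * sew, 2 * lmul

def vlmax(vlen, sew, lmul):
    """Number of SEW-bit elements that fit in a group of LMUL registers."""
    return (vlen * lmul) // sew

def effective_vl(vl, vlen, sew, lmul, supported):
    """Model the proposed behavior: shrink VL instead of trapping.

    `supported` maps an element width to the set of legal multipliers
    (hypothetical table). Raises ValueError when the destination EEW
    itself is unsupported, which stays an illegal instruction.
    """
    eew, emul = widened(sew, lmul)
    candidates = [m for m in supported.get(eew, ()) if m <= emul]
    if not candidates:
        raise ValueError("illegal instruction: destination EEW unsupported")
    best = max(candidates)
    if best == emul:
        return vl                      # destination configuration is legal
    return min(vl, vlmax(vlen, eew, best))

# Hypothetical machine from the example: SEW=128 supports only LMUL=1.
supported = {64: {1, 2, 4, 8}, 128: {1}}
vl = effective_vl(vl=4, vlen=256, sew=64, lmul=1, supported=supported)
```

Under these assumptions the source operands hold 4 elements (256 / 64), but the destination at 128/1 holds only 2, so VL would drop from 4 to 2.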
From: Nick Knight <nick.knight@...>
Date: Tuesday, September 29, 2020 at 1:31 PM
To: Joseph Rahmeh <Joseph.Rahmeh@...>
Cc: "tech-vector-ext@..." <tech-vector-ext@...>, Robert Golla <Robert.Golla@...>, Cohen Steed <Cohen.Steed@...>, Christopher Olson <Christopher.Olson@...>, Matthew Smittle <Matthew.Smittle@...>, Ajay Ingle <Ajay.Ingle@...>
Subject: Re: [RISC-V] [tech-vector-ext] Proposing more portable vector cod
Hi Joseph,
Thanks for your comments. I apologize, but I don't fully understand your proposal, or the problem it solves. To help explain my confusion, here are two thoughts.
1. The supported LMUL (and EMUL) values are 2^k (k = -3:3) on all implementations, so software requesting EMUL > 8 is illegal everywhere.
2. I agree that we could make widening instructions more flexible by having them decrease VL (and LMUL) so that EMUL becomes valid. The fault-first loads adjust VL automatically, so this is not without some precedent. However, in my opinion, it's too much of a burden to do this manually (using vsetvli), and I don't see any portability issues with that.
Best,
Nick Knight
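For readers unfamiliar with the manual approach mentioned above, the following Python sketch models a strip-mined widening add in which software picks an LMUL whose doubled value stays within the EMUL limit of 8. It is an editorial illustration, not code from the thread: `vsetvli` here is a simplified model that returns min(AVL, VLMAX), and VLEN=128, SEW=16, LMUL=4 are assumed values.

```python
def vsetvli(avl, vlen, sew, lmul):
    """Simplified model of vsetvli: vl = min(AVL, VLMAX).

    VLMAX = LMUL * VLEN / SEW. Real hardware may be allowed other
    choices of vl; this sketch ignores that.
    """
    return min(avl, (vlen * lmul) // sew)

def widening_add(a, b, vlen=128, sew=16, lmul=4):
    """Strip-mined element-wise add whose result is twice as wide.

    Software chooses LMUL so that the destination EMUL = 2 * LMUL
    is at most 8, the largest multiplier named in the message above.
    """
    assert 2 * lmul <= 8, "destination EMUL = 2 * LMUL must not exceed 8"
    out = []
    i, n = 0, len(a)
    while i < n:
        vl = vsetvli(n - i, vlen, sew, lmul)   # elements handled this pass
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out

result = widening_add(list(range(100)), list(range(100)))
```

With the assumed parameters each pass handles at most 32 elements, so 100 elements take four passes.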
Hi Joseph,
Thanks for the clarification.
The wording in the spec is admittedly vague: "LMUL can have integer values 1,2,4,8.", etc. My understanding of the intent is that all implementations must support the full range of LMUL values.
I'll defer to others to confirm or deny this.
Best,
Nick