
Re: Smaller embedded version of the Vector extension

Guy Lemieux
 

I’ve taken a stab at reducing the number of instructions in my RVV-lite proposal. The overriding goal, in my mind, is to preserve forward software compatibility so the ecosystem doesn’t need to fragment.

I have eliminated many instructions that are not essential. I have also dropped or limited the scope of the widening and narrowing instructions; they are awkward to implement because mixing data element sizes changes the demand on register file read or write bandwidth.

Limiting LMUL is far more difficult, because it is fundamental to the way RVV changes data widths. The best I could do in my proposal is require SEW/LMUL to always be 8.
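
To make the SEW/LMUL = 8 constraint concrete, here is a minimal sketch in plain C. The helper names `lmul_for_sew` and `vlmax` are hypothetical and not taken from the proposal; the sketch assumes the usual relation VLMAX = VLEN × LMUL / SEW, under which a fixed ratio of 8 yields the same VLMAX at every element width.

```c
#include <assert.h>

/* Hypothetical helper: the only LMUL allowed for a given SEW when the
 * ratio SEW/LMUL is pinned to 8 (SEW=8 -> 1, 16 -> 2, 32 -> 4, 64 -> 8). */
unsigned lmul_for_sew(unsigned sew_bits)
{
    assert(sew_bits == 8 || sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
    return sew_bits / 8;
}

/* VLMAX = VLEN * LMUL / SEW. With SEW/LMUL == 8 this is always VLEN / 8,
 * so the element count per vector group does not depend on SEW. */
unsigned vlmax(unsigned vlen_bits, unsigned sew_bits)
{
    return vlen_bits * lmul_for_sew(sew_bits) / sew_bits;
}
```

For example, with VLEN = 128 both `vlmax(128, 8)` and `vlmax(128, 64)` evaluate to 16 elements.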

I’m happy to share my proposal on request, but I’ve not broadcast it here because it still needs more work. I’d welcome any thoughts on improving it though.

Guy



On Sun, Jun 27, 2021 at 11:54 PM Gregory Kielian <gkielian@...> wrote:
Hi Everyone, wanted to continue this interesting discussion.


Was wondering if this is a complete listing of the requirements (so far) for the ZVE* extensions, or if there might be another document/spreadsheet/source file that keeps a running list of requirements?

In particular, hoping to check if there might be a running-list of instructions required by the ZVE* extensions (e.g. if we would need to implement vector integer division) and the range of LMUL levels we would be required to support?

Looking forward to continuing the discussion.

All the best,
Gregory



Re: Smaller embedded version of the Vector extension

Gregory Kielian
 

Hi Everyone, wanted to continue this interesting discussion.


Was wondering if this is a complete listing of the requirements (so far) for the ZVE* extensions, or if there might be another document/spreadsheet/source file that keeps a running list of requirements?

In particular, hoping to check if there might be a running-list of instructions required by the ZVE* extensions (e.g. if we would need to implement vector integer division) and the range of LMUL levels we would be required to support?

Looking forward to continuing the discussion.

All the best,
Gregory


Re: No vector TG meeting tomorrow - preparing to start public review

Bill Huffman
 

I did not expect that this meeting, of all meetings, would be cancelled.

 

So I’ll send my comments by email now rather than discussing an important one at the meeting and filing issues for minor ones.  I think #1 is important.  The rest are lesser comments:

  1. I’m (still) concerned about the lack of a whole register load with “mask” type hint.  I think leaving it out damages the clarity of support for micro-architectural redistribution, which is critical for wide SIMD.
    • If we really will not add the instruction, I’m thinking there’s a relatively simple sequence that does the job, like save vl, set it to max, do an ordinary mask load, restore vl.  Could you add that sequence somewhere?  Maybe in the note about not having the mask version.  That would help make the completeness of support clear for micro-architectural redistribution.
  2. In Section 3.5, I’m not sure why vlenb must be a design-time constant.  An implementation could vary it as a mode.  So, I think the wording should allow that.
  3. Section 13.19 has a typo.  It says “widening” when I think it means “narrowing.”
  4. In Section 14, what is the thinking on why there’s no update of the destination of reductions when vl=0?  I can’t see any reason not to update – it seems easier to update – and it could, in some circumstances matter.  This rule means that code has to ensure that it does not execute a reduction in a rare case that vl might be zero unless the source scalar and destination scalar are the same register.  Why require that?
  5. In Section 7.8, I thought we wanted the set of registers a segment load/store could access to be limited to an aligned group of eight to allow optimizations on the renaming of registers.  So a three register group could start at register #3, for example.  But here it says that a sequence can start anywhere.
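
The concern in item 4 can be illustrated with a small scalar model in plain C. This is only a sketch: `model_redsum` is a hypothetical name, masking is ignored, and the model assumes the rule as described above, namely that a reduction with vl=0 performs no operation and leaves the destination untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of an integer sum reduction in the style of vredsum.vs:
 *   *dest = scalar_in + src[0] + ... + src[vl-1]
 * With vl == 0 the modelled rule says "no operation": *dest keeps its old
 * value and is NOT set to scalar_in. */
void model_redsum(int32_t *dest, int32_t scalar_in,
                  const int32_t *src, size_t vl)
{
    if (vl == 0)
        return;                     /* destination left untouched */
    int32_t acc = scalar_in;
    for (size_t i = 0; i < vl; i++)
        acc += src[i];
    *dest = acc;
}
```

If the destination and the scalar source are different registers, a call with vl=0 leaves stale data in the destination instead of the scalar input, which is the case the surrounding code would have to guard against. When both are the same register the result is the same either way.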

 

    Bill

 

From: tech-vector-ext@... <tech-vector-ext@...> On Behalf Of Krste Asanovic
Sent: Friday, June 25, 2021 3:41 AM
To: tech-vector-ext@...
Subject: [RISC-V] [tech-vector-ext] No vector TG meeting tomorrow - preparing to start public review

 

There have been no substantial objections raised on the v1.0-rc1 draft, so I will cancel the meeting tomorrow.

There are some minor suggestions and edits (thank you!), and I will incorporate these into the v1.0 version which we’ll freeze for public review, hopefully starting in a day or at most a few days.

Remember, you can still comment during public review.

Krste

 


Re: No vector TG meeting tomorrow - preparing to start public review

David Horner
 

I am disappointed that the meeting was cancelled.

I am concerned that jnk0le's contributions and concerns, in particular, have been dismissed and, to the extent that they were objections to the draft, have been categorized as not substantial.

The meeting was a final verification/certification opportunity. A ratification of sorts.

I have found our meetings productive; collectively we have raised concerns and insights that have improved the V extension and its documentation.

I intended to raise some concerns that are difficult to express as an "issue" and hoped to get the group's input more dynamically.

Given the meeting is cancelled, I will type these in as best I can as issues.

I apologize that I did not anticipate the cancellation of the meeting, and thus that I let my life interfere with providing these concerns as issues.

[Our son suddenly was at the top of the waiting list for long term care placement, which in Ontario means that you drop everything to make it work. Further, with covid restrictions we needed to ensure that, on admission, he was completely set up, as he is now quarantined for 14 days. Also, they messed up his medication and I spent the better part of a day, and the night it happened, getting that corrected. He has Dopamine Responsive Dystonia and takes levo/carbidopa CR and pramipexole. Thus, they thought he has Parkinson's and concentrated his medications in daytime hours, leaving him without his medication for 12 hours overnight. It took a full 36 hours before he was stabilized and free of adverse effects. And this is happening against the background of us selling our house, downsizing, moving, setting up new rental accommodation, shutting off old services and handling legal matters, including document preparation and signing papers. Once again, I apologize; I know of others under worse constraints who are on the lists etc. 24/7. So it is just me, given others can contribute when their life is hectic.]


Contributing during public review does not prepare the document for public consumption. I believe cancelling the meeting is a lost opportunity to do that.



On 2021-06-25 3:40 a.m., Krste Asanovic wrote:
There have been no substantial objections raised on the v1.0-rc1 draft, so I will cancel the meeting tomorrow.

There are some minor suggestions and edits (thank you!), and I will incorporate these into the v1.0 version which we’ll freeze for public review, hopefully starting in a day or at most a few days.

Remember, you can still comment during public review.

Krste


No vector TG meeting tomorrow - preparing to start public review

Krste Asanovic
 

There have been no substantial objections raised on the v1.0-rc1 draft, so I will cancel the meeting tomorrow.

There are some minor suggestions and edits (thank you!), and I will incorporate these into the v1.0 version which we’ll freeze for public review, hopefully starting in a day or at most a few days.

Remember, you can still comment during public review.

Krste


Potential Vector Task Group Meeting and v1.0-rc1 review reminder by June 25

Krste Asanovic
 

Unless there are significant issues raised by the group on the v1.0-rc1 spec, the intent is to go into public review on June 25th, so please make sure to give any feedback before then.
There are a few pending edits received already - thank you all for the feedback.

Reminder that there will be no vector TG meeting this Friday, but we have tentatively scheduled a meeting next Friday June 25.
We will only meet on June 25 if there are issues raised that would stall freeze and entering public review.

Remember, it is also possible to give feedback during the public review period.
The main goal before June 25 is to make sure the group is OK with this version being officially frozen and released to the wild for review.

Krste


Re: Reply: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Guy Lemieux
 

thanks, I see the api comments now too.

pretty easy to do using vslide1up/dn and vmerge

g
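
A rough scalar model in plain C of how a TRN1-style operation could be built from a slide and a merge, as suggested above. It assumes TRN1 has its NEON/SVE meaning (even-numbered elements of the two sources, interleaved) and that vl is even; the function name `model_trn1` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of TRN1 on vl elements:
 *   out = { a[0], b[0], a[2], b[2], ... }
 * Step 1 mimics vslide1up on b, step 2 mimics vmerge under the
 * alternating mask 0,1,0,1,... */
void model_trn1(const int32_t *a, const int32_t *b, int32_t *out, size_t vl)
{
    assert(vl % 2 == 0);
    for (size_t i = 0; i < vl; i++) {
        /* vslide1up: lane i of the slid vector holds b[i-1]; lane 0 takes
         * a scalar, which is never selected because even lanes come from a. */
        int32_t b_up = (i == 0) ? 0 : b[i - 1];
        /* vmerge: even lanes take a, odd lanes take the slid copy of b. */
        out[i] = (i % 2 == 0) ? a[i] : b_up;
    }
}
```

The odd-element variant would follow the same pattern with a slide down on `a` and the mask sense swapped.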

On Tue, Jun 15, 2021 at 2:12 PM David Horner <ds2horner@...> wrote:
@guy: I don't know what each letter stands for, but the link has this:
TRN1: Interleave even elements from two vectors.
I assume there is TRN for odds and maybe mixed?




Re: Reply: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

David Horner
 

@guy: I don't know what each letter stands for, but the link has this:
TRN1: Interleave even elements from two vectors.
I assume there is TRN for odds and maybe mixed?




Re: Reply: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Roger Ferrer Ibanez
 

Hi,

I didn't realise this was the private list and that Romain is not on it, so I'm forwarding his answer on his behalf:


There is some sample code that emulates SVE/NEON-like instructions using the historical EPI intrinsics (it should be mostly self-explanatory; Roger will definitely know them) here:

<https://github.com/rdolbeau/EPI-test-codes-vector/blob/master/simplefft16-rvv/vf64.h>

Other subdirectories of the same repo include integer variants.

Not all functions may work for non-power-of-two lengths, but they can be a starting point for some useful patterns.

Cordially,
Hope this helps.

Kind regards,

On 15/6/21 9:08, Roger Ferrer Ibanez wrote:

Hi,

I agree that computing those indexes is not always trivial.

Some ideas you can consider (not claiming these are the most efficient ways):

  • reverse is not too complex: vid.v + vrsub.vx, using vl-1 as the scalar to subtract from (element i then gets index vl-1-i)
  • zip is harder: start with the "halved indexes" vid + vsrl.vi (0, 0, 1, 1, 2, 2, 3, 3, ...), then compute the alternating vector (0, 1, 0, 1, 0, 1, ...) and multiply it (or, if a power of two, shift it) by the first index of the second vector (which may be vl/2 in your case). So you get 0, vl/2, 0, vl/2, 0, vl/2, .... Then add this vector to the halved indexes so you get 0, vl/2, 1, 1+vl/2, 2, 2+vl/2, ...
  • unzip: in the worst case, you can reverse what you did for zip
  • trn: I don't have any ideas off the top of my head; /cc Romain Dolbeau, who may recall how he worked around the cases in FFTW where he needed a trn-like operation
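
The reverse and zip index recipes in the list above can be checked with a small scalar model in plain C. This is a sketch only: the function names are hypothetical, each loop stands in for the vector instruction named in its comment, vl is assumed even for zip, and the reverse index uses vl-1 as the scalar so that the indexes stay zero-based.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* reverse: vid.v yields i, vrsub.vx with scalar (vl - 1) yields vl-1-i. */
void reverse_indices(uint32_t *idx, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        idx[i] = (uint32_t)(vl - 1 - i);
}

/* zip of the lower and upper halves of one vl-element group:
 * idx[i] = (i >> 1) + (i & 1) * (vl / 2)  ->  0, vl/2, 1, 1+vl/2, ... */
void zip_indices(uint32_t *idx, size_t vl)
{
    assert(vl % 2 == 0);
    for (size_t i = 0; i < vl; i++) {
        uint32_t halved = (uint32_t)(i >> 1);   /* vid.v then vsrl.vi by 1 */
        uint32_t parity = (uint32_t)(i & 1);    /* vid.v then vand.vi with 1 */
        idx[i] = halved + parity * (uint32_t)(vl / 2);
    }
}

/* Model of vrgather.vv: out[i] = src[idx[i]] (indexes assumed in range). */
void gather(const int32_t *src, const uint32_t *idx, int32_t *out, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        out[i] = src[idx[i]];
}
```

Neither index vector depends on a table in memory, so the same code works for any vl the hardware returns.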

Hope this helps.

Kind regards,

On 15/6/21 8:14, Linjie Yu via lists.riscv.org wrote:
Dear Craig and Roger,

    Thanks a lot for the good solutions. I have tried them, and they all work well for the upsample application.
    But for other applications, such as zip/unzip, trn, reverse and so on, the index values are still difficult to initialize.

Best Regards
Damon Yu


------------------ Original Message ------------------
Sent: 06/11/21 18:51:12
Subject: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Hi Linjie,

I'm not sure I understood your question. I think a vid.v (with a vl of your choice) followed by a logical shift right by 1 bit (vsrl.vi) would generate an index vector like the one you now keep in the "index[]" array.

This does not seem to require hardcoding any size, and you don't have to load a materialised value from memory (you compute it instead).
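
A scalar sketch in plain C of the suggestion above: the index is computed as i >> 1 (the vid.v + vsrl.vi step) instead of being loaded from a constant array. The names `upsample2_strip` and `upsample2` are hypothetical, and `vlmax_src` is an assumed stand-in for whatever element count the hardware grants per pass.

```c
#include <assert.h>
#include <stddef.h>

/* One strip: n_src source elements produce 2 * n_src outputs. */
void upsample2_strip(const float *src, float *dst, size_t n_src)
{
    for (size_t i = 0; i < 2 * n_src; i++) {
        size_t idx = i >> 1;    /* vid.v then vsrl.vi by 1: 0,0,1,1,2,2,... */
        dst[i] = src[idx];      /* vrgather.vv with the computed index */
    }
}

/* Strip-mined loop: works for any per-pass element count, so no table
 * sized for the largest VLEN is needed. */
void upsample2(const float *src, float *dst, size_t length, size_t vlmax_src)
{
    assert(vlmax_src > 0);
    while (length > 0) {
        size_t gvl = length < vlmax_src ? length : vlmax_src;
        upsample2_strip(src, dst, gvl);
        src += gvl;
        dst += 2 * gvl;         /* two outputs per source element */
        length -= gvl;
    }
}
```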

Hope this helps.

Kind regards,

On 11/6/21 9:22, Linjie Yu via lists.riscv.org wrote:
Hi, all

    I recently ran into a difficulty applying the "vrgather" instruction. The details are shown below:
    In an upsample application, each element of the source data should be duplicated as a pair.
     E.g.:  src = [0, 1, 2, 3, 4, 5, 6, 7, 8]
            dst = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8]
     So, my realization is:
--------------------------------------------------------------------------------------------------
  int index[64] = {0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 ........,31,31}; // sized to be compatible with every VLEN (128 ~ 1024)

  vfloat32m2_t data = vundefined_f32m2();
  vfloat32m1_t zero = (vfloat32m1_t)vmv_v_i_i32m1(0);
  while (length > 0)
  {
      int gvl = vsetvl_e32m1(length);                        // source elements handled in this pass
      vuint32m2_t v_index = vle32_v_u32m2(index, 2 * gvl);   // each source element is indexed twice
      vfloat32m1_t src_data = vle32_v_f32m1(src, gvl);
      data = vset_f32m2(data, src_data, 0);                  // first register of the group: source data
      data = vset_f32m2(data, zero, 1);                      // second register of the group: zeros
      vfloat32m2_t res = vrgather_vv_f32m2(data, v_index, 2 * gvl);
      length -= gvl;
      src += gvl;
      vse32_v_f32m2(dst, res, 2 * gvl);                      // two outputs per source element
      dst += 2 * gvl;
  }
--------------------------------------------------------------------------------------------------
    As shown above, the index data has to be initialized for the maximum VLEN to keep the code compatible.
The same goes for every application that needs a constant.
    I think this is contrary to the idea of RISC-V that one piece of code can run on all RISC-V hardware. Does anyone have a better method?

Best Regards
Damon Yu


------------------ Original Message ------------------
Sent: 06/09/21 14:46:30
Subject: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

I've just tagged the first release candidate for v1.0 of the vector
spec in github.  PDF attached below.

I've included the TG agreed updates and handled almost all of the
outstanding issues for v1.0.  Thanks for all the feedback.

I would appreciate if folks could read through the whole thing and
give comments over email and through Github issues.  Please also
submit PRs for typos and other wording improvements.

I'd like to try and settle most concerns over email if possible, and
assume it'll take a little while for everyone to go through the doc.

I'll tentatively schedule a vector TG meeting on Friday June 25 to go
over issues best dealt with on a call.  I'm hoping we can enter public
review at the same point in time.  Remember, we don't have to reach
agreement on all the issues before starting public review, just be OK
as a group with putting this out for public review.

Krste






-- 
Roger Ferrer Ibáñez - roger.ferrer@...
Barcelona Supercomputing Center - Centro Nacional de Supercomputación


Re: Reply: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Guy Lemieux
 

for zip, you don’t need to use vrgather. instead, use vector widening (e.g. vwaddu with x0) to double the element size (ensuring the MSBs are 0) for one half of the set.
apply vslide1up and widening to the second set, then add the two sets (using original SEW).

for unzip, you can do the reverse. use narrowing for one half and vslide1down (in SEW/2) + narrowing for the other half.
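
a scalar sketch in plain C of one reading of the zip recipe above, for SEW = 8. it widens both inputs first and then slides the second one up by one element at the original SEW; that ordering, the little-endian view of the widened elements, the name `zip_by_widening` and the `MAX_ELEMS` bound are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ELEMS 64    /* arbitrary bound for this sketch */

/* zip a[0..n-1] and b[0..n-1] into out[0..2n-1] = a0, b0, a1, b1, ... */
void zip_by_widening(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
{
    assert(n > 0 && n <= MAX_ELEMS);
    uint16_t wa[MAX_ELEMS], wb[MAX_ELEMS];
    uint8_t va[2 * MAX_ELEMS], vb[2 * MAX_ELEMS];

    /* widening add with zero (vwaddu with x0): 16-bit elements, upper byte 0 */
    for (size_t i = 0; i < n; i++) {
        wa[i] = a[i];
        wb[i] = b[i];
    }

    /* view the widened vectors as 2n 8-bit elements:
     * va = a0,0,a1,0,...   vb = b0,0,b1,0,... */
    for (size_t i = 0; i < n; i++) {
        va[2 * i]     = (uint8_t)(wa[i] & 0xff);
        va[2 * i + 1] = (uint8_t)(wa[i] >> 8);
        vb[2 * i]     = (uint8_t)(wb[i] & 0xff);
        vb[2 * i + 1] = (uint8_t)(wb[i] >> 8);
    }

    /* vslide1up on vb at SEW=8 with scalar 0 gives 0,b0,0,b1,...;
     * adding it to va at SEW=8 interleaves the two sets. */
    for (size_t i = 0; i < 2 * n; i++) {
        uint8_t vb_up = (i == 0) ? 0 : vb[i - 1];
        out[i] = (uint8_t)(va[i] + vb_up);
    }
}
```

every lane of the final add has one zero operand, so the add cannot carry between elements.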

sorry, what is trn?

g


On Tue, Jun 15, 2021 at 12:08 AM Roger Ferrer Ibanez <roger.ferrer@...> wrote:

Hi,

I agree that computing those indexes is not always trivial.

Some ideas you can consider (not claiming these are the most efficient ways):

  • reverse is not too complex: vid.v + vrsub.vx, using vl-1 as the scalar to subtract from (element i then gets index vl-1-i)
  • zip is harder: start with the "halved indexes" vid + vsrl.vi (0, 0, 1, 1, 2, 2, 3, 3, ...), then compute the alternating vector (0, 1, 0, 1, 0, 1, ...) and multiply it (or, if a power of two, shift it) by the first index of the second vector (which may be vl/2 in your case). So you get 0, vl/2, 0, vl/2, 0, vl/2, .... Then add this vector to the halved indexes so you get 0, vl/2, 1, 1+vl/2, 2, 2+vl/2, ...
  • unzip: in the worst case, you can reverse what you did for zip
  • trn: I don't have any ideas off the top of my head; /cc Romain Dolbeau, who may recall how he worked around the cases in FFTW where he needed a trn-like operation

Hope this helps.

Kind regards,





of theindividual or entity to which it is addressed and may

containinformation which is privileged, confidential,

proprietary, or exemptfrom disclosure under applicable law. If

you are not the intendedrecipient or the person responsible for

delivering the message to theintended recipient, you are

strictly prohibited from disclosing,distributing, copying, or in

any way using this message. If you havereceived this

communication in error, please notify the sender anddestroy and

delete any copies you may have received.





http://www.bsc.es/disclaimer








-- 

Roger Ferrer Ibáñez - roger.ferrer@...

Barcelona Supercomputing Center - Centro Nacional de Supercomputación















Re: Reply: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Roger Ferrer Ibanez
 

Hi,

I agree that computing those indexes is not always trivial.

Some ideas you can consider (not claiming these are the most efficient ways):

  • reverse is not too complex: vid.v + vrsub.vx, using vl-1 as the scalar to subtract from (so element i gets index vl-1-i)
  • zip is harder: start with the "halved indexes" from vid.v + vsrl.vi (0, 0, 1, 1, 2, 2, 3, 3, ...), then compute the alternating vector (0, 1, 0, 1, 0, 1, ...) and multiply it (or, if a power of two, shift it) by the first index of the second vector (which may be vl/2 in your case). So you get 0, vl/2, 0, vl/2, 0, vl/2, .... Then add this vector to the halved indexes so you get 0, vl/2, 1, 1+vl/2, 2, 2+vl/2, ...
  • unzip: in the worst case you can reverse what you did for zip
  • trn: I don't have any ideas off the top of my head; /cc Romain Dolbeau, who may recall how he worked around the cases in FFTW where he needed a trn-like operation
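
As a rough illustration of the reverse and zip recipes, here is a scalar C model of the index arithmetic. It is only a sketch: the function names are made up for this example, the loops stand in for the vector instructions named in the comments, and the gather model uses vl as the out-of-range bound for simplicity. It is not RVV intrinsics code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models vid.v followed by vrsub.vx with the scalar (vl - 1):
 * idx[i] = (vl - 1) - i, which are the gather indices for a reversal. */
void reverse_indices(uint32_t *idx, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        idx[i] = (uint32_t)(vl - 1 - i);
}

/* Models the zip recipe:
 *   half   = vid >> 1        (0, 0, 1, 1, ...)
 *   parity = vid & 1         (0, 1, 0, 1, ...)
 *   idx    = half + parity * (vl / 2)
 * Assumes vl is even and the second input starts at element vl/2. */
void zip_indices(uint32_t *idx, size_t vl)
{
    assert(vl % 2 == 0);
    for (size_t i = 0; i < vl; i++)
        idx[i] = (uint32_t)((i >> 1) + (i & 1) * (vl / 2));
}

/* Models vrgather.vv: dst[i] = src[idx[i]], with 0 for out-of-range indices
 * (simplified here to use vl as the bound). */
void gather(float *dst, const float *src, const uint32_t *idx, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = (idx[i] < vl) ? src[idx[i]] : 0.0f;
}
```

With vl = 8, zip_indices yields 0, 4, 1, 5, 2, 6, 3, 7, so gathering interleaves the first and second halves of the source register.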

Hope this helps.

Kind regards,

-- 
Roger Ferrer Ibáñez - roger.ferrer@...
Barcelona Supercomputing Center - Centro Nacional de Supercomputación


Reply: Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Linjie Yu
 

Dear  Craig and Roger,

    Thanks a lot for the good solutions. I have tried them, and they all work well for the upsample application.
    But for other applications, such as zip/unzip, trn, reverse and so on, the index values are still difficult to initialize.

Best Regards
Damon Yu




Background for Policy/Workflow revisions on Github close concern.

David Horner
 

Andrew Waterman called Thursday and we discussed many issues including challenges with Issues in Github.

We determined that we were both unaware of some relevant aspects [neither of us intentionally blind], and that neither of us was exaggerating our concerns [neither of us Chicken Littles].

Andrew volunteered me to look into Github functionality and features that might diminish some of the contention/problems related to "workflow" and "issue expression".

I consider myself duly deputized to:

  Recommend modifications to a de facto proposal by Krste on how to use Github, and

  To make any further suggestions that I consider valuable.


But before I get into that specific concern, there are related concerns

a) timeliness and completion of TG/SIG/other minutes

b) github issues vs group/list discussions

c) Philosophies and World Views.


It is the last of these that I want to expound on, to help us understand the origin of some of the conflicts and to plot directions for resolution/enhancement in ISA development.


On the call, I admitted to being Process and Enablement oriented,  Andrew to being Task and Results oriented.


Those familiar with Holism vs Reductionism know/understand that

1) both views are necessary to achieve significant results

2) at any point [level] in analysis there is value in considering

   i) the holistic nature of the concern/situation/object; its behaviour as a whole and response in the context of its environment [a sloppy definition, I know] and

   ii) the examination of its components; what it is made of, and why the what is needed [if needed] for what it does and doesn't do. [an even worse definition].

3) [most] persons have a bias towards one or other mode of analysis, but no one [who analyzes] is wholly one or the other.


These points are applicable to Process|Enablement  vs Task|Results.

1) both views are necessary to achieve significant results

2) at any point [level] in accomplishment there is value in considering


i) the Process|Enablement context; what enables the activity/accomplishment to occur.
How can it be enhanced/leveraged to assist in providing the desired outcome.
Put into place [Enable] the resources to accomplish the goal and let the process deliver the results. 

ii) identify the Task/activity that will [hopefully] yield the result and work that activity to completion.
Verify the results are as hoped for and check the task off the list
[or check the task off the list and move to the next Task of verifying the result].

3) [most] persons have a bias towards one or other mode of operation.


So, back to Github and Issue Resolution.

A. These different World Views may conflict at any given point [level] of endeavor.


It occurred in this instance in the conflict of when an issue should be closed.

For the Task|Result oriented, Issue Resolution is closing the issue.
     Task done, move on to next.


For the Process|Enablement oriented, Issue Resolution is completing all the concerns expressed by ensuring a process is in place to address them.

     Closing prematurely aborts that process.


B. Github does not provide a robust Workflow Lifecycle in its base product for Issues.

For pull requests there is a support structure with: validations,  review requests/responses, and "sign-off" provisions.

Issue Life-cycle support is a binary open/closed status [no UnderInvestigation, RequestingConsultation, TentativeResolution, AwaitingSignOff].

It does provide links to/from other issues, and more significantly has a CloseRelatedIssue Button when a pull is being applied to the repository.

Andrew mentioned the strong temptation to use this button when "finalizing" a pull request.

External email scripting could provide some functionality that would enhance the workflow,

  but the use agreement appears to discourage if not explicitly forbid such "non-standard" interfaces.

Github may have a pay-for-feature process [I remember such before its recent acquisition], however,

  I believe we can make do with moderate behavioural changes and still be effective in serving all of the RISCV community.


C. Oh. Contrary to Andrew's inference from https://github.com/riscv/riscv-isa-manual/pull/657#issuecomment-858481023 ,

          I do not believe unilateral control is desirable nor essential to Issue management "workflow".


I will have specific proposals soon.


Re: Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Roger Ferrer Ibanez
 

Hi Linjie,

I'm not sure I understood your question. I think a vid.v (with a vl of your choice) followed by a logical shift right by 1 bit (vsrl.vi) would generate an index vector like the one you now have in the "index[]" array.

This does not seem to require hardcoding any size, and you don't have to load a materialised value from memory (you compute it instead).
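
To make the idea concrete, here is a scalar C model of an upsample loop that computes the indices instead of loading them from a table. It is a sketch under assumptions: `upsample` and its `vlmax` parameter are invented for this illustration, and the inner loop stands in for vid.v, vsrl.vi and vrgather.vv. It is not intrinsics code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Duplicates every source element: dst gets 2 * length elements.
 * vlmax models the number of output elements one vector pass can hold;
 * it is a parameter here so the same code works for any value >= 2. */
void upsample(float *dst, const float *src, size_t length, size_t vlmax)
{
    assert(vlmax >= 2);
    while (length > 0) {
        /* Source elements consumed this pass (half the output capacity). */
        size_t n_in = length < vlmax / 2 ? length : vlmax / 2;
        size_t vl = 2 * n_in;                    /* output elements this pass */
        for (size_t i = 0; i < vl; i++) {
            uint32_t idx = (uint32_t)(i >> 1);   /* vid.v then vsrl.vi by 1 */
            dst[i] = src[idx];                   /* vrgather.vv */
        }
        src += n_in;
        dst += vl;
        length -= n_in;
    }
}
```

Because the indices depend only on the element position, no constant array sized for the largest VLEN is needed.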

Hope this helps.

Kind regards,


-- 
Roger Ferrer Ibáñez - roger.ferrer@...
Barcelona Supercomputing Center - Centro Nacional de Supercomputación


WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer


Reply: [RISC-V] [tech-vector-ext] RISC-V Vector Spec version 1.0-rc1-20210608

Linjie Yu
 

Hi all,

    I recently ran into a difficulty applying the "vrgather" instruction. The details are shown below.
    In an upsample application, each source element should be duplicated to form a pair.
     E.g.: src = [0, 1, 2, 3, 4, 5, 6, 7, 8]
           dst = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8]
     My implementation is:
--------------------------------------------------------------------------------------------------
    // Index table sized to be compatible with every VLEN (128 ~ 1024)
    uint32_t index[64] = {0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 ........,31,31};

    vfloat32m2_t data = vundefined_f32m2();
    vfloat32m1_t zero = (vfloat32m1_t)vmv_v_i_i32m1(0);
    while (length > 0)
    {
        size_t gvl = vsetvl_e32m1(length);                        // source elements handled in this pass
        vuint32m2_t v_index = vle32_v_u32m2(index, 2 * gvl);      // two indices per source element
        vfloat32m1_t src_data = vle32_v_f32m1(src, gvl);
        data = vset_f32m2(data, src_data, 0);                     // lower half: source data
        data = vset_f32m2(data, zero, 1);                         // upper half: zeros
        vfloat32m2_t res = vrgather_vv_f32m2(data, v_index, 2 * gvl);
        length -= gvl;
        src += gvl;
        vse32_v_f32m2(dst, res, 2 * gvl);                         // each pass writes 2 * gvl outputs
        dst += 2 * gvl;
    }
-----------------------------------------------------------------------------------------------
    As shown above, the index data has to be initialized for the maximum VLEN to keep the code compatible.
The same is true of every application that needs a constant index table.
    I think this is contrary to the RISC-V idea that one piece of code can run on all RISC-V hardware. Does anyone have a better method?

Best Regards
Damon Yu









RISC-V Vector Spec version 1.0-rc1-20210608

Krste Asanovic
 

I've just tagged the first release candidate for v1.0 of the vector
spec in github. PDF attached below.

I've included the TG agreed updates and handled almost all of the
outstanding issues for v1.0. Thanks for all the feedback.

I would appreciate if folks could read through the whole thing and
give comments over email and through Github issues. Please also
submit PRs for typos and other wording improvements.

I'd like to try and settle most concerns over email if possible, and
assume it'll take a little while for everyone to go through the doc.

I'll tentatively schedule a vector TG meeting on Friday June 25 to go
over issues best dealt with on a call. I'm hoping we can enter public
review at the same point in time. Remember, we don't have to reach
agreement on all the issues before starting public review, just be OK
as a group with putting this out for public review.

Krste


Re: Smaller embedded version of the Vector extension

Bruce Hoult
 

On Fri, Jun 4, 2021 at 8:09 AM Zalman Stern via lists.riscv.org <zalman=google.com@...> wrote:
If the minimum VLEN is at least 128-bits, one can translate NEON/SSE intrinsics directly without having to have every vector instruction dominated by a loop over the vector length.

This is an excellent point, but there are only 8 SSE/AVX/AVX2 registers in 32 bit mode and 16 in 64 bit.

Therefore a 32 bit RISC-V could use 32 bit VLEN and LMUL=4 to directly translate SSE code without stripmining, and a 64 bit RISC-V could use 64 bit VLEN and LMUL=2. For AVX/AVX2 VLEN=64 is required on 32 bit and VLEN=128 on 64 bit, using the same LMUL.

Similarly, 32 bit ARM NEON works as sixteen 128 bit registers or thirty two 64 bit registers. Thus a 32 bit RISC-V with VLEN=64 can directly translate NEON code using LMUL=1 or LMUL=2.

Aarch64 has thirty two registers of 128 bits each, which can also be treated as thirty two registers of 64 bits each (effectively just setting a smaller VL, the upper half is zeroed). So directly porting 64 bit ARM Advanced SIMD code does require 128 bit registers.

For maximum SIMD-porting compatibility with both ARM and x86 code a 64 bit RISC-V needs VLEN=128 but a 32 bit RISC-V is fine with VLEN=64.
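
The register arithmetic above can be checked with a small C sketch. It relies on two facts about RVV: there are 32 architectural vector registers, and a register group at a given LMUL spans VLEN * LMUL bits, leaving 32 / LMUL usable groups. The names `shape_for` and `covers` are invented for this illustration, and only integer LMUL values are modelled.

```c
#include <assert.h>

/* Shape of the vector register file as seen at a given LMUL. */
struct vreg_shape {
    unsigned groups;      /* usable register groups: 32 / LMUL */
    unsigned group_bits;  /* bits per group: VLEN * LMUL */
};

struct vreg_shape shape_for(unsigned vlen_bits, unsigned lmul)
{
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    struct vreg_shape s = { 32u / lmul, vlen_bits * lmul };
    return s;
}

/* Returns 1 if the configuration offers at least `regs` registers of at
 * least `bits` bits each, i.e. a fixed-width SIMD register file of that
 * shape maps onto it one-to-one without stripmining. */
int covers(unsigned vlen_bits, unsigned lmul, unsigned regs, unsigned bits)
{
    struct vreg_shape s = shape_for(vlen_bits, lmul);
    return s.groups >= regs && s.group_bits >= bits;
}
```

For example, covers(32, 4, 8, 128) holds for 32 bit SSE (8 registers of 128 bits), while covers(64, 2, 32, 128) fails for AArch64, which is why that case needs VLEN=128.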


Re: Smaller embedded version of the Vector extension

Krste Asanovic
 

If there were no cost, then supporting VLEN=64 in the general apps
processor profile would be a good thing to do. But not allowing
standard software to assume VLEN>=128 imposes a non-trivial impact on
bigger cores, and the expectation is that the vast majority of apps cores will
want VLEN>=128.

As Zalman points out, the main advantage is removing stripmining code
when it is known vectors will fit, and translating existing code is
one important such use case though not the only one. Removing
stripmining reduces static and dynamic code size and increases
performance. While LMUL>1 allows more cases to be handled without
stripmining, it also reduces available arch registers.
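
As a rough sketch of what stripmining costs, the scalar C model below contrasts a VLEN-agnostic loop with a fixed-size routine that is only valid when the data is known to fit. All names are hypothetical, and `model_vsetvl` is a simplification that always grants min(avl, vlmax).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of vsetvl: grant min(avl, vlmax) elements. */
size_t model_vsetvl(size_t avl, size_t vlmax)
{
    return avl < vlmax ? avl : vlmax;
}

/* VLEN-agnostic version: loop control and pointer bumps on every pass. */
void add_stripmined(int *c, const int *a, const int *b, size_t n, size_t vlmax)
{
    assert(vlmax > 0);
    while (n > 0) {
        size_t vl = model_vsetvl(n, vlmax);
        for (size_t i = 0; i < vl; i++)   /* stands in for one vector add */
            c[i] = a[i] + b[i];
        a += vl;
        b += vl;
        c += vl;
        n -= vl;
    }
}

/* Fixed-size version: no loop, but only correct if four elements fit in one
 * vector (four 32-bit elements need VLEN >= 128 at LMUL=1). */
void add_fixed4(int *c, const int *a, const int *b, size_t vlmax)
{
    assert(vlmax >= 4);
    for (size_t i = 0; i < 4; i++)        /* stands in for one vector add */
        c[i] = a[i] + b[i];
}
```

The fixed-size routine drops the loop, the vl computation and the pointer updates, which is the static and dynamic code-size saving being described.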

Anyone can of course still build a compatible apps processor with
VLEN=64, but this would fail to run some of the code written for
VLEN>=128 case. And almost anything goes in embedded space.

Krste

On Thu, 3 Jun 2021 13:35:03 -0700, Zalman Stern <zalman@google.com> said:
| "...if written correctly" is precisely the point. If VLEN is specified as >=128, code that targets 128-bits explicitly by
| setting VL to an appropriate constant for a large swath *is* correct. This allows one to do basically what NEON/SSE do today as
| a baseline for performance.

| Whether this is worthwhile or not may be debated, but insisting that everything should be completely vector length agnostic or
| it is broken is missing the point. Ideally there would be a lot more quantitative data on this, but I'm not going to tilt at
| that windmill right now. The worst case for the overhead of hardware vector length independence occurs at the smallest sizes as
| well.

| In general it's pretty dubious that the same set of fully lowered instruction bits can efficiently cover everything from the
| bottom of the embedded space to HPC. Ideally we'd be moving to more sophisticated lowering -- e.g. dynamic and multi-stage
| compilation -- rather than forcing the issue in the ISA design.

| Another way to go would be to split 32-bit and 64-bit implementations such that the VLEN >= 64 for 32-bit implementations and
| VLEN >= 128 for 64-bit ones. (Application code is rarely going to target 32-bit these days. Minimal embedded implementations
| are probably 32-bit.) Though truth be told, code likely needs a scalar fallback anyway unless the V extension is required.
| (Which it almost certainly won't be if we're talking embedded space.) As such, VLEN not being large enough for the expectations
| code was compiled to is the same as not having the vector unit.

| -Z-

| On Thu, Jun 3, 2021 at 9:33 AM Tony Cole via lists.riscv.org <tony.cole=huawei.com@lists.riscv.org> wrote:

| Software should still work with VLEN>=64 if written correctly, as it should be VLEN agnostic.
| Maybe it should be a recommendation that VLEN>=128, with a minimum of 64 for app processors?

| Lower performance is an implementation cost/benefit decision.

| Tony

| -----Original Message-----
| From: tech-vector-ext@lists.riscv.org [mailto:tech-vector-ext@lists.riscv.org] On Behalf Of Krste Asanovic
| Sent: 03 June 2021 17:24
| To: Guy Lemieux <guy.lemieux@gmail.com>
| Cc: Andrew Waterman <andrew@sifive.com>; Tariq Kurd <tariq.kurd@huawei.com>; Shaofei (B) <shaofei1@hisilicon.com>;
| tech-vector-ext@lists.riscv.org
| Subject: Re: [RISC-V] [tech-vector-ext] Smaller embedded version of the Vector extension

|| On Jun 3, 2021, at 9:16 AM, Guy Lemieux <guy.lemieux@gmail.com> wrote:
||
|| What is the advantage to RVV requiring VLEN >= 128?
||
|| I think this should be changed to VLEN >= 64 because:
||
|| 1) VLEN = 64 is more likely for small implementations; creating a
|| mandatory expectation to improve software portability

| This is the requirement for app processors, which are not generally small cores.
| Most competing SIMD extensions are at least 128b per vector register.

||
|| 2) two implementations, each with VLEN >= 64, do not expose anything
|| new to software that is not already exposed by VLEN >= 128
||
|| 3) allowing VLEN =32 would expose something new to software (register
|| file data layout when SEW=64)
||
|| 4) are there any disadvantages to VLEN >= 64 (versus the current VLEN
|| >= 128)? (I can't see any)

| Lower performance on codes that work well on other app architectures.

| Krste

||
|| Guy
||
||
|| On Wed, Jun 2, 2021 at 11:13 AM <krste@berkeley.edu> wrote:
|||
|||
||| The VLEN>=128 constraint is only for the application processor "V"
||| extension for the app profile - not for embedded vectors which can
||| have VLEN=32.
|||
||| From spec Introduction:
||| '
||| The term base vector extension is used informally to describe the standard set of vector ISA components that will be
||| required for the single-letter "V" extension, which is intended for use in standard server and application-processor
||| platform profiles. The set of mandatory instructions and supported element widths will vary with the base ISA (RV32I,
||| RV64I) as described below.
|||
||| Other profiles, including embedded profiles, may choose to mandate only subsets of these extensions. The exact set of
||| mandatory supported instructions for an implementation to be compliant with a given profile will only be determined when
||| each profile spec is ratified. For convenience in defining subset profiles, vector instruction subsets are given ISA string
||| names beginning with the "Zv" prefix.
||| '
|||
||| There is a set of Zve* names for the embedded subsets (see github issue
||| #550).
|||
||| A minimal embedded implementation using RV32E+Zfinx+vectors would have the
||| same state size as ARM MVE.
|||
||| P extension does not have floating-point, but for short
||| integer/fixed-point SIMD makes sense as alternative.
|||
||| The software fragmentation issue is that some library routines that
||| expose VLEN might not be portable between app cores and embedded
||| cores, but these are different software ecosystems (e.g. ABI/calling
||| convention might be different) and only a few kinds of routine rely
||| on VLEN.
|||
||| For app cores that can afford VLEN>=128, the advantage is the removal
||| of stripmining code in cases that operate on fixed-size vectors.
|||
||| Krste
|||
|||
|||
|||||||| On Wed, 2 Jun 2021 05:10:32 -0700, "Guy Lemieux" <guy.lemieux@gmail.com> said:
|||
||| | Allowing VLEN<128 would allow for smaller vector register files,
||| | but it would also result in a profile that is not
||| | forward-compatible with the V spec. This would produce another fracture in the software ecosystem.
|||
||| | To avoid such a fracture, there are two choices:
||| | (1) go with P instead
||| | (2) relax the V spec to allow smaller implementations
|||
||| | So the key question for this group is whether to relax the minimum
||| | VLEN to 32 or 64?
|||
||| | note: a possible justification for keeping 128 might be to
||| | recommend (1) instead. I don’t know anything about P, but it seems
||| | like it could be speced in a way that is competitive/comparable with Helium.
|||
||| | Guy
|||
||| | PS — I have started to design an “RVV-lite” profile which would be
||| | more amenable to embedded implementations. However, I have adopted
||| | a stance that it must remain forward compatible with the full V
||| | spec, so I have not considered VLEN below 128. I am happy to share
||| | my work on this and involve other contributors — email me if you would like to see a copy.
|||
||| | On Wed, Jun 2, 2021 at 3:15 AM Andrew Waterman <andrew@sifive.com> wrote:
|||
||| |     The uppercase-V V extension is meant to cater to apps processors, where
||| |     the VLEN >= 128 constraint is not inappropriate and is sometimes
||| |     beneficial.  But there's nothing fundamental about the ISA design that
||| |     prohibits VLEN < 128.  A minimal configuration is VLEN=ELEN=32, giving the
||| |     same total amount of state as MVE.  (And if you set LMUL=4, then you even
||| |     get the same shape: 8 registers of 128 bits apiece.)
|||
||| |     Such a thing wouldn't be called V, but perhaps something like Zvmin.
||| |     Other than agreeing on a feature set and assigning it a name, the
||| |     architecting is already done.
|||
||| |     (If you search the spec for Zfinx, you'll see that a Zfinx variant is
||| |     planned, but only barely sketched out.)
|||
||| |     On Wed, Jun 2, 2021 at 3:04 AM Tariq Kurd via lists.riscv.org <tariq.kurd=
||| |     huawei.com@lists.riscv.org> wrote:
|||
||| |         Hi everyone,
|||
||| |
|||
||| |         Are there any plans for a cut-down configuration of the vector
||| |         extension suitable for embedded cores? It seems that the 32x128-bit
||| |         register file is suitable for application class cores but is very
||| |         large for embedded cores, especially if the F registers also need
||| |         to be implemented (which I think is the case, unless a Zfinx
||| |         version is specified).
|||
||| |
|||
||| |         ARM MVE only has 8x128-bit registers for FP and Vector, so it is much
||| |         more suitable for embedded applications.
|||
||| |         https://en.wikichip.org/wiki/arm/helium
|||
||| |
|||
||| |         What’s the approach here? Should embedded applications implement the
||| |         P-extension instead?
|||
||| |
|||
||| |         Tariq
|||
||| |
|||
||| |         Tariq Kurd
||| |         Processor Design, RISC-V Cores, Bristol
||| |         E-mail: Tariq.Kurd@Huawei.com
||| |         Company: Huawei technologies R&D (UK) Ltd
||| |         Address: 290 Park Avenue, Aztec West, Almondsbury, Bristol, Avon,
||| |         BS32 4TR, UK
||| |         http://www.huawei.com
|||
||| |         This e-mail and its attachments contain confidential information from
||| |         HUAWEI, which is intended only for the person or entity whose address
||| |         is listed above. Any use of the information contained herein in any
||| |         way (including, but not limited to, total or partial disclosure,
||| |         reproduction, or dissemination) by persons other than the intended
||| |         recipient(s) is prohibited. If you receive this e-mail in error,
||| |         please notify the sender by phone or email immediately and delete it!
|||
||| |
|||
||| |

|


Re: Smaller embedded version of the Vector extension

Guy Lemieux
 

On Thu, Jun 3, 2021 at 1:08 PM Zalman Stern <zalman@google.com> wrote:

> If the minimum VLEN is at least 128 bits, one can translate NEON/SSE intrinsics directly without having to have every vector instruction dominated by a loop over the vector length.

that's pretty handy, actually. I'm not sure it should be a property of
the V spec itself; rather, software translated this way could require
an implementation with VLEN >= 128, and otherwise fall back to a
scalar translation.

for RVV, I was pretty comfortable with the requirement that RVV
require VLEN >= 128 before this whole thread started. it seemed like a
good length (4 x 32b words) which matched other SIMD instruction sets
as you have noted.

with this post, Tariq indicated that he wants to reduce the amount of
state. from this, I started to think it might be better to shorten
this to VLEN >= 64 or perhaps VLEN >= max(XLEN,FLEN) rather than
reducing the number of named registers [*]

Regarding performance, VLEN=32 or 64 seems ridiculously small until
you consider register grouping. The RVV-lite profile that I'm
proposing would require SEW/LMUL=8, so VLMAX=4 for VLEN=32, and
VLMAX=8 for VLEN=64. These are reasonable vector lengths to get
reasonable amounts of parallelism.
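
As a rough check of the numbers above: VLMAX follows from the RVV relation VLMAX = LMUL * VLEN / SEW, so fixing SEW/LMUL = 8 gives VLMAX = VLEN / 8 regardless of the element width chosen. A minimal Python sketch (the function name `vlmax` is made up for illustration and is not part of any spec or toolchain):

```python
from fractions import Fraction

def vlmax(vlen_bits, sew_bits, lmul):
    """Maximum elements per vector register group: VLMAX = LMUL * VLEN / SEW."""
    return int(Fraction(lmul) * vlen_bits / sew_bits)

# SEW/LMUL = 8 in every row, so VLMAX depends only on VLEN.
for vlen in (32, 64, 128):
    for sew, lmul in ((8, 1), (16, 2), (32, 4)):
        print(f"VLEN={vlen:3d} SEW={sew:2d} LMUL={lmul}: VLMAX={vlmax(vlen, sew, lmul)}")
```

Every row with the same VLEN prints the same VLMAX, which is the point of tying SEW and LMUL together.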


[*] why not just restrict small implementations to 16 or 8 named
registers with VLEN >= 128? it is a consequence of how RVV has chosen
to implement widening and narrowing instructions, which require using
register grouping. in my RVV-lite profile, I considered eliminating
register groups entirely, but this would require some other way to do
widening/narrowing which would not be compatible with RVV. with
SEW/LMUL=32/4, a common setting, there are only 8 vector registers
available. to save register file area, restricting this to just 4
vector registers seems too restrictive. instead, I think relaxing the
minimum to VLEN >= 64 achieves the same effect (halving the required
register file size) without requiring such a restriction.
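
To make the footnote's trade-off concrete, here is a small sketch comparing the two ways of halving the register file. It assumes the standard 32 architectural vector registers of RVV; the helper names are invented for this example.

```python
NUM_VREGS = 32  # architectural vector registers in RVV

def regfile_bits(vlen_bits, num_regs=NUM_VREGS):
    """Total vector register file state in bits."""
    return vlen_bits * num_regs

def register_groups(lmul, num_regs=NUM_VREGS):
    """Number of usable register groups at an integer LMUL."""
    return num_regs // lmul

baseline = regfile_bits(128)                  # 32 registers, VLEN=128
fewer_regs = regfile_bits(128, num_regs=16)   # halve the register count
shorter_vlen = regfile_bits(64)               # halve VLEN instead

print(baseline, fewer_regs, shorter_vlen)
print(register_groups(4))                # groups at LMUL=4 with 32 registers
print(register_groups(4, num_regs=16))   # groups at LMUL=4 with 16 registers
```

Both options halve the state, but only the shorter-VLEN option keeps 8 register groups at LMUL=4; cutting to 16 named registers leaves 4.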

Guy


Re: Smaller embedded version of the Vector extension

Zalman Stern
 

"...if written correctly" is precisely the point. If VLEN is specified as >=128, code that targets 128-bits explicitly by setting VL to an appropriate constant for a large swath *is* correct. This allows one to do basically what NEON/SSE do today as a baseline for performance.

Whether this is worthwhile or not may be debated, but insisting that everything should be completely vector length agnostic or it is broken is missing the point. Ideally there would be a lot more quantitative data on this, but I'm not going to tilt at that windmill right now. The worst case for the overhead of hardware vector length independence occurs at the smallest sizes as well.
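
The trade-off can be illustrated with a toy model in Python rather than real RVV code. It assumes vsetvl-style behaviour in which the granted vector length is min(remaining elements, VLMAX), which is a simplification, and it counts only loop iterations. `strip_count` is a hypothetical helper, not an existing API.

```python
def strip_count(n_elems, vlen_bits, sew_bits=32, lmul=1):
    """Count stripmine iterations needed to process n_elems elements."""
    vlmax = vlen_bits * lmul // sew_bits
    iterations = 0
    remaining = n_elems
    while remaining > 0:
        vl = min(remaining, vlmax)  # what a vsetvl-style request would grant
        remaining -= vl
        iterations += 1
    return iterations

# A 4 x 32-bit operation, the shape of one NEON/SSE register.
print(strip_count(4, vlen_bits=128))  # 1: a fixed VL=4 fits, no loop needed
print(strip_count(4, vlen_bits=64))   # 2: the loop is required
print(strip_count(4, vlen_bits=32))   # 4
```

With a guaranteed VLEN >= 128 the loop collapses to a single pass and can be omitted for fixed 128-bit work; below that, every such operation needs the loop.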

In general it's pretty dubious that the same set of fully lowered instruction bits can efficiently cover everything from the bottom of the embedded space to HPC. Ideally we'd be moving to more sophisticated lowering -- e.g. dynamic and multi-stage compilation -- rather than forcing the issue in the ISA design.

Another way to go would be to split 32-bit and 64-bit implementations such that VLEN >= 64 for 32-bit implementations and VLEN >= 128 for 64-bit ones. (Application code is rarely going to target 32-bit these days. Minimal embedded implementations are probably 32-bit.) Though truth be told, code likely needs a scalar fallback anyway unless the V extension is required (which it almost certainly won't be if we're talking about the embedded space). As such, VLEN not being large enough for the expectations the code was compiled with is the same as not having the vector unit.
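
A sketch of that dispatch logic, purely for illustration. The function names and the "vector"/"scalar" labels are hypothetical; a real program would detect the hardware through whatever mechanism its platform provides.

```python
def min_vlen_for(xlen_bits):
    """The split suggested above: VLEN >= 64 on 32-bit cores, >= 128 on 64-bit."""
    return 64 if xlen_bits == 32 else 128

def pick_kernel(has_vector, vlen_bits, compiled_min_vlen):
    """Use the vector kernel only if the hardware meets what the code assumed."""
    if has_vector and vlen_bits >= compiled_min_vlen:
        return "vector"
    return "scalar"

# Code compiled assuming 128-bit vectors, run on three different cores.
print(pick_kernel(True, 128, compiled_min_vlen=128))   # vector
print(pick_kernel(True, 64, compiled_min_vlen=128))    # scalar: VLEN too small
print(pick_kernel(False, 0, compiled_min_vlen=128))    # scalar: no vector unit
```

The last two cases take the same path, which is the sense in which a too-small VLEN is equivalent to having no vector unit.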

-Z-

On Thu, Jun 3, 2021 at 9:33 AM Tony Cole via lists.riscv.org <tony.cole=huawei.com@...> wrote:
Software should still work with VLEN>=64 if written correctly, as it should be VLEN agnostic.
Maybe it should be a recommendation that VLEN>=128, with a minimum of 64 for app processors?

Lower performance is an implementation cost/benefit decision.

Tony

-----Original Message-----
From: tech-vector-ext@... [mailto:tech-vector-ext@...] On Behalf Of Krste Asanovic
Sent: 03 June 2021 17:24
To: Guy Lemieux <guy.lemieux@...>
Cc: Andrew Waterman <andrew@...>; Tariq Kurd <tariq.kurd@...>; Shaofei (B) <shaofei1@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] Smaller embedded version of the Vector extension



> On Jun 3, 2021, at 9:16 AM, Guy Lemieux <guy.lemieux@...> wrote:
>
> What is the advantage to RVV requiring VLEN >= 128?
>
> I think this should be changed to VLEN >= 64 because:
>
> 1) VLEN = 64 is more likely for small implementations; creating a
> mandatory expectation to improve software portability

This is the requirement for app processors, which are not generally small cores.
Most competing SIMD extensions are at least 128b per vector register.

>
> 2) two implementations, each with VLEN >= 64, do not expose anything
> new to software that is not already exposed by VLEN >= 128
>
> 3) allowing VLEN =32 would expose something new to software (register
> file data layout when SEW=64)
>
> 4) are there any disadvantages to VLEN >= 64 (versus the current
> VLEN >= 128)? (I can't see any)

Lower performance on codes that work well on other app architectures.

Krste

>
> Guy
>
>
> On Wed, Jun 2, 2021 at 11:13 AM <krste@...> wrote:
>>
>>
>> The VLEN>=128 constraint is only for the application processor "V"
>> extension for the app profile - not for embedded vectors which can
>> have VLEN=32.
>>
>> From spec Introduction:
>> '
>> The term base vector extension is used informally to describe the standard set of vector ISA components that will be required for the single-letter "V" extension, which is intended for use in standard server and application-processor platform profiles. The set of mandatory instructions and supported element widths will vary with the base ISA (RV32I, RV64I) as described below.
>>
>> Other profiles, including embedded profiles, may choose to mandate only subsets of these extensions. The exact set of mandatory supported instructions for an implementation to be compliant with a given profile will only be determined when each profile spec is ratified. For convenience in defining subset profiles, vector instruction subsets are given ISA string names beginning with the "Zv" prefix.
>> '
>>
>> There is a set of Zve* names for the embedded subsets (see github issue
>> #550).
>>
>> A minimal embedded implementation using RV32E+Zfinx+vectors would have
>> the same state size as ARM MVE.
>>
>> The P extension does not have floating-point, but it makes sense as an
>> alternative for short integer/fixed-point SIMD.
>>
>> The software fragmentation issue is that some library routines that
>> expose VLEN might not be portable between app cores and embedded
>> cores, but these are different software ecosystems (e.g. ABI/calling
>> convention might be different) and only a few kinds of routine rely
>> on VLEN.
>>
>> For app cores that can afford VLEN>=128, the advantage is the removal
>> of stripmining code in cases that operate on fixed-size vectors.
>>
>> Krste
>>
>>
>>
>>>>>>> On Wed, 2 Jun 2021 05:10:32 -0700, "Guy Lemieux" <guy.lemieux@...> said:
>>
>> | Allowing VLEN<128 would allow for smaller vector register files,
>> | but it would also result in a profile that is not
>> | forward-compatible with the V spec. This would produce another fracture in the software ecosystem.
>>
>> | To avoid such a fracture, there are two choices:
>> | (1) go with P instead
>> | (2) relax the V spec to allow smaller implementations
>>
>> | So the key question for this group is whether to relax the minimum
>> | VLEN to 32 or 64?
>>
>> | note: a possible justification for keeping 128 might be to
>> | recommend (1) instead. I don’t know anything about P, but it seems
>> | like it could be speced in a way that is competitive/comparable with Helium.
>>
>> | Guy
>>
>> | PS — I have started to design an “RVV-lite” profile which would be
>> | more amenable to embedded implementations. However, I have adopted
>> | a stance that it must remain forward compatible with the full V
>> | spec, so I have not considered VLEN below 128. I am happy to share
>> | my work on this and involve other contributors — email me if you would like to see a copy.
>>
>> | On Wed, Jun 2, 2021 at 3:15 AM Andrew Waterman <andrew@...> wrote:
>>
>> |     The uppercase-V V extension is meant to cater to apps processors, where
>> |     the VLEN >= 128 constraint is not inappropriate and is sometimes
>> |     beneficial.  But there's nothing fundamental about the ISA design that
>> |     prohibits VLEN < 128.  A minimal configuration is VLEN=ELEN=32, giving the
>> |     same total amount of state as MVE.  (And if you set LMUL=4, then you even
>> |     get the same shape: 8 registers of 128 bits apiece.)
>>
>> |     Such a thing wouldn't be called V, but perhaps something like Zvmin.
>> |     Other than agreeing on a feature set and assigning it a name, the
>> |     architecting is already done.
>>
>> |     (If you search the spec for Zfinx, you'll see that a Zfinx variant is
>> |     planned, but only barely sketched out.)
>>
>> |     On Wed, Jun 2, 2021 at 3:04 AM Tariq Kurd via lists.riscv.org <tariq.kurd=
>> |     huawei.com@...> wrote:
>>
>> |         Hi everyone,
>>
>> |
>>
>> |         Are there any plans for a cut-down configuration of the vector
>> |         extension suitable for embedded cores? It seems that the 32x128-bit
>> |         register file is suitable for application class cores but is very
>> |         large for embedded cores, especially if the F registers also need
>> |         to be implemented (which I think is the case, unless a Zfinx
>> |         version is specified).
>>
>> |
>>
>> |         ARM MVE only has 8x128-bit registers for FP and Vector, so it is much
>> |         more suitable for embedded applications.
>>
>> |         https://en.wikichip.org/wiki/arm/helium
>>
>> |
>>
>> |         What’s the approach here? Should embedded applications implement the
>> |         P-extension instead?
>>
>> |
>>
>> |         Tariq
>>
>> |
>>










