Re: Vector TG meeting minutes 2020/4/03


Thang Tran
 

In scalar code, load data always goes through alignment and sign/zero extension. I do not see a difference with vector load/store. If alignment is needed, sign/zero extension adds little cost, and an extra pipeline stage is added.

 

Depending on how the load is pipelined, there may be no load-to-use penalty. So widening loads are much preferred in our design.

 

Thanks, Thang

 

From: tech-vector-ext@... [mailto:tech-vector-ext@...] On Behalf Of Alex Solomatnikov
Sent: Saturday, April 4, 2020 7:09 PM
To: Thang Tran <thang@...>
Cc: Nick Knight <nick.knight@...>; Krste Asanovic <krste@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] Vector TG meeting minutes 2020/4/03

 

 

Bob Dreyer said he would share example code.

 

Do you really have a 2x or 4x wider write port to the vector register file to make vlb and the like work at full memory bandwidth?

 

If yes, what is the impact on PPA, i.e. clock frequency, area, power?

 

If not, then an extra widening instruction would not matter, because vlb itself is the bottleneck.
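The arithmetic behind this question can be sketched roughly as follows. This is a simplified model, not a description of any real design: the function names and the 16 bytes/cycle figures are hypothetical, and it assumes the only limits are memory bandwidth and register-file write-port width.

```python
def required_write_port_bytes(mem_bytes_per_cycle, src_bits, dst_bits):
    """Write-port width (bytes/cycle) needed so a widening load keeps up with memory."""
    if dst_bits % src_bits != 0:
        raise ValueError("dst_bits must be a multiple of src_bits")
    return mem_bytes_per_cycle * (dst_bits // src_bits)


def sustained_load_bytes_per_cycle(mem_bytes_per_cycle, write_port_bytes,
                                   src_bits, dst_bits):
    """Memory bytes consumed per cycle when the write port may be the limit."""
    ratio = dst_bits // src_bits
    # Each source byte becomes `ratio` bytes in the register file.
    return min(mem_bytes_per_cycle, write_port_bytes / ratio)


# Hypothetical machine: 16 bytes/cycle from memory, bytes widened to 32 bits.
print(required_write_port_bytes(16, 8, 32))            # 64
print(sustained_load_bytes_per_cycle(16, 16, 8, 32))   # 4.0
```

Under these assumptions, a byte load widened to 32-bit elements needs a 4x wider write port to run at full memory bandwidth; with an unwidened port it runs at a quarter of that bandwidth.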

 

Alex

 

On Sat, Apr 4, 2020 at 5:19 PM Thang Tran <thang@...> wrote:

Hi Nick,

 

It is confidential customer application code.

 

Thanks, Thang

 

From: Nick Knight [mailto:nick.knight@...]
Sent: Saturday, April 4, 2020 5:04 PM
To: Thang Tran <thang@...>
Cc: Krste Asanovic <krste@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] Vector TG meeting minutes 2020/4/03

 

Hi Thang,

 

Can you, and anyone else who responds, please be concrete about the applications you have in mind? I tried to do so in my email.

 

In my opinion, concrete examples are crucial to making an informed decision. I hope you agree.

 

Best,

Nick Knight

 

On Sat, Apr 4, 2020 at 4:56 PM Thang Tran <thang@...> wrote:

There is real application code (mixed integer/FP, using convert instructions) written with load/store byte/halfword/word. Adding a widening instruction has a huge performance impact in a small critical loop, where every additional instruction costs more than 10% in performance.
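As a rough illustration of why a single added instruction can matter this much, the sketch below assumes a simple machine that issues one instruction per cycle with no overlap, so loop time is proportional to instruction count. The function name and the loop sizes are hypothetical and are not taken from the customer code.

```python
def extra_instruction_overhead(loop_instructions, extra_instructions=1):
    """Fractional slowdown from adding instructions to a loop body.

    Assumes one instruction per cycle and no overlap between instructions.
    """
    if loop_instructions <= 0:
        raise ValueError("loop_instructions must be positive")
    return extra_instructions / loop_instructions


for n in (5, 8, 10, 20):
    print(n, f"{extra_instruction_overhead(n):.1%}")
```

Under this model, one extra instruction costs more than 10% only when the loop body has fewer than 10 instructions, which matches the "small critical loop" case described above.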

I am strongly against dropping the byte/halfword/word vector load/stores.

Thanks, Thang

-----Original Message-----
From: tech-vector-ext@... [mailto:tech-vector-ext@...] On Behalf Of Krste Asanovic
Sent: Saturday, April 4, 2020 1:43 PM
To: tech-vector-ext@...
Subject: [RISC-V] [tech-vector-ext] Vector TG meeting minutes 2020/4/03


Date: 2020/4/03
Task Group: Vector Extension
Chair: Krste Asanovic
Number of Attendees: ~15
Current issues on github: https://github.com/riscv/riscv-v-spec

Issues discussed: #354/362

The following issues were discussed.

Closing on version v0.9. A list of proposed changes to form v0.9 was presented.  The main dispute was around dropping byte/halfword/word vector load/stores.

#354/362 Drop byte/halfword/word vector load/stores

Most of the meeting time was spent discussing this issue, which was contentious.

Participants in favor of retaining these instructions were concerned about the code size and performance impact of dropping them.
Proponents of dropping them noted that the main impact was only on integer code (floating-point code does not benefit from these instructions), that performance might be lower using these instructions rather than widening, and that there was a large benefit in reducing memory pipeline complexity.  The group will consider examples to be supplied by members, including some mixed floating-point/integer code.
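The two alternatives under discussion can be modelled in scalar terms as below. This is only a functional sketch with hypothetical helper names: it shows that a byte load which sign-extends as it loads produces the same values as a plain load followed by a separate widening step, and says nothing about their relative cost in hardware.

```python
def sign_extend(value, from_bits):
    """Interpret the low `from_bits` of value as a two's-complement integer."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    return value - (1 << from_bits) if value & sign_bit else value


def fused_widening_load(memory, count):
    """Model of a byte load that sign-extends each element while loading."""
    return [sign_extend(b, 8) for b in memory[:count]]


def plain_load(memory, count):
    """Model of a load that returns raw bytes with no extension."""
    return list(memory[:count])


def widen_signed(elements, from_bits=8):
    """Model of a separate widening (sign-extension) instruction."""
    return [sign_extend(e, from_bits) for e in elements]


memory = bytes([0x01, 0x7F, 0x80, 0xFF])
fused = fused_widening_load(memory, 4)
split = widen_signed(plain_load(memory, 4))
print(fused)           # [1, 127, -128, -1]
print(fused == split)  # True
```

The results are identical; the disagreement in the meeting is about the extra instruction in the second form versus the memory pipeline complexity of the first.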

Discussion to continue on mailing list.



