Re: Vector TG meeting minutes 2020/4/03

Krste Asanovic

These are basic operations, not application kernels.

It's easy to call out missing instructions when considering individual

It's more important to gather and evaluate actual application kernels.


On Sat, 4 Apr 2020 23:25:13 -0400, "David Horner" <ds2horner@...> said:
| I agree Nick.
| So here is a suggestion, not completely facetiously:

| For load byte/half/word

| example when SEW = 64

| An implementation can optimize the sequence

| strided load by 1/2/4

| shift left 56/48/32

| arith right 56/48/32

| but a sign extend byte/half/word to SEW would make fusing/chaining simpler.

| And these without widening.
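
[Editor's sketch, not part of the original message.] The load sequence suggested above (load a full SEW=64 element at a byte/half/word stride, shift left by 56/48/32, then arithmetic shift right by the same amount) can be modelled on a single element as follows. This is a minimal scalar model in Python, not vector code; the function name `sign_extend_by_shifts` and its parameters are hypothetical and exist only for illustration.

```python
def sign_extend_by_shifts(raw, width_bits, sew=64):
    """Model one SEW-bit element: shift left, then arithmetic shift right.

    raw        -- the SEW-bit value loaded from memory; bits above
                  width_bits may hold unrelated neighbouring bytes
    width_bits -- 8, 16 or 32 (byte/half/word)
    """
    shift = sew - width_bits            # 56/48/32 when sew == 64
    mask = (1 << sew) - 1

    # Logical shift left inside a SEW-bit register: discards the upper garbage.
    shifted = (raw << shift) & mask

    # Reinterpret the SEW-bit pattern as a two's-complement signed value.
    if shifted >> (sew - 1):
        shifted -= 1 << sew

    # Python's >> on a negative int is an arithmetic (sign-propagating) shift.
    return shifted >> shift


if __name__ == "__main__":
    # The low byte is 0x80; the upper 56 bits are neighbouring data.
    print(sign_extend_by_shifts(0x1234567890ABCD80, 8))
    print(sign_extend_by_shifts(0x7FFF, 16))
```

The left shift is what removes the neighbouring bytes picked up by the over-wide strided load, which is why the sequence needs two shifts rather than one; a dedicated sign-extend-to-SEW instruction would fold both into a single operation.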

| For stores:

| a “pack” SEW (of byte/half/word) instruction by SLEN into appropriate LMUL=1/8, 1/4 or 1/2 would allow standard unit strided store to work.

| A fractional LMUL that uses interleave (rather than right justified SLEN chunks) would not need this pack instruction.

| On 2020-04-04 8:04 p.m., Nick Knight wrote:

| Hi Thang,

| Can you, and anyone else who responds, please be concrete about the applications you have in mind? I tried to do so in my email.

| In my opinion, concrete examples are crucial to making an informed decision. I hope you agree.

| Best,
| Nick Knight

| On Sat, Apr 4, 2020 at 4:56 PM Thang Tran <thang@...> wrote:

| There are real application codes (mixed integer/FP, where a convert instruction is used) written with byte/halfword/word loads and stores. Adding a widening instruction has a huge performance impact in a small
| critical loop, where every additional instruction costs > 10% in performance.

| I am strongly against dropping the byte/halfword/word for load/store.

| Thanks, Thang

| -----Original Message-----
| From: tech-vector-ext@... [mailto:tech-vector-ext@...] On Behalf Of Krste Asanovic
| Sent: Saturday, April 4, 2020 1:43 PM
| To: tech-vector-ext@...
| Subject: [RISC-V] [tech-vector-ext] Vector TG meeting minutes 2020/4/03

| Date: 2020/4/03
| Task Group: Vector Extension
| Chair: Krste Asanovic
| Number of Attendees: ~15
| Current issues on github:

| Issues discussed: #354/362

| The following issues were discussed.

| Closing on version v0.9. A list of proposed changes to form v0.9 was presented. The main dispute was around dropping byte/halfword/word vector load/stores.

| #354/362 Drop byte/halfword/word vector load/stores

| Most of the meeting time was spent discussing this issue, which was contentious.

| Participants in favor of retaining these instructions were concerned about the code size and performance impact of dropping them.
| Proponents of dropping them noted that the main impact was only on integer code (floating-point code does not benefit from these instructions), that performance might be lower using these
| instructions rather than widening, and that there was a large benefit in reducing memory pipeline complexity. The group was going to consider some examples to be supplied by the members, including some mixed
| floating-point/integer code.

| Discussion to continue on mailing list.

