Re: Calling Convention for Vector ?


Andrew Waterman
 

Providing callee-saved vector registers in the regular C calling convention might actually degrade performance, as most vector computation is done in leaf functions or in strip-mine loops that don't call functions.  Functions that want to use all the vector registers will have to spill some callee-saved registers, even if the callee-saved registers aren't providing much benefit.
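To put a rough number on that spill cost, here is a minimal sketch. The helper name, the count of eight callee-saved registers, and the VLEN values are assumptions chosen for illustration; none of them comes from a ratified ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes a function must store (and later reload)
   to preserve num_saved callee-saved vector registers, each vlen_bits
   bits wide. VLEN is implementation-defined, so it is a parameter. */
static size_t vector_spill_bytes(unsigned num_saved, unsigned vlen_bits)
{
    assert(vlen_bits % 8 == 0);
    return (size_t)num_saved * (vlen_bits / 8);
}
```

Under these assumptions, a function that wants all 32 registers and must preserve 8 of them moves `vector_spill_bytes(8, 512)` = 512 bytes out to the stack and the same amount back in on a VLEN=512 machine, whether or not its caller had live values in those registers.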

By contrast, vector millicode routines (such as element-wise transcendentals) would likely benefit from an alternate calling convention that has some callee-saved vector registers.


On Mon, Jan 13, 2020 at 12:35 AM 戎杰杰 <jiejie.rjj@...> wrote:
Hi,

 We ran into some of the problems you mention as well.

Since some code will want arguments in vector registers, we studied the SVE
 vector register layout and configured our RISC-V vector register layout as follows:

 | Register | ABI name | Description                      | Saver  |
 | v0-7     | v0-7     | Temporaries                      | Caller |
 | v8-15    | v8-15    | Function arguments/return values | Caller |
 | v16-23   | v16-23   | Function arguments               | Caller |
 | v24-31   | v24-31   | Saved registers                  | Callee |

 This layout accommodates constraints such as the v0 mask register,
 and lets us use 16 registers for two arguments at LMUL=8.
We can draft a proposal to improve the calling convention with arguments in vector registers :)
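As a rough illustration, the proposed layout above can be written as a lookup. This is only a sketch of the table as posted; the enum and function names are hypothetical and do not come from any ABI document.

```c
#include <assert.h>

/* Roles taken from the proposed layout in this message. */
enum vreg_role {
    VREG_TEMP,        /* v0-7:   temporaries, caller-saved */
    VREG_ARG_OR_RET,  /* v8-15:  arguments/return values, caller-saved */
    VREG_ARG,         /* v16-23: arguments, caller-saved */
    VREG_SAVED        /* v24-31: saved registers, callee-saved */
};

/* Hypothetical helper: map a vector register number (0..31) to its role. */
static enum vreg_role proposed_vreg_role(int vreg)
{
    assert(vreg >= 0 && vreg < 32);
    if (vreg < 8)  return VREG_TEMP;
    if (vreg < 16) return VREG_ARG_OR_RET;
    if (vreg < 24) return VREG_ARG;
    return VREG_SAVED;
}

/* Only the last block of eight registers is preserved across calls. */
static int proposed_vreg_is_callee_saved(int vreg)
{
    return proposed_vreg_role(vreg) == VREG_SAVED;
}
```

Each block of eight registers starts at a multiple of 8, so every block can hold one LMUL=8 register group.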


--Jojo
On Dec 28, 2019 at 12:12 AM +0800, Jim Wilson <jimw@...> wrote:
On Thu, Dec 26, 2019 at 2:01 PM Earl Killian <earl.killian@...> wrote:
Vectors are passed in memory and returned in memory. Vectors are arbitrary length, whereas the vector registers are fixed length, and can only be used to temporarily hold a portion of a memory vector. Thus it doesn’t make sense to pass or return things in vector registers, or to have the registers saved or restored as part of the calling convention.

Some code will not want args in vector regs, so that we don't have to
save/restore them around calls. Some code will want args in vector
regs, so that they can have subroutines that operate on vectors. If
you have already loaded part of a vector into a vector register, it is
silly to send it back to memory just so you can call a function that
reads it back in. It is better to leave it in a register to reduce
memory bandwidth. So we need two calling conventions. Or
alternatively, one calling convention with optional vector support
that can be enabled only when needed. If you look at ARM SVE, you
will see that this is what they have done.

I think this is more complicated for RVV though, as we have LMUL up to
8, which means we need 16 registers worst case for two arguments,
which will have to be v8-v15 or v16-v23 or v24-v31 because of alignment
issues. Plus we need v0 for an optional mask so we can't use v1-v7
for arguments. And vlen will have to be an implicit argument.
Someone will have to spend time doing experiments to see how well this
works in practice to make sure it is reasonable. And we will need a
reasonable compiler first before we can do experiments, which we don't
really have yet, and may not have for a while. Not to mention
hardware to test on. I think it will be a while before we can
formally specify a vector calling convention.
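The alignment constraint described above can be sketched as follows. This assumes that a register group's base number must be a multiple of LMUL and that the group containing v0 is kept free for the mask; the function names are hypothetical.

```c
#include <assert.h>

#define NUM_VREGS 32

/* A group of LMUL registers must start at a register number that is
   a multiple of LMUL and must fit within v0..v31. */
static int is_valid_group_base(int vreg, int lmul)
{
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    return vreg >= 0 && vreg + lmul <= NUM_VREGS && vreg % lmul == 0;
}

/* Count the groups left for arguments once the group that contains v0
   is set aside for the optional mask. */
static int count_arg_groups(int lmul)
{
    int count = 0;
    for (int base = lmul; base < NUM_VREGS; base += lmul) {
        if (is_valid_group_base(base, lmul))
            count++;
    }
    return count;
}
```

At LMUL=8 this leaves three groups (bases v8, v16, v24), so two LMUL=8 arguments already take two of the three.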

Jim
