On Mon, Nov 22, 2021 at 5:23 PM 洛佳 Luo Jia <me@...> wrote:
RISC-V SBI provides an interface via environment call to provide platform-agnostic features, which is similar to an operating system's syscalls. According to the current SBI specification, this environment call is the same as a RISC-V function call except that (1) the control transfer instruction is `ecall` rather than `call`, and (2) the extension ID and function ID (if applicable) should be stored in `a7` and `a6` respectively. Current SBI implementations implement this feature with trap handlers, which for now always store all callee-saved registers after the stack swap and then jump to the actual handler function, which follows the RISC-V standard function call ABI to perform the SBI feature.
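As a minimal sketch of the convention described above, in C: only the a0/a1/a6/a7 register roles, the `ecall` instruction, `struct sbiret`, and the Base extension numbers (EID 0x10, FID 0, error code -2) come from the SBI specification. The names `sbi_ecall` and `sbi_handle_model`, the two-argument signature, and the host-side model used on non-RISC-V builds are assumptions made for illustration and do not describe any particular firmware.

```c
#include <assert.h>
#include <stdint.h>

/* Return pair as defined by the SBI spec: a0 = error code, a1 = value. */
struct sbiret {
    long error;
    long value;
};
static_assert(sizeof(struct sbiret) == 2 * sizeof(long),
              "sbiret maps onto the a0/a1 register pair");

#define SBI_SUCCESS            0
#define SBI_ERR_NOT_SUPPORTED (-2)
#define SBI_EXT_BASE           0x10  /* Base extension ID */
#define SBI_BASE_GET_SPEC_VER  0     /* FID 0 in the Base extension */

/* Hypothetical host-side model of the firmware's dispatch step: it only
 * looks at what the caller placed in a7 (eid) and a6 (fid). */
static inline struct sbiret sbi_handle_model(long eid, long fid,
                                             long arg0, long arg1)
{
    (void)arg0;
    (void)arg1;
    if (eid == SBI_EXT_BASE && fid == SBI_BASE_GET_SPEC_VER) {
        /* Major in bits 30:24, minor in bits 23:0; this models v0.3. */
        return (struct sbiret){ SBI_SUCCESS, (0L << 24) | 3L };
    }
    return (struct sbiret){ SBI_ERR_NOT_SUPPORTED, 0 };
}

/* Caller side: arguments in a0/a1, FID in a6, EID in a7, then `ecall`. */
static inline struct sbiret sbi_ecall(long eid, long fid,
                                      long arg0, long arg1)
{
#if defined(__riscv)
    register long a0 __asm__("a0") = arg0;
    register long a1 __asm__("a1") = arg1;
    register long a6 __asm__("a6") = fid;
    register long a7 __asm__("a7") = eid;
    __asm__ volatile("ecall"
                     : "+r"(a0), "+r"(a1)
                     : "r"(a6), "r"(a7)
                     : "memory");
    return (struct sbiret){ a0, a1 };
#else
    /* Not on RISC-V: forward to the illustrative model instead. */
    return sbi_handle_model(eid, fid, arg0, arg1);
#endif
}
```

Note that the inline assembly lists only a0, a1, a6 and a7: the caller relies on every other register surviving the call, which is exactly the part the specification leaves implicit.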
This process is conventional but can lose some performance when we target extreme performance. It always saves all callee-saved registers whether or not the handler function needs to use them. When the handler does not use the saved registers, this results in redundant register saves and reloads.
If possible, should we define a complete SBI-call calling convention to solve this problem, with the following features:
- caller always stores parameters in the a0-a7 registers without touching the stack, even if the parameter count is greater than two
- caller transfers control using the `ecall` instruction
- callee saves all registers other than a0-a7, x0, sp, tp, gp if the function uses them as temporary values, so that for compatibility the caller does not need to save them (specifically, ra and t0-t6, if used, should be saved and restored by the callee in this calling convention)
- callee will use the `ret` instruction to return to the interrupt handler wrapper (any other good ideas on this?)
- caller receives return values in the a0-a1 registers; return values do not come from the stack
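To make the intended saving concrete, here is a small counting model in C of the rules listed above. It is only a sketch: the register numbers are the standard RISC-V ABI ones, but the helper names (`proposal_preserved_mask`, `saves_under_proposal`, `saves_when_saving_everything`) are hypothetical, the "save everything" baseline is assumed to be x1..x31, and the model counts register saves only; it says nothing about actual cycle cost.

```c
#include <assert.h>
#include <stdint.h>

/* RISC-V integer register numbers (standard ABI names). */
enum { RA = 1, SP = 2, GP = 3, TP = 4, T0 = 5, T1 = 6,
       A0 = 10, A1 = 11, A7 = 17 };

#define REG(n) (UINT32_C(1) << (n))

/* Mask with bits lo..hi (inclusive) set. */
static inline uint32_t range_mask(unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 32);
    uint32_t m = 0;
    for (unsigned i = lo; i <= hi; i++)
        m |= REG(i);
    return m;
}

/* Registers the callee must preserve if it uses them, per the proposal:
 * everything except x0, sp, gp, tp and a0-a7. */
static inline uint32_t proposal_preserved_mask(void)
{
    uint32_t excluded = REG(0) | REG(SP) | REG(GP) | REG(TP)
                      | range_mask(A0, A7);
    return ~excluded;
}

/* Number of registers named in a mask. */
static inline unsigned count_regs(uint32_t mask)
{
    unsigned n = 0;
    for (; mask != 0; mask >>= 1)
        n += mask & 1u;
    return n;
}

/* Saves needed when only the used, preserved registers are stored. */
static inline unsigned saves_under_proposal(uint32_t used)
{
    return count_regs(used & proposal_preserved_mask());
}

/* Assumed baseline: a trap handler that stores x1..x31 unconditionally. */
static inline unsigned saves_when_saving_everything(void)
{
    return 31;
}
```

For a handler that touches ra, t0, t1, a0 and a1, `saves_under_proposal` gives 3 (ra, t0, t1) against 31 for the assumed baseline, and a handler that only touches a0-a1 needs no saves at all.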
Another issue comes from the incompleteness of the current definition of the SBI calling convention. The version 0.3 SBI spec only defines how to transfer control and fill in parameters, but how other registers should be saved remains undefined, as mentioned in https://github.com/riscv-non-isa/riscv-sbi-doc/issues/35 . Making the definition complete would reduce ambiguity, and it would give developers a standard to refer to when implementing SBI calls in higher-level programming languages like Rust or C.
To give an example of how code is compiled under this calling convention, I wrote a gist for this idea: https://gist.github.com/luojia65/e6a31563d99aa566a8c61ffd6bbc2d70 .
By saving only the registers that are used, we avoid always having to save and reload all registers. This saves time when we need extreme performance on SBI calls. If possible, this SBI-call calling convention should be added to the RISC-V SBI specification to make the current SBI-call definition complete.
The SBI calling convention only defines uses of a0-a7 registers for
parameter passing and return value because all other registers are
assumed to be saved/restored (i.e. preserved) by the SBI
implementation (M-mode firmware or Hypervisor). This is based on the
following rationale:
1) The SBI caller could be really anything. It could be some
bare-metal application with its own register calling convention or it
could be some proprietary OS with proprietary ABI. This means SBI
implementation cannot make any assumptions about the callers (i.e. SBI
2) Most SBI implementations (OpenSBI, BBL, etc) are also responsible
for emulating missing HW functionality (such as emulating misaligned
load/store, time CSR, htimedelta CSR, CMO instructions) and the
trapping instructions could be touching any register (tX or aX or sX).
Further in RISC-V, we have the same mtvec CSR for handling SBI calls,
interrupts, misaligned load/store, and illegal instructions so SBI
implementations need trap handlers to save all registers anyway. This
means SBI calling convention has to be aligned with this aspect of
RISC-V and ensure that all registers are preserved by SBI
implementations.
Some aspects of point #2 above also apply to the Linux kernel, and
that's why we have handle_exception() in <linux>/arch/riscv/kernel/entry.S
preserving all registers.
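To illustrate point 2, here is a minimal sketch in C of why an
emulating trap handler needs the whole register file in its saved
frame. The rd field position (bits 11:7) is the standard RISC-V
encoding; the `trap_regs` layout and the function names are
hypothetical and are not taken from OpenSBI, BBL, or Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trap frame: one slot per integer register x0..x31. */
struct trap_regs {
    unsigned long x[32];
};

/* Destination register field of a 32-bit RISC-V instruction. */
static inline unsigned insn_rd(uint32_t insn)
{
    return (insn >> 7) & 0x1f;
}

/* Final step of emulating a trapped load: write the loaded value into
 * whichever register the trapping instruction named. That register can
 * be a tX, aX, or sX register, so the frame must cover all of them and
 * every slot must be restored on return. */
static inline void emulate_load_result(struct trap_regs *regs,
                                       uint32_t insn,
                                       unsigned long value)
{
    assert(regs != NULL);
    unsigned rd = insn_rd(insn);
    if (rd != 0)            /* x0 is hard-wired to zero */
        regs->x[rd] = value;
}
```

For example, `lw t3, 0(a0)` encodes as 0x00052E03, so the emulated
result lands in slot 28 (t3), a register that a plain SBI call would
never need to touch.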
Regarding the performance of SBI calls, we have the following options at the moment:
1) It is very well known that SBI IPI, Timer, and RFENCE calls can
easily be in the hot-path of many use cases/workloads. The upcoming
Sstc, AIA, and ACLINT specifications allow RISC-V platforms to totally
avoid SBI IPI, Timer, and RFENCE calls so it is up to RISC-V platforms
to decide how much they want to depend on SBI calls for fundamental HW
functionality like IPI, Timer, Rfence, etc.
2) In future, we might have standard instructions to load/store
multiple registers and SBI implementations can use such instructions
to optimize their low-level/assembly trap handlers.
Based on the above points, I think the SBI calling convention does the
right thing by only touching the a0-a7 registers and forcing SBI
implementations to preserve the rest of the registers.