Proposal for accelerating nested virtualization on RISC-V


Anup Patel
 

A clarification is required in the RISC-V H-extension spec regarding
the scope of the HSTATUS.VTVM bit. Currently, as per the spec, all
virtual memory management instructions (both SFENCEs and HFENCEs) trap
to HS-mode when HSTATUS.VTVM == 1 and V == 1. Only SFENCEs need to
trap to HS-mode when HSTATUS.VTVM == 1 and V == 1, because HFENCEs are
only defined for HS-mode (i.e. V == 0).
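
The difference between the two readings can be written as a pair of
predicates. This is only a sketch of the paragraph above, not spec
text: the function names are made up for illustration, and it models
only the trap caused by HSTATUS.VTVM, not any other exception an
instruction may raise.

```python
SFENCES = {"sfence.vma"}
HFENCES = {"hfence.vvma", "hfence.gvma"}


def vtvm_trap_current_reading(insn: str, vtvm: bool, v: bool) -> bool:
    """Current wording as read above: SFENCEs and HFENCEs both trap."""
    return vtvm and v and insn in (SFENCES | HFENCES)


def vtvm_trap_requested(insn: str, vtvm: bool, v: bool) -> bool:
    """Requested clarification: only SFENCEs trap because of VTVM."""
    return vtvm and v and insn in SFENCES
```

The two predicates differ only for HFENCEs executed with
HSTATUS.VTVM == 1 and V == 1.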

To better describe nested virtualization, we define the following dummy
privilege modes:
1. Host HS-mode
a) Host hypervisor kernel will run in this mode
b) Software in this mode will actually run in HW HS-mode
2. Host U-mode
a) Host hypervisor user-space will run in this mode
b) Software in this mode will actually run in HW U-mode
3. Guest HS-mode
a) Guest hypervisor kernel will run in this mode
b) Software in this mode will actually run in HW VS-mode
4. Guest U-mode
a) Guest hypervisor user-space will run in this mode
b) Software in this mode will actually run in HW VU-mode
5. Guest VS-mode
a) Software in this mode will actually run in HW VS-mode
6. Guest VU-mode
a) Software in this mode will actually run in HW VU-mode
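
For quick reference, the mapping above can be written as a lookup
table. This is just a restatement of the list; the names HW_MODE and
shares_hw_mode are illustrative and not part of any spec.

```python
# Dummy privilege mode -> hardware privilege mode it actually runs in.
HW_MODE = {
    "Host HS-mode": "HS",
    "Host U-mode": "U",
    "Guest HS-mode": "VS",
    "Guest U-mode": "VU",
    "Guest VS-mode": "VS",
    "Guest VU-mode": "VU",
}


def shares_hw_mode(mode_a: str, mode_b: str) -> bool:
    """True if two dummy modes run in the same hardware mode."""
    return HW_MODE[mode_a] == HW_MODE[mode_b]
```

Guest HS-mode and Guest VS-mode both land in HW VS-mode (and Guest
U-mode and Guest VU-mode both in HW VU-mode), which is why the Host
hypervisor has to multiplex them in software.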

A high-level software approach for nested virtualization in RISC-V
can be as follows:
1. The Host HS-mode (Host hypervisor) will enable HSTATUS.VTSR to
emulate the SRET instruction for the Guest. This emulation will
involve a CSR world-switch when switching between Guest HS/U-mode
and Guest VS/VU-mode.
2. Virtual interrupts will be injected into Guest VS/VU-mode after
doing the CSR world-switch (point 1 above) from Guest HS/U-mode
to Guest VS/VU-mode.
3. All accesses to "h<xyz>" and "vs<xyz>" CSRs from the Guest will
trap to Host HS-mode (Host hypervisor) where:
a) These CSRs will be emulated for Guest HS-mode
b) For Guest U-mode and Guest VS/VU-mode, the trap will
be forwarded to Guest HS-mode
4. The Host HS-mode (Host hypervisor) will manage two Stage2 page
tables:
a) Regular Stage2 page table for Guest HS/U-mode
b) Shadow Stage2 page table for Guest VS/VU-mode. The Host HS-mode
(Host hypervisor) will have to do a software walk of the Guest
HS-mode HGATP page table when populating mappings in the Shadow
Stage2 page table, so its mappings are the combined effect of the
Guest HS-mode HGATP page table and the Regular Stage2 page table.
5. All HFENCEs will trap to Host HS-mode where the Host HS-mode
(Host hypervisor) will:
a) Trap-n-emulate HFENCE.VVMA and HFENCE.GVMA for Guest HS-mode
b) Redirect HFENCE.VVMA and HFENCE.GVMA traps from Guest VS-mode
to Guest HS-mode irrespective of Guest HS-mode HSTATUS.VTVM
6. All HLV/HSV instructions from Guest HS/U-mode and Guest VS/VU-mode
will trap to Host HS-mode (Host hypervisor) where:
a) HLV/HSV instruction from Guest HS/U-mode will be emulated
by Host HS-mode (Host hypervisor)
b) HLV/HSV instruction from Guest VS/VU-mode will be forwarded
to Guest HS-mode by Host HS-mode (Host hypervisor)
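
The emulate-versus-forward decisions in points 3, 5 and 6 above can be
summarised as a routing table. This is a minimal sketch: the trap kinds
("csr", "hfence", "hlv_hsv"), the Action enum and route_trap() are
names invented here, and combinations that the list does not mention
(for example HFENCEs from Guest U-mode) are deliberately left out
rather than guessed.

```python
from enum import Enum


class Action(Enum):
    EMULATE = "emulate in Host HS-mode"
    FORWARD = "forward to Guest HS-mode"


# (trap kind, dummy mode that caused the trap) -> Host hypervisor action
ROUTES = {
    ("csr", "Guest HS-mode"): Action.EMULATE,      # point 3a
    ("csr", "Guest U-mode"): Action.FORWARD,       # point 3b
    ("csr", "Guest VS-mode"): Action.FORWARD,      # point 3b
    ("csr", "Guest VU-mode"): Action.FORWARD,      # point 3b
    ("hfence", "Guest HS-mode"): Action.EMULATE,   # point 5a
    ("hfence", "Guest VS-mode"): Action.FORWARD,   # point 5b
    ("hlv_hsv", "Guest HS-mode"): Action.EMULATE,  # point 6a
    ("hlv_hsv", "Guest U-mode"): Action.EMULATE,   # point 6a
    ("hlv_hsv", "Guest VS-mode"): Action.FORWARD,  # point 6b
    ("hlv_hsv", "Guest VU-mode"): Action.FORWARD,  # point 6b
}


def route_trap(kind: str, origin: str) -> Action:
    """Return the Host HS-mode action for a trap, per the list above."""
    try:
        return ROUTES[(kind, origin)]
    except KeyError:
        raise ValueError(f"case not covered above: {kind} from {origin}") from None
```

Since Guest HS-mode and Guest VS-mode both run in HW VS-mode, the Host
hypervisor needs its own record of which one is currently running to
pick the origin.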

Please point out any case that is not covered by the above high-level
software approach for nested virtualization.
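
To make point 4 more concrete, here is a toy model of how a Shadow
Stage2 entry combines the two translations. It is only a sketch under
strong assumptions: each page table is a flat dict from page number to
page number, pages are 4 KiB, permissions are ignored, and what happens
on a miss is not modelled (the function just returns None). All names
are illustrative.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(table: dict, addr: int):
    """Translate addr through a toy page table; None if unmapped."""
    frame = table.get(addr >> PAGE_SHIFT)
    if frame is None:
        return None
    return (frame << PAGE_SHIFT) | (addr & PAGE_MASK)


def shadow_fill(shadow: dict, guest_hgatp: dict, regular_stage2: dict, addr: int):
    """Install the shadow mapping for addr and return the host address.

    addr is a guest physical address as seen from Guest VS/VU-mode.
    """
    # Software walk of the Guest HS-mode HGATP page table.
    guest_pa = translate(guest_hgatp, addr)
    if guest_pa is None:
        return None
    # Regular Stage2 page table maintained by the Host hypervisor.
    host_pa = translate(regular_stage2, guest_pa)
    if host_pa is None:
        return None
    # The shadow entry is the combined effect of both translations.
    shadow[addr >> PAGE_SHIFT] = host_pa >> PAGE_SHIFT
    return host_pa
```

After a successful fill, a lookup in the shadow table alone gives the
same result as walking both tables.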

Based on the above high-level software approach, we propose a way to
accelerate nested virtualization by reducing "h<xyz>" and "vs<xyz>"
CSR access traps from VS-mode to HS-mode (point 3 above).

Under our proposal, "h<xyz>" and "vs<xyz>" CSR accesses from VS-mode
are converted into memory accesses relative to a nested context base
(or <nested_context_base>).

The enable bit (or <nested_enable>) for the CSR access conversion
described above and the <nested_context_base> can be specified via a
new HNESTED CSR.

<nested_enable> = HNESTED[0]
<nested_context_base> = HNESTED[XLEN-1:1] << (log2 (XLEN / 8))

Note: <nested_context_base> address is always machine word aligned
Note: <nested_enable> = 0 means "h<xyz>" and "vs<xyz>" CSR accesses
trap to HS-mode without any CSR access conversion
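
As a sanity check of the field layout, the following sketch decodes
and encodes an HNESTED value. The helper names are made up for
illustration, and the result is not truncated to any physical address
width since the proposal does not specify one.

```python
def _shift(xlen: int) -> int:
    """log2(XLEN / 8): 2 for RV32, 3 for RV64."""
    return (xlen // 8).bit_length() - 1


def decode_hnested(hnested: int, xlen: int = 64):
    """Split an HNESTED value into (nested_enable, nested_context_base)."""
    nested_enable = bool(hnested & 1)               # bit 0
    nested_context_base = (hnested >> 1) << _shift(xlen)  # bits above bit 0
    return nested_enable, nested_context_base


def encode_hnested(base: int, enable: bool, xlen: int = 64) -> int:
    """Build an HNESTED value from a machine-word-aligned base address."""
    if base % (xlen // 8) != 0:
        raise ValueError("nested context base must be machine word aligned")
    return ((base >> _shift(xlen)) << 1) | int(enable)
```

For example, on RV64 a base of 0x80200000 with the enable bit set
encodes to 0x20080001 and decodes back to (True, 0x80200000).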

Various "h<xyz>" and "vs<xyz>" CSRs are accessed at <csr_nested_offset>
relative to <nested_context_base> based on their CSR number as follows:

CSR number 0x2xx
<csr_nested_offset> = 0x0000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0x6xx
<csr_nested_offset> = 0x1000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0xAxx
<csr_nested_offset> = 0x2000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0xExx
<csr_nested_offset> = 0x3000 + ((CSR_number & 0xff) * (XLEN / 8))
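
The offset rules above can be expressed as a small function. This is
a sketch for checking the arithmetic; REGION_BASE and the function
signature are illustrative choices, and CSR numbers outside the four
listed ranges are rejected since the proposal does not assign them an
offset.

```python
# Bits [11:8] of the CSR number -> base offset of its region.
REGION_BASE = {0x2: 0x0000, 0x6: 0x1000, 0xA: 0x2000, 0xE: 0x3000}


def csr_nested_offset(csr_number: int, xlen: int = 64) -> int:
    """Byte offset of a CSR relative to <nested_context_base>."""
    if not 0 <= csr_number <= 0xFFF:
        raise ValueError("CSR numbers are 12 bits wide")
    region = csr_number >> 8
    if region not in REGION_BASE:
        raise ValueError(f"CSR {csr_number:#x} has no nested offset")
    return REGION_BASE[region] + (csr_number & 0xFF) * (xlen // 8)
```

For example, on RV64 hstatus (0x600) is at offset 0x1000, hgatp
(0x680) at 0x1400, and vsatp (0x280) at 0x400. Each region holds at
most 256 machine words, which is 0x800 bytes on RV64, so the 0x1000
spacing keeps the regions from overlapping.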

VS-mode accesses to some of the "h<xyz>" CSRs cannot be converted
into memory accesses due to the nature of these CSRs. These include
the HGEIP and HGEIE CSRs (are there any other such CSRs?).

Accesses to the HNESTED CSR (described above) from VS-mode are also
converted to memory accesses when <nested_enable> = 1 because the
HNESTED CSR can be safely emulated using nested acceleration.
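
Putting the last two paragraphs together, the decision for a VS-mode
access to an "h<xyz>" or "vs<xyz>" CSR might look like the sketch
below. It assumes that CSRs which cannot be converted keep trapping to
HS-mode as in point 3, and it works on CSR names because the proposal
does not assign a number to HNESTED. The caller is expected to pass
only hypervisor or VS CSR names; the function name is illustrative.

```python
# CSRs listed above as not convertible (the list may be incomplete).
NOT_CONVERTIBLE = {"hgeip", "hgeie"}


def vs_mode_csr_access(csr_name: str, nested_enable: bool) -> str:
    """Return "memory" or "trap" for a VS-mode h<xyz>/vs<xyz> access."""
    if not nested_enable:
        return "trap"    # no conversion at all when disabled
    if csr_name.lower() in NOT_CONVERTIBLE:
        return "trap"    # assumed: falls back to trap-and-emulate
    return "memory"      # includes HNESTED itself
```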

Best Regards,
Anup Patel
