Re: Proposal for accelerating nested virtualization on RISC-V


Anup Patel
 

Hi Jonathan,

 

Not all cases for CSR accesses have been thought through and written out yet (this is an initial draft). Regarding WARL CSRs with hardwired bits, the HW will always read/write the fixed values of the hardwired bits in memory.

 

I totally agree with you on the roughly two-times overhead of the illegal instruction trap. The illegal instruction trap is not delegated to S-mode because M-mode (i.e. OpenSBI) emulates S-mode accesses to the TIME, HTIMEDELTA, CYCLE, INSTRET, and other counter CSRs. I am not sure we can get rid of illegal instruction trap handling in OpenSBI entirely, because quite a few HW implementations out there don’t provide the TIME CSR and other CSRs for access from S-mode. Currently OpenSBI emulates the TIME CSR for HS-mode, U-mode, VS-mode and VU-mode. The nested virtualization acceleration will certainly help.

 

I forgot to mention that an implementation can choose not to implement nested virtualization acceleration and hardwire the HNESTED CSR to zero.

 

Regards,

Anup

 

 

From: Jonathan Behrens <behrensj@...>
Sent: 17 March 2020 18:58
To: Anup Patel <Anup.Patel@...>
Cc: tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] Proposal for accelerating nested virtualization on RISC-V

 

Your description of un-accelerated nested virtualization seems workable to me. I'm less sure of the proposal to avoid trapping on h<xyz> and vs<xyz> accesses. Aren't you going to run into issues with any WARL CSR that has hardwired bits?

 

I'd like to point out another performance pitfall with trap-and-emulate that I've mentioned before but might not be obvious from reading your proposal: the illegal instruction traps triggered by the guest trying to use hypervisor CSRs or run hypervisor instructions will not trap directly to HS-mode. Rather they will be routed to M-mode and then get forwarded to HS-mode, which has about two times higher overhead (forwarding a trap is at least as expensive as emulating most instructions). It is also quite avoidable by adding a bit to let M-mode delegate traps from legal but privileged instructions executed in U/VS/VU modes.

 

Jonathan

 

On Tue, Mar 17, 2020 at 6:40 AM Anup Patel via Lists.Riscv.Org <anup.patel=wdc.com@...> wrote:

A clarification is required in the RISC-V H-Extension spec regarding
the scope of the HSTATUS.VTVM bit. As currently specified, all virtual
memory management instructions (both SFENCEs and HFENCEs) will trap to
HS-mode when HSTATUS.VTVM == 1 and V == 1. Rather, only SFENCEs need
to be trapped to HS-mode when HSTATUS.VTVM == 1 and V == 1, because
HFENCEs are only defined for HS-mode (i.e. V == 0).

To better describe nested virtualization, we define the following
dummy privilege modes:
Host HS-mode => HW HS-mode
  Host hypervisor kernel will run in this mode
  Software in this mode will actually run in HW HS-mode
Host U-mode => HW U-mode
  Host hypervisor user-space will run in this mode
  Software in this mode will actually run in HW U-mode
Guest HS-mode => HW VS-mode
  Guest hypervisor kernel will run in this mode
  Software in this mode will actually run in HW VS-mode
Guest U-mode => HW VU-mode
  Guest hypervisor user-space will run in this mode
  Software in this mode will actually run in HW VU-mode
Guest VS-mode => HW VS-mode
  Software in this mode will actually run in HW VS-mode
Guest VU-mode => HW VU-mode
  Software in this mode will actually run in HW VU-mode

A high-level software approach for nested virtualization in RISC-V
can be as follows:
1. The Host HS-mode (Host hypervisor) will enable HSTATUS.VTSR to
   emulate the SRET instruction for the Guest. This emulation will
   involve a CSR world-switch when switching between Guest HS/U-mode
   and Guest VS/VU-mode.
2. Virtual interrupts will be injected to Guest VS/VU-mode after
   doing the CSR world-switch (in point 1 above) from Guest HS/U-mode
   to Guest VS/VU-mode.
3. All accesses to "h<xyz>" and "vs<xyz>" from Guest will trap to
   Host HS-mode (Host hypervisor) where:
   a) These CSRs will be emulated for Guest HS-mode
   b) For Guest U-mode and Guest VS/VU-mode, the trap will
      be forwarded to Guest HS-mode
4. The Host HS-mode (Host hypervisor) will manage two Stage2 page
   tables:
   a) Regular Stage2 page table for Guest HS/U-mode
   b) Shadow Stage2 page table for Guest VS/VU-mode. Of course,
      Host HS-mode (host hypervisor) will have to do software walk
      of Guest HS-mode HGATP page table when populating mappings in
      Shadow Stage2 page table and it will have mappings which are
      combined effect of Guest HS-mode HGATP page table and Regular
      Stage2 page table.
5. All HFENCEs will trap to Host HS-mode where the Host HS-mode
   (Host hypervisor) will:
   a) Trap-n-emulate HFENCE.VVMA and HFENCE.GVMA for Guest HS-mode
   b) Redirect HFENCE.VVMA and HFENCE.GVMA traps from Guest VS-mode
      to Guest HS-mode irrespective of Guest HS-mode HSTATUS.VTVM
6. All HLV/HSV instructions from Guest HS/U-mode and Guest VS/VU-mode
   will trap to Host HS-mode (Host hypervisor) where:
   a) HLV/HSV instruction from Guest HS/U-mode will be emulated
      by Host HS-mode (Host hypervisor)
   b) HLV/HSV instruction from Guest VS/VU-mode will be forwarded
      to Guest HS-mode by Host HS-mode (Host hypervisor)
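
The routing decisions in points 3, 5 and 6 above can be summarized as
a small decision function. This is only a rough sketch: the enum and
function names are hypothetical, and the HFENCE cases for Guest U-mode
and Guest VU-mode are marked unspecified because the list above does
not cover them.

```c
/* Dummy privilege mode the Guest was in when the trap was taken. */
enum guest_mode { GUEST_HS, GUEST_U, GUEST_VS, GUEST_VU };

/* Kind of trapping instruction (hypothetical classification). */
enum trap_kind { TRAP_H_VS_CSR, TRAP_HFENCE, TRAP_HLV_HSV };

/* What the Host HS-mode (Host hypervisor) does with the trap. */
enum action { ACT_EMULATE, ACT_FORWARD_TO_GUEST_HS, ACT_UNSPECIFIED };

static enum action route_guest_trap(enum guest_mode mode,
                                    enum trap_kind kind)
{
    switch (kind) {
    case TRAP_H_VS_CSR:
        /* Point 3: emulate for Guest HS-mode, forward otherwise. */
        return mode == GUEST_HS ? ACT_EMULATE : ACT_FORWARD_TO_GUEST_HS;
    case TRAP_HFENCE:
        /* Point 5: emulate for Guest HS-mode, redirect from Guest
         * VS-mode; other modes are not covered by the list. */
        if (mode == GUEST_HS)
            return ACT_EMULATE;
        if (mode == GUEST_VS)
            return ACT_FORWARD_TO_GUEST_HS;
        return ACT_UNSPECIFIED;
    case TRAP_HLV_HSV:
        /* Point 6: emulate for Guest HS/U-mode, forward for
         * Guest VS/VU-mode. */
        return (mode == GUEST_HS || mode == GUEST_U)
                   ? ACT_EMULATE
                   : ACT_FORWARD_TO_GUEST_HS;
    }
    return ACT_UNSPECIFIED;
}
```

For example, route_guest_trap(GUEST_VS, TRAP_H_VS_CSR) yields
ACT_FORWARD_TO_GUEST_HS, which is the case the acceleration proposed
below does not change; it targets the Guest HS-mode CSR accesses that
would otherwise be emulated.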

Please point out any case that is not considered in the above
high-level software approach for nested virtualization.

Based on the above high-level software approach, we propose a way to
accelerate nested virtualization performance by reducing "h<xyz>" and
"vs<xyz>" CSR access traps from VS-mode to HS-mode (point 3 above).

In our proposal, "h<xyz>" and "vs<xyz>" CSR accesses from VS-mode are
converted into memory accesses relative to a nested context base
(or <nested_context_base>).

The enable bit (or <nested_enable>) for the CSR access conversion
described above and the <nested_context_base> can be specified via a
new HNESTED CSR.

<nested_enable> = HNESTED[0]
<nested_context_base> = HNESTED[XLEN-1:1] << (log2 (XLEN / 8))

Note: the <nested_context_base> address is always machine word aligned
Note: <nested_enable> = 0 means "h<xyz>" and "vs<xyz>" accesses trap to
HS-mode without any CSR access conversion
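
To make the field layout concrete, here is a minimal decoding sketch.
It assumes XLEN = 64, and the helper names are made up for
illustration; nothing here is part of the proposal text itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: XLEN = 64, so log2(XLEN / 8) = 3. */
#define NESTED_WORD_SHIFT 3u

/* <nested_enable> = HNESTED[0] */
static inline bool hnested_enable(uint64_t hnested)
{
    return (hnested & 1u) != 0;
}

/* <nested_context_base> = HNESTED[XLEN-1:1] << log2(XLEN / 8)
 * Note: in 64-bit arithmetic the top bits of the field are shifted
 * out; how the proposal intends to treat them is not stated. */
static inline uint64_t hnested_context_base(uint64_t hnested)
{
    return (hnested >> 1) << NESTED_WORD_SHIFT;
}
```

With these helpers, an HNESTED value of 0x2001 decodes to
<nested_enable> = 1 and <nested_context_base> = 0x8000, which is
8-byte aligned as the first note requires.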

Various "h<xyz>" and "vs<xyz>" CSRs are accessed at <csr_nested_offset>
relative to <nested_context_base> based on their CSR number as follows:

CSR number 0x2xx
<csr_nested_offset> = 0x0000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0x6xx
<csr_nested_offset> = 0x1000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0xAxx
<csr_nested_offset> = 0x2000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0xExx
<csr_nested_offset> = 0x3000 + ((CSR_number & 0xff) * (XLEN / 8))
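
The four cases above can be sketched as one lookup function. This is
an illustration only: it assumes XLEN = 64, the function name is
hypothetical, and returning -1 for CSR numbers outside the four ranges
is a choice made for this sketch.

```c
#include <stdint.h>

/* Assumption: XLEN = 64, so each CSR slot is XLEN / 8 = 8 bytes. */
#define NESTED_SLOT_BYTES 8u

/* Returns the byte offset of a CSR relative to
 * <nested_context_base>, or -1 if the CSR number is not in one of
 * the 0x2xx, 0x6xx, 0xAxx or 0xExx ranges. */
static int32_t csr_nested_offset(uint16_t csr)
{
    uint32_t region;

    if (csr > 0xFFFu)
        return -1; /* CSR numbers are 12 bits wide */

    switch (csr & 0xF00u) {
    case 0x200u: region = 0x0000u; break;
    case 0x600u: region = 0x1000u; break;
    case 0xA00u: region = 0x2000u; break;
    case 0xE00u: region = 0x3000u; break;
    default:     return -1;
    }
    return (int32_t)(region + (csr & 0xFFu) * NESTED_SLOT_BYTES);
}
```

For instance, CSR number 0x600 (hstatus) maps to offset 0x1000 and CSR
number 0x680 (hgatp) maps to 0x1000 + 0x80 * 8 = 0x1400. With 8-byte
slots the largest offset within a region is 0xFF * 8 = 0x7F8, so the
regions do not overlap at the 0x1000 stride.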

The VS-mode accesses to some of the "h<xyz>" CSRs cannot be converted
into memory accesses due to the nature of these CSRs. These CSRs
include the HGEIP and HGEIE CSRs (are there any other such CSRs?).

Accesses to the HNESTED CSR (described above) from VS-mode are also
converted to memory accesses when <nested_enable> = 1 because the
HNESTED CSR can be safely emulated using nested acceleration.

Best Regards,
Anup Patel
