Proposal for accelerating nested virtualization on RISC-V
A clarification is required in the RISC-V H-Extension spec regarding the scope of the HSTATUS.VTVM bit. Currently, as per the spec, all virtual memory management instructions (both SFENCEs and HFENCEs) will trap to HS-mode when HSTATUS.VTVM == 1 and V == 1. Rather, only SFENCEs need to trap to HS-mode when HSTATUS.VTVM == 1 and V == 1, because HFENCEs are only defined for HS-mode (i.e. V == 0).

To better describe nested virtualization, we define the following dummy privilege modes:

Host HS-mode => HW HS-mode
  Host hypervisor kernel will run in this mode
  Software in this mode will actually run in HW HS-mode
Host U-mode => HW U-mode
  Host hypervisor user-space will run in this mode
  Software in this mode will actually run in HW U-mode
Guest HS-mode => HW VS-mode
  Guest hypervisor kernel will run in this mode
  Software in this mode will actually run in HW VS-mode
Guest U-mode => HW VU-mode
  Guest hypervisor user-space will run in this mode
  Software in this mode will actually run in HW VU-mode
Guest VS-mode => HW VS-mode
  Software in this mode will actually run in HW VS-mode
Guest VU-mode => HW VU-mode
  Software in this mode will actually run in HW VU-mode

A high-level software approach for nested virtualization on RISC-V can be as follows:

1. The Host HS-mode (Host hypervisor) will enable HSTATUS.VTSR to emulate the SRET instruction for the Guest. This emulation will involve a CSR world-switch when switching between Guest HS/U-mode and Guest VS/VU-mode.
2. Virtual interrupts will be injected into Guest VS/VU-mode after doing the CSR world-switch (point 1 above) from Guest HS/U-mode to Guest VS/VU-mode.
3. All accesses to "h<xyz>" and "vs<xyz>" CSRs from the Guest will trap to Host HS-mode (Host hypervisor) where:
   a) These CSRs will be emulated for Guest HS-mode
   b) For Guest U-mode and Guest VS/VU-mode, the trap will be forwarded to Guest HS-mode
4. The Host HS-mode (Host hypervisor) will manage two Stage2 page tables:
   a) Regular Stage2 page table for Guest HS/U-mode
   b) Shadow Stage2 page table for Guest VS/VU-mode
   Of course, the Host HS-mode (Host hypervisor) will have to do a software walk of the Guest HS-mode HGATP page table when populating mappings in the Shadow Stage2 page table. The Shadow Stage2 page table will hold mappings which are the combined effect of the Guest HS-mode HGATP page table and the Regular Stage2 page table.
5. All HFENCEs will trap to Host HS-mode where the Host HS-mode (Host hypervisor) will:
   a) Trap-n-emulate HFENCE.VVMA and HFENCE.GVMA for Guest HS-mode
   b) Redirect HFENCE.VVMA and HFENCE.GVMA traps from Guest VS-mode to Guest HS-mode irrespective of Guest HS-mode HSTATUS.VTVM
6. All HLV/HSV instructions from Guest HS/U-mode and Guest VS/VU-mode will trap to Host HS-mode (Host hypervisor) where:
   a) HLV/HSV instructions from Guest HS/U-mode will be emulated by Host HS-mode (Host hypervisor)
   b) HLV/HSV instructions from Guest VS/VU-mode will be forwarded to Guest HS-mode by Host HS-mode (Host hypervisor)

Please point out any case that is not considered in the above high-level software approach for nested virtualization.

Based on the above high-level software approach, we propose a way to accelerate nested virtualization by reducing "h<xyz>" and "vs<xyz>" CSR access traps from VS-mode to HS-mode (point 3 above).

As per our proposal, we convert "h<xyz>" and "vs<xyz>" CSR accesses from VS-mode into memory accesses relative to a nested context base (or <nested_context_base>). The enable bit (or <nested_enable>) for this CSR access conversion and the <nested_context_base> can be specified via a new HNESTED CSR.

<nested_enable> = HNESTED[0]
<nested_context_base> = HNESTED[XLEN-1:1] << (log2 (XLEN / 8))

Note: the <nested_context_base> address is always machine word aligned
Note: <nested_enable> = 0 means "h<xyz>" and "vs<xyz>" accesses trap to HS-mode without any CSR access conversion

The various "h<xyz>" and "vs<xyz>" CSRs are accessed at <csr_nested_offset> relative to <nested_context_base>, based on their CSR number, as follows:

CSR number 0x2xx: <csr_nested_offset> = 0x0000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0x6xx: <csr_nested_offset> = 0x1000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0xAxx: <csr_nested_offset> = 0x2000 + ((CSR_number & 0xff) * (XLEN / 8))
CSR number 0xExx: <csr_nested_offset> = 0x3000 + ((CSR_number & 0xff) * (XLEN / 8))

VS-mode accesses to some of the "h<xyz>" CSRs cannot be converted into memory accesses due to the nature of these CSRs. These include the HGEIP and HGEIE CSRs (any other CSRs?).

Accesses to the HNESTED CSR (described above) from VS-mode are also converted to memory accesses when <nested_enable> = 1, because the HNESTED CSR can be safely emulated using nested acceleration.

Best Regards,
Anup Patel
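
The HNESTED decoding and offset rules in the message above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the proposal: the function names are hypothetical, the HGEIE/HGEIP CSR numbers are taken from the current privileged spec rather than from this thread, and the base is computed literally from the formula without modelling how out-of-range upper bits would be handled.

```python
# Illustrative model only; all names here are hypothetical.
HGEIE = 0x607  # assumed CSR number (current privileged spec)
HGEIP = 0xE12  # assumed CSR number (current privileged spec)
NOT_CONVERTIBLE = {HGEIE, HGEIP}  # the thread leaves this list open

# Region base for the CSR number ranges 0x2xx, 0x6xx, 0xAxx, 0xExx.
REGION_BASE = {0x2: 0x0000, 0x6: 0x1000, 0xA: 0x2000, 0xE: 0x3000}


def decode_hnested(hnested, xlen=64):
    """Return (<nested_enable>, <nested_context_base>) for an HNESTED value."""
    enable = bool(hnested & 1)
    shift = (xlen // 8).bit_length() - 1  # log2(XLEN / 8): 2 on RV32, 3 on RV64
    base = (hnested >> 1) << shift        # always machine-word aligned
    return enable, base


def csr_nested_offset(csr, xlen=64):
    """Return <csr_nested_offset> for a 12-bit CSR number."""
    region = REGION_BASE.get(csr >> 8)
    if region is None:
        raise ValueError(f"CSR {csr:#x} is outside the 0x2xx/0x6xx/0xAxx/0xExx ranges")
    return region + (csr & 0xFF) * (xlen // 8)


def nested_access_address(hnested, csr, xlen=64):
    """Memory address used for a VS-mode CSR access, or None if it still traps."""
    enable, base = decode_hnested(hnested, xlen)
    if not enable or csr in NOT_CONVERTIBLE:
        return None
    return base + csr_nested_offset(csr, xlen)
```

For example, assuming HGATP is CSR 0x680, `csr_nested_offset(0x680)` gives 0x1400 on RV64, so with HNESTED = 0x1001 (enable set, base 0x4000) the access would go to address 0x5400.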
Jonathan Behrens <behrensj@...>
Your description of un-accelerated nested virtualization seems workable to me. I'm less sure of the proposal to avoid trapping on h<xyz> and vs<xyz> accesses. Aren't you going to run into issues with any WARL CSR that has hardwired bits?

I'd like to point out another performance pitfall with trap-and-emulate that I've mentioned before but might not be obvious from reading your proposal: the illegal instruction traps triggered by the guest trying to use hypervisor CSRs or run hypervisor instructions will not trap directly to HS-mode. Rather they will be routed to M-mode and then get forwarded to HS-mode, which has about two times higher overhead (forwarding a trap is at least as expensive as emulating most instructions). It is also quite avoidable by adding a bit to let M-mode delegate traps from legal but privileged instructions executed in U/VS/VU modes.

Jonathan
Hi Jonathan,
Not all cases for CSR accesses have been thought through and written out yet (this is an initial draft). Regarding WARL CSRs with hardwired bits, the HW will always read/write the fixed values of the hardwired bits in memory.
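
One possible reading of the hardwired-bit behaviour described above is that the HW forces the fixed bits every time it touches the CSR's memory slot. The sketch below models that reading in Python; the function name and the example masks are hypothetical and do not describe any real CSR.

```python
def warl_apply(value, writable_mask, hardwired_value, xlen=64):
    """Keep the writable bits of `value` and force all other bits to their
    hardwired values. Hypothetical model of a converted WARL CSR access."""
    full = (1 << xlen) - 1
    fixed = hardwired_value & ~writable_mask  # bits software cannot change
    return ((value & writable_mask) | fixed) & full


# Example with made-up masks: only bits 7:4 are writable, bit 0 is hardwired to 1.
stored = warl_apply(0xFFFF, writable_mask=0x00F0, hardwired_value=0x0001)
```

Applying the same function on both the store and the load path would mean that a value placed in the nested context by other means still reads back with the hardwired bits intact.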
I totally agree with you on the two times overhead of the illegal instruction trap. The illegal instruction trap is not delegated to S-mode because M-mode (i.e. OpenSBI) emulates S-mode accesses to the TIME, HTIMEDELTA, CYCLE, INSTRET, and other COUNTER CSRs. I am not sure if we can totally get rid of illegal instruction trap handling in OpenSBI, because quite a few HW implementations out there don't have the TIME CSR and other CSRs accessible from S-mode. Currently OpenSBI emulates the TIME CSR for HS-mode, U-mode, VS-mode and VU-mode. The nested virtualization acceleration will certainly help.
I forgot to mention that an implementation can choose not to implement nested virtualization acceleration and hardwire the HNESTED CSR to zero.
Regards, Anup
From: Jonathan Behrens <behrensj@...>
Sent: 17 March 2020 18:58
To: Anup Patel <Anup.Patel@...>
Cc: tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] Proposal for accelerating nested virtualization on RISC-V
On Tue, Mar 17, 2020 at 6:40 AM Anup Patel via Lists.Riscv.Org <anup.patel=wdc.com@...> wrote: