Re: [tech-privileged] hypervisor extension: seL4 experience and feedback

Shen, Yanyan (Data61, Kensington NSW) <yanyan.shen@...>

Hi John,

See my responses inline below.


On Thu, 2020-02-27 at 13:54 -0800, John Hauser wrote:
Hi Gernot and Yanyan,

It's been a couple of months since you first sent (Dec. 4) your
document reporting your experience adapting the seL4 microkernel to
draft 0.4 of the RISC-V hypervisor extension, with some questions about
the then-current 0.5 draft. I earlier responded in detail to your
feedback from sections 4 and 5 of your document. I'd like to respond
finally to a couple remaining issues raised in sections 6 and 7.
Thanks very much for your installments, which clarify things and help
us to understand the extension.

Q6: How are the two instructions, RDCYCLE and RDINSTRET, treated
by the hypervisor extension? Are they going to return the cycles
consumed and instructions retired by the current running VM only?
Without additional "delta" registers like RDTIME's htimedelta,
the expectation currently is that bits CY and IR in hcounteren for
the cycle and instret counters will normally be set to zero. The
hypervisor thus gets to emulate these counters for the virtual
machine, adjusting the global cycle and instret counts as necessary.
So, it is expected that the instructions return the cycles consumed and
instructions retired by the calling VM. However, it is up to the
hypervisor to decide the accuracy of the values returned.
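
The per-VM accounting described above can be sketched roughly as follows.
This is only an illustration, not seL4 code: the type vcounter_t and the
functions vcounter_enter, vcounter_exit and vcounter_read are hypothetical
names, and the global count is passed in as a plain integer rather than
read from the cycle CSR so that the sketch runs anywhere. It assumes the
hypervisor charges a VM only for the time between VM entry and VM exit;
whether trap-handling time is included is a policy choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-VM bookkeeping for an emulated cycle (or instret) counter. */
typedef struct {
    uint64_t accumulated;  /* counts charged to this VM over completed runs */
    uint64_t entry_stamp;  /* global count sampled at the most recent VM entry */
    int      running;      /* non-zero while the VM is scheduled */
} vcounter_t;

/* Called when the hypervisor switches to the VM. */
static void vcounter_enter(vcounter_t *vc, uint64_t global_now)
{
    assert(!vc->running);
    vc->entry_stamp = global_now;
    vc->running = 1;
}

/* Called when the hypervisor switches away from the VM. */
static void vcounter_exit(vcounter_t *vc, uint64_t global_now)
{
    assert(vc->running);
    /* Unsigned subtraction stays correct even if the global counter wraps. */
    vc->accumulated += global_now - vc->entry_stamp;
    vc->running = 0;
}

/* Value returned to the guest when its RDCYCLE/RDINSTRET traps
 * (hcounteren.CY or hcounteren.IR clear). */
static uint64_t vcounter_read(const vcounter_t *vc, uint64_t global_now)
{
    if (vc->running) {
        return vc->accumulated + (global_now - vc->entry_stamp);
    }
    return vc->accumulated;
}
```

For example, a VM entered at global count 1000 and read at 1250 would see
250; time that passes while another VM runs is not charged to it.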

It's perfectly reasonable to question whether emulating the cycle and
instret counters will be too expensive in practice. The official
position for now is that emulation should be tolerable. RDCYCLE and RDINSTRET
are expected to be used only for performance measurements, and should
not be executed too frequently.
I agree that trap-and-emulate will work, and the performance may be
acceptable if the registers are accessed infrequently.

As Andy already pointed out, RDINSTRET could be quite useful for
other purposes as well (e.g., record-and-replay or redundant execution).
Would it be possible to add a filter or mask so that user-mode or
kernel-mode retired instructions could be counted separately?

A related question is the accuracy of RDINSTRET. Are over-counting or
under-counting allowed for certain conditions? What is the degree of
freedom an implementation could have to interpret the meaning and
accuracy of the RDINSTRET instruction?

The v0.5 draft states that accesses to the VS CSRs in VS-mode cause
illegal instruction exceptions, so nested virtualization could be built
on trap-and-emulate. Similarly, accesses to HS-mode CSRs from the
second-level hypervisor also need to be trapped and emulated. This
approach naturally raises concerns about the overhead of trapping,
decoding, and handling the CSR accesses. As Arm and x86 already
added hardware support for nested virtualisation, are we expecting
similar hardware support in RISC-V?
Additional optional hardware for nested hypervisors is being
considered. More about this may come out later in 2020 or next year.
Right now, other components that are needed for a server-class RISC-V
platform are probably a higher priority.
Good to know that nested virtualisation is being considered. I
understand there are higher priority tasks.
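
To make the decoding cost mentioned above concrete, here is a minimal
sketch of the first step a trap-and-emulate handler would take: pulling
the CSR number and register fields out of the trapped instruction. The
field positions follow the standard RISC-V CSR instruction encoding
(opcode SYSTEM = 0x73, csr in bits 31:20), but the names csr_access_t
and decode_csr_access are hypothetical and not taken from seL4 or any
existing hypervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of a trapped CSR instruction. */
typedef struct {
    uint16_t csr;     /* 12-bit CSR address, e.g. 0x200 for vsstatus */
    uint8_t  rd;      /* destination register index */
    uint8_t  rs1;     /* source register index, or 5-bit immediate */
    uint8_t  funct3;  /* 1..3 = CSRRW/CSRRS/CSRRC, 5..7 = immediate forms */
} csr_access_t;

/* Returns true and fills *out if insn is a CSR access, false otherwise. */
static bool decode_csr_access(uint32_t insn, csr_access_t *out)
{
    assert(out);

    if ((insn & 0x7fu) != 0x73u) {
        return false;  /* not in the SYSTEM major opcode */
    }

    uint8_t funct3 = (uint8_t)((insn >> 12) & 0x7u);
    if (funct3 == 0 || funct3 == 4) {
        return false;  /* ECALL, SRET, WFI, etc.; not a CSR access encoding */
    }

    out->csr    = (uint16_t)(insn >> 20);
    out->rd     = (uint8_t)((insn >> 7) & 0x1fu);
    out->rs1    = (uint8_t)((insn >> 15) & 0x1fu);
    out->funct3 = funct3;
    return true;
}
```

A handler would then switch on the csr field and read or update the
corresponding virtual register state; that per-access software path is
the overhead that hardware support for nesting would aim to avoid.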


- John Hauser

