Re: [tech-privileged] hypervisor extension: seL4 experience and feedback
Andy Glew Si5
As Andy already pointed out, RDINSTRET could be quite useful for other purposes as well (e.g., record-and-replay or redundant execution). Would it be possible to add a filter or mask so that user-mode or kernel-mode retired instructions could be counted separately?
I like the filter/mask idea, as I will explain below, but I think it belongs more to generic performance event counters, not RDINSTRET or RDCYCLE. I think those instructions should do one thing and one thing well. If they can be configured, then it will be harder to use them locally, e.g. in a library, without knowledge of the global setting.
As for filtering of generic performance counters:
x86 EMON has generic filtering:
So you can count cache misses, instructions retired, instructions speculatively decoded, etc., filtered by user, kernel, hypervisor, or any mode.
Note: “instructions retired” vs “speculative instructions” is not generic, since there are many possible places where one can count speculative instructions. Similarly speculative cache misses.
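To make the filtering idea concrete, here is a minimal sketch in C of an event counter gated by a privilege-mode mask. The names (`priv_mode`, `event_counter`, `counter_observe`) are hypothetical and only illustrate the concept; they do not reflect the actual x86 EMON register layout or any RISC-V CSR.

```c
#include <stdint.h>

/* Hypothetical privilege-mode bits, one per mode, for use in a filter mask. */
enum priv_mode {
    MODE_USER       = 1u << 0,
    MODE_KERNEL     = 1u << 1,
    MODE_HYPERVISOR = 1u << 2
};
#define MODE_ANY (MODE_USER | MODE_KERNEL | MODE_HYPERVISOR)

/* A generic event counter that counts only in the modes its mask selects. */
struct event_counter {
    unsigned mode_mask;  /* OR of priv_mode bits to count in */
    uint64_t count;      /* number of events that passed the filter */
};

/* Record one event that occurred in `mode`; it is counted only if the
   counter's mask includes that mode. */
static void counter_observe(struct event_counter *c, enum priv_mode mode)
{
    if (c->mode_mask & (unsigned)mode)
        c->count++;
}
```

A counter configured with `MODE_USER` ignores kernel and hypervisor events, while one configured with `MODE_ANY` counts everything. The mask is exactly the kind of global setting that a library cannot assume without knowing how the counter was configured.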
These are all great things, great for performance analysis. But there are never enough performance counters to count everything in one pass, so they need to be managed globally, which, as Jack Dennis (static dataflow guy) says, “violates software engineering modularity”.
Providing fixed, well-characterized definitions of RDCYCLE and RDINSTRET allows at least these events to be used locally, e.g. for usage-aware algorithms within functions and classes, without having to mess with a global management infrastructure.
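As an illustration of such local use, the following C sketch measures the instructions retired by a single call, with no counter setup at all. It assumes an RV64 target where user-mode RDINSTRET is permitted; the helper names (`read_instret`, `instret_for`, `bump`) and the stub counter used on other hosts are hypothetical, added only so the sketch builds anywhere.

```c
#include <stdint.h>

/* Hypothetical stand-in counter for hosts that are not RV64. */
static uint64_t stub_instret;

/* Read the retired-instruction count. */
static inline uint64_t read_instret(void)
{
#if defined(__riscv) && (__riscv_xlen == 64)
    uint64_t value;
    __asm__ volatile ("rdinstret %0" : "=r"(value));
    return value;
#else
    return ++stub_instret;  /* stub: advances by one per read */
#endif
}

/* Instructions retired while running fn(arg), measured locally and
   without touching any global counter configuration. */
static uint64_t instret_for(void (*fn)(void *), void *arg)
{
    uint64_t start = read_instret();
    fn(arg);
    return read_instret() - start;
}

/* Trivial example workload: increment the int that arg points to. */
static void bump(void *arg)
{
    ++*(int *)arg;
}
```

Because the counter's meaning is fixed, `instret_for(bump, &n)` gives a comparable number wherever it is called from; if the counter could be filtered or reconfigured globally, the library code above would need to know the current setting to interpret the result.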
See my responses inline below.
On Thu, 2020-02-27 at 13:54 -0800, John Hauser wrote:
> Hi Gernot and Yanyan,
> It's been a couple of months since you first sent (Dec. 4) your
> document reporting your experience adapting the seL4 microkernel to
> draft 0.4 of the RISC-V hypervisor extension, with some questions
> about the then-current 0.5 draft. I earlier responded in detail to
> your feedback from sections 4 and 5 of your document. I'd like to
> respond finally to a couple remaining issues raised in sections 6 and
Thanks very much for your installments, which clarify things and help us to understand the extension.
> > Q6: How are the two instructions, RDCYCLE and RDINSTRET, treated by
> > the hypervisor extension? Are they going to return the cycles
> > consumed and instructions retired by the current running VM only?
> Without additional "delta" registers like RDTIME's htimedelta, the
> expectation currently is that bits CY and IR in hcounteren for the
> cycle and instret counters will normally be set to zero. The
> hypervisor thus gets to emulate these counters for the virtual
> machine, adjusting the global cycle and instret counts as necessary.
So, it is expected that the instructions return the cycles consumed and instructions retired by the calling VM. However, it is up to the hypervisor to decide the accuracy of the values returned.
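One way a hypervisor might emulate a per-VM counter is sketched below in C. This is only an illustration under assumptions: the hypervisor is assumed to snapshot the physical counter at every VM entry and exit, and the structure and function names (`vcounter`, `vcounter_enter`, `vcounter_exit`, `vcounter_read`) are hypothetical, not part of the hypervisor extension or of seL4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-VM view of a physical counter such as cycle or instret. */
struct vcounter {
    uint64_t accumulated;  /* counts charged to the VM in earlier runs */
    uint64_t entry_snap;   /* physical counter value at the latest VM entry */
    bool     running;      /* true while the VM is scheduled */
};

/* Called when the hypervisor switches to the VM. */
static void vcounter_enter(struct vcounter *vc, uint64_t phys_now)
{
    vc->entry_snap = phys_now;
    vc->running = true;
}

/* Called when the hypervisor switches away from the VM. */
static void vcounter_exit(struct vcounter *vc, uint64_t phys_now)
{
    vc->accumulated += phys_now - vc->entry_snap;
    vc->running = false;
}

/* Value a trap handler could return for a guest RDCYCLE or RDINSTRET. */
static uint64_t vcounter_read(const struct vcounter *vc, uint64_t phys_now)
{
    if (vc->running)
        return vc->accumulated + (phys_now - vc->entry_snap);
    return vc->accumulated;
}
```

In this sketch the counts consumed by the hypervisor's own trap handling between entry and exit are still charged to the VM, which is one example of the accuracy trade-off left to the hypervisor.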
> It's perfectly reasonable to question whether emulating the cycle and
> instret counters will be too expensive in practice. The official line
> for now is that emulation should be tolerable. RDCYCLE and RDINSTRET
> are expected to be used only for performance measurements, and should
> not be executed too frequently.
I agree that trap-and-emulate will work, and the performance may be acceptable if the registers are accessed infrequently.
As Andy already pointed out, RDINSTRET could be quite useful for other purposes as well (e.g., record-and-replay or redundant execution).
Would it be possible to add a filter or mask so that user-mode or kernel-mode retired instructions could be counted separately?
A related question is the accuracy of RDINSTRET. Is over-counting or under-counting allowed under certain conditions? How much freedom does an implementation have in interpreting the meaning and accuracy of the RDINSTRET instruction?
> > The v0.5 draft states that the accesses to the VS CSRs in VS-mode
> > cause illegal instructions, so nested virtualization could be built
> > on trap-and-emulate. Similarly, accesses to HS-mode CSRs from the
> > second-level hypervisor also need to be trapped and emulated. This
> > approach naturally raises concerns about the overhead of trapping,
> > decoding, and handling the CSR accesses. As Arm and x86 already
> > added hardware support for nested virtualisation, are we
> > anticipating similar hardware support in RISC-V?
> Additional optional hardware for nested hypervisors is being
> considered. More about this may come out later in 2020 or next year.
> Right now, other components that are needed for a server-class RISC-V
> platform are probably a higher priority.
Good to know that nested virtualisation is being considered. I understand there are higher priority tasks.
> - John Hauser