Re: Proposal: Supervisor Timer CSR and Virtual Supervisor Timer CSR

Greg Favor


Thanks.  That narrows down where we differ.

If the hypervisor has set hideleg bit 6 (so that VS-level timer interrupts can be received by VS-mode while V=1), then any newly pending VS-level timer interrupt won't be taken until the hypervisor returns into the VM.  But at that moment the hypervisor is busy servicing whatever caused the exit from the VM into the hypervisor.  Even if the hypervisor were interrupted by the pending VS-level interrupt, it would defer doing anything about it until it reached a point where it could consider returning to this VM.

Then, at junctures like that, the hypervisor may decide to context switch to another VM instead of returning into this one.  But the hypervisor code that considers whether or not to switch away to another VM would be the natural place for it to check (via hip.VSTIP) whether the current VM has a pending VS-level timer interrupt, and to decide whether it should therefore stay on this VM.  This check should not need to appear in many places in hypervisor code.  I believe it would be in the one or few pieces of code that decide whether to return to the current VM or to context switch to a different VM (i.e., this decision-making process should not be spread across many places in the code).
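
As a rough illustration of the kind of check described above, here is a minimal sketch in C.  The only architectural fact it relies on is that VSTIP is bit 6 of hip.  Everything else is an assumption made for illustration: the names decide_next_vm, vm_decision, and other_vm_runnable are hypothetical, the hip value is passed in as a parameter rather than read with a CSR instruction, and the policy of always staying on the current VM when VSTIP is set is just one possible choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* VSTIP is bit 6 of hip (the same bit position as hideleg bit 6). */
#define HIP_VSTIP (UINT64_C(1) << 6)

/* Hypothetical outcome of the hypervisor's single scheduling decision point. */
enum vm_decision { RESUME_CURRENT_VM, SWITCH_TO_OTHER_VM };

/*
 * hip_value:         value just read from the hip CSR; the CSR read itself is
 *                    left out so the sketch stays portable.
 * other_vm_runnable: hypothetical scheduler input.
 */
static enum vm_decision decide_next_vm(uint64_t hip_value, bool other_vm_runnable)
{
    bool vs_timer_pending = (hip_value & HIP_VSTIP) != 0;

    /* Assumed policy: a pending VS-level timer interrupt keeps us on this VM. */
    if (vs_timer_pending)
        return RESUME_CURRENT_VM;

    return other_vm_runnable ? SWITCH_TO_OTHER_VM : RESUME_CURRENT_VM;
}
```

A real hypervisor would presumably weigh the pending interrupt against its other scheduling inputs rather than treat it as an absolute override; the point is only that the poll lives in one function.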

Conversely, if you had some form of VGTI, then when the hypervisor receives the VS-level timer interrupt, it would generally have to defer doing something about it until it reaches a suitable juncture where it can consider whether to return to this (or another) VM.  At that point, the code at that juncture could simply poll hip.VSTIP instead of having to remember that a VGTI was received earlier.

I'm guessing this may not get us to a point of agreement yet :), but hopefully we're narrowing in.


P.S. If necessary, we can put this to the people doing two of the RISC-V hypervisor ports.

On Thu, Sep 10, 2020 at 8:04 PM zhaosiqi (A) via <> wrote:
Hi Greg,

I understand your concern. In fact, the goal is not to provide more than one vstimecmp CSR.

You are right that when a VM is context switched out, its vstimecmp CSR is also switched out. However, there are times when the VM, or a vCPU, is not executing but its CSR context is still 'bound' to a hart. In other words, the vs- CSRs still contain values for that vCPU (including vstimecmp) while the hart is executing hypervisor code. For example, when the hypervisor is handling certain exceptions caused by the VM, such as a guest page fault, there is no need to switch the vCPU context out.

During this time, since vstimecmp still contains the value for that vCPU, it can fire interrupts. When such an interrupt is received by the hart, it should obviously be the hypervisor that handles it, hence the VGTI. Without VGTI, this interrupt will be delayed until the hypervisor returns to the vCPU. This delay can be long because the hypervisor may not return to the vCPU immediately, i.e. it may well decide that the vCPU needs to yield and schedule something else. Of course, the hypervisor can check the value in vstimecmp at strategic points and make the proper decision, but that makes the software more complex because those checks might be tricky to insert.

So VGTI is purely for the prompt handling of the vstimecmp interrupt when the vCPU is still bound to a hart but not executing.

Lastly, when the hypervisor finally decides to switch a vCPU out, vstimecmp gets saved and the hypervisor uses its own timer to track the timer set by that vCPU, which goes to stimecmp. In this way the single vstimecmp is multiplexed among vCPUs, so there is no need for more than one vstimecmp.
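
To make the multiplexing above concrete, here is a minimal sketch in C of how a hypervisor might fold the saved deadlines of switched-out vCPUs into the one value it programs into stimecmp. All names (vcpu_timer_ctx, guest_deadline_in_host_time, next_stimecmp) are hypothetical. The sketch also assumes that the guest's view of time is the host time plus a per-vCPU htimedelta offset, and it ignores counter wraparound.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU timer state kept while the vCPU is switched out. */
struct vcpu_timer_ctx {
    uint64_t saved_vstimecmp; /* guest deadline, in the guest's view of time */
    uint64_t htimedelta;      /* assumed: guest time = host time + htimedelta */
};

/* Convert a saved guest deadline into host time (wraparound ignored). */
static uint64_t guest_deadline_in_host_time(const struct vcpu_timer_ctx *v)
{
    return v->saved_vstimecmp - v->htimedelta;
}

/*
 * Value to program into stimecmp: the earliest of the hypervisor's own
 * deadline and every switched-out vCPU's deadline.
 */
static uint64_t next_stimecmp(uint64_t host_deadline,
                              const struct vcpu_timer_ctx *vcpus, size_t n)
{
    uint64_t earliest = host_deadline;

    for (size_t i = 0; i < n; i++) {
        uint64_t d = guest_deadline_in_host_time(&vcpus[i]);
        if (d < earliest)
            earliest = d;
    }
    return earliest;
}
```

When the resulting stimecmp interrupt fires, the hypervisor would check which saved deadline expired and, for a guest deadline, arrange for the VS-level timer interrupt to be seen by that vCPU the next time it is switched in.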

