Gernot wrote:
User-level drivers are a core property of (well-designed)
microkernels, and microkernels are pretty much the only choice for
safety- and security-critical systems, and the only kind of OS that
is feasible to prove correct (see seL4).
And if you run drivers in user mode, then being able to route
interrupts directly to the user-level handlers without invoking the
kernel would seem to pretty much eliminate the performance
disadvantage microkernels have compared to Linux. In fact, my
immediate reaction seeing this extension was “yeah!”.
Aside from the fact that doing this requires some version of the
quasi-deprecated N extension be implemented in addition to the
hypervisor extension, the main problem with this idea is how the
hardware decides to send memory access traps to U mode versus HS mode.
The delegation provided by sideleg is too crude to suffice for this
purpose. Instead, the choice must be encoded on a per-page basis
in the G-stage page tables---which is what Huawei's proposal does,
naturally enough.
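
To make the granularity difference concrete, here is a minimal sketch in C, not taken from any specification or from Huawei's proposal. It contrasts a per-cause delegation mask (the style of delegation a register like sideleg offers) with a per-page choice read from a G-stage page table entry. The names route_by_cause, route_by_page, and GPTE_TRAP_TO_U are hypothetical, and the bit position is arbitrary, chosen only for illustration. The two cause numbers are the guest-page-fault codes defined by the hypervisor extension.

```c
#include <stdint.h>

/* Guest-page-fault cause codes as defined by the hypervisor extension. */
enum {
    CAUSE_LOAD_GUEST_PAGE_FAULT  = 21,
    CAUSE_STORE_GUEST_PAGE_FAULT = 23
};

/* HYPOTHETICAL: a G-stage PTE bit meaning "send faults on this page to
 * U mode". The bit position is an assumption made for this sketch. */
#define GPTE_TRAP_TO_U (UINT64_C(1) << 8)

typedef enum { TRAP_TO_HS, TRAP_TO_U } trap_target;

/* Per-cause delegation: one bit per trap cause. Every trap with a
 * delegated cause goes to U mode, regardless of the address touched. */
static trap_target route_by_cause(uint64_t deleg_mask, unsigned cause)
{
    return ((deleg_mask >> cause) & 1) ? TRAP_TO_U : TRAP_TO_HS;
}

/* Per-page choice: the G-stage PTE covering the faulting guest physical
 * address selects the target, so different pages can go different ways. */
static trap_target route_by_page(uint64_t gpte)
{
    return (gpte & GPTE_TRAP_TO_U) ? TRAP_TO_U : TRAP_TO_HS;
}
```

With the per-cause mask, all load guest-page faults necessarily share one destination. With the per-page bit, a page backing a device emulated at user level can trap to U mode while every other page still traps to HS mode.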
But if our real goal is for virtual machines to run as fast as
possible, it seems to me the more important subgoal is to minimize
the number of times when memory accesses to a virtual device must be
trapped and emulated, by maximizing the opportunity for a guest OS
to directly control physical devices without emulation. The hardware
components needed for this include the new interrupt architecture that
is being developed, plus a sufficiently capable IOMMU, a proposal for
which is being drafted by a different informal group of interested
parties.
Even with this new hardware (in whatever form it actually becomes
standard for RISC-V), we can expect that some need to trap-and-emulate
for virtual devices will remain. But at the current time, I don't know
how we can predict very well the performance cost of those traps that
remain, or how much improvement we would get from adopting Huawei's
proposal. For all I know, it may be that, once we have the new
interrupt architecture and an IOMMU, the added improvement from
Huawei's proposal is barely noticeable.
Maybe it will be, and maybe it won't. All I'm suggesting is, it would
be better to evaluate that choice after these other essential pieces
are in place, and after all the relevant software has been completed
and optimized, as Anup Patel spoke of.
One counterargument might be to claim that there is sufficient
long-term market interest in supporting the hypervisor extension as
best as possible _without_ the new interrupt architecture and IOMMU.
I'll leave it to others to try to make that case.
- John Hauser