Re: A proposal to enhance RISC-V HPM (Hardware Performance Monitor)


Hi Anup,

> The most desired feature from a PMU is counting events in right-context (or right-mode). This is not clearly defined in RISC-V spec right now. Greg’s proposal already address this in a clean way by defining required bits in MHPMEVENT CSRs.  Other important feature expected from a PMU is reading counters without traps, this is already available because HPMCOUNTER CSRs are “User-Read-Only”

Claiming some features as "most desired" is too subjective.  I agree that mode-specific counting is important, but for performance monitoring, the HPM interrupt is also essential.  Otherwise, sampling like `perf record` just doesn't work.

> Regarding HPMCOUNTER writes from S-mode, the Linux PMU drivers (across architectures) only writes PMU counter at time of configuring the counter. We anyway have SBI call to configure a RISC-V PMU counter because MHPMEVENT CSR is for M-mode only so it is better to initialize/write the MHPMCOUNTER CSR in M-mode at time of configuring the counter. Allowing S-mode to write HPMCOUNTER CSR is good but won’t benefit much. On the contrary, RISC-V hypervisors might end up save/restore more CSRs if HPMCOUTNER CSR is writeable from S-mode.

You are understating the case when you say "only writes counter at configuration time."  It sounds like the kernel seldom writes them.  The fact is, the counters and other registers need writing every time the corresponding process is being context-switch in and every time an HPM interrupt is being handled.  Would you like to elaborate more, based on what do you say writing counters from S-mode is good? based on what do you judge it won't benefit much?  

However, since RISC-V doesn't have any discussed features yet, I doubt that anyone has any quantitative data.  My proposal here is that I will take our existing solution (AndeStar V5 extension and perf-event port on Linux 4.17) as the testbed.  By default, we have *mcounterwen*, which is effectively equal to the bit[28] in Greg's proposal, to enable S-mode writing HPM CSRs. This is the treatment group.  Then we do a patch to transform all existing csr_write's to HPM CSRs (including counters) into SBI calls as the control group.  My anticipation of the result is that the wall clock time performing a sampling in the treatment group will be not just marginally shorter than the control group.

Meanwhile, I agree with your concern about H extension.  That's why I emphasized this feature is useful for M-S-U configuration and questionable for M-H-S-U one.

> The code snippet mentioned below requires “#ifdef” which means we have to build Linux RISC-V image differently for doing CSR writes this way. This approach is not going to fly for distros because distros can’t release single Linux RISC-V image for all RISC-V hardware if we have such “#ifdef”.

Each distro maintains its own priority of hundreds of thousands of kernel features, not to mention many nameless "distributions" released by different teams as their BSPs do the same thing.  The diversity of features is the reason that so many distributions rise and fall, compete and cooperate.  Therefore, what we should debate is not what distros that support RISC-V should do with this possible divergence: I am totally fine that this CONFIG_RESTRICT_MREG_ACESS is off by default!  Big ones like Fedora and Debian aim at Desktop or Server, and that's good.  What we should really debate here is the feature itself, if it is useful enough for some, not all, possible RISC-V machines that help to make people's lives easier.  

For the record, distributions can just release a single image that disables this feature by default.  That's their choice because they expect quite a ratio of it will run as a guest OS, and it should not enable the feature or there will be a lot more work in the hypervisor.  The image can still run on any RISC-V machines that either or not support bit[28]/mcounterwen.  You are not making a valid point here.


Join { to automatically receive all group messages.