Re: A proposal to enhance RISC-V HPM (Hardware Performance Monitor)


Brian Grayson
 

Hi, Alan.

My proposal is still a work in progress, hence has not been shared publicly, but is significantly based on a proven architecture with about 30 years in the field and a few billion shipping cores, if not more -- the PowerPC performance monitor implementation. I did the in-house Linux kernel patches and tool support for it about two decades ago at Motorola :) so I used to know it quite well, and can see how a similar approach solves some of the current problems that we all have encountered with the current RISC-V approach. I am fairly new to the RISC-V ecosystem, so I was not aware of the work that you have done in the past; thanks for the pointer to that.

The SBI PMU extension is more about the API between perf (or another tool) and the M-mode software, that is, what the tool communicates and how the M-mode software interprets it, and not about actually changing the hardware interpretation of the mhpmevent bits. At least, that was my understanding.

I am glad that so many of us are converging on all the same fundamental needs!

Brian

On Mon, Jul 20, 2020 at 7:38 PM alankao <alankao@...> wrote:
Hi Brian,

I have been working on a similar proposal myself, with overflow, interrupts, masking, and delegation. One of the key differences in my proposal is that it unifies each counter's configuration control into a per-counter register, by using mhpmevent* but with some fields reserved/assigned a meaning.  <elaborating>
Thanks for sharing your experience and the elaboration. The overloading-hpmevent idea looks like the one Greg raised in the SBI PMU extension threads in the Unix Platform Spec TG. I have a bunch of questions. What became of your proposal? Was it discussed in public? Did you manage to implement your idea as a working HW/S-mode SW/U-mode SW solution? If so, we can compete with each other on real benchmarks: the LoC of the perf patch (assuming you do it on Linux) and the system overhead of a long perf sampling run.


Another potential discussion point is, does overflow happen at 0x7fffffffffffffff -> 0x8000000000000000, or at 0xffffffffffffffff -> 0x0000000000000000? I have a bias towards the former so that even after overflow, the count is wholly contained in an XLEN-wide register treated as an unsigned number and accessible via a single read, which makes arithmetic convenient, but I know some people prefer to (or are used to?) have the overflow bit as a 33rd or 65th bit in a different register.
I have no bias here as long as the HPM interrupt can be triggered. But it seems to me that you assume the HPM registers are XLEN bits wide, when actually they are not (yet?). The spec says they should be 64 bits wide, although evidently nobody implements or remembers that.
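
For readers following the overflow question, here is a minimal sketch in Python of the first convention (overflow at 0x7fffffffffffffff -> 0x8000000000000000). It models a 64-bit counter with masking. The function names (preload, tick, overflowed, events_since) are hypothetical and purely illustrative; they come from neither the RISC-V spec nor either proposal.

```python
MASK64 = (1 << 64) - 1   # counters are modeled as 64-bit unsigned values
MSB = 1 << 63            # overflow is signaled when this bit becomes set


def preload(period):
    """Initial counter value so that bit 63 becomes set after `period` events."""
    if not 0 < period <= MSB:
        raise ValueError("period must be in 1..2**63")
    return MSB - period


def tick(counter, events=1):
    """Advance the modeled counter, wrapping at 64 bits."""
    return (counter + events) & MASK64


def overflowed(counter):
    """Overflow condition under the 0x7fff... -> 0x8000... convention."""
    return bool(counter & MSB)


def events_since(start, counter):
    """Events counted since `start`, computed from a single counter read."""
    return (counter - start) & MASK64


# Hypothetical usage: sample every 1000 events.
start = preload(1000)
c = tick(start, 999)
assert not overflowed(c)            # one event short of the period
c = tick(c)
assert overflowed(c)                # bit 63 is now set
c = tick(c, 5)                      # events that arrive before the handler runs
assert events_since(start, c) == 1005
```

Under this convention the counter keeps counting after the overflow condition is raised, so events that arrive before the handler runs are still recovered with one read and one unsigned subtraction. Under the other convention (0xffffffffffffffff -> 0x0000000000000000), the same information needs a separate overflow bit.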

Lastly, a feature I have enjoyed using in the past (on another ISA) is the concept of a 'marked' bit in the mstatus register. ... This is of course a bit intrusive in the architecture, as it requires adding a bit to mstatus, but the rest of the kernel just needs to save and restore this bit on context switches, without knowing its purpose.
Which architecture/OS are you referring to here? 

Through this discussion, we will understand which idea the community prefers: adding CSRs, overloading the existing hpmevents, or some balanced compromise. I believe the ultimate goal of this thread should be determining what the RISC-V HPM should really be like.

Best,
Alan
