Given the discussions about cache-ops and the name for them on tech-cmo and the desire to avoid "co", "COF" (which can also mean "change of flow") may not be the best choice for the extension short-name. What about just "Sshpm", as this extension is what really allows the HPM to be well-utilized by tools like perf? Or is that too confusing since hpm already exists?
I like the concept of putting overflow and filtering control into the mhpmevent registers: a single write then completely configures a counter.
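To make the "single write" point concrete, here is a minimal sketch of composing one mhpmevent value that carries the event selector, the mode filters, and the overflow state together. The bit positions, the macro names, and the helper make_mhpmevent() are assumptions for illustration only and are not taken from the proposal text; the sketch also assumes that a clear OF bit leaves the overflow interrupt armed.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define HPMEVENT_OF    (UINT64_C(1) << 63)  /* overflow status / interrupt disable */
#define HPMEVENT_MINH  (UINT64_C(1) << 62)  /* inhibit counting in M-mode */
#define HPMEVENT_SINH  (UINT64_C(1) << 61)  /* inhibit counting in S-mode */
#define HPMEVENT_UINH  (UINT64_C(1) << 60)  /* inhibit counting in U-mode */
#define HPMEVENT_CTRL_MASK (UINT64_C(0xFF) << 56)  /* top byte reserved for control */

/* Build the value for one CSR write: event selector in the low bits,
 * filter bits as given, OF cleared to arm the overflow interrupt or
 * set to keep it quiet. */
static uint64_t make_mhpmevent(uint64_t event_id, uint64_t inhibit_bits,
                               int arm_overflow_irq)
{
    uint64_t v = event_id & ~HPMEVENT_CTRL_MASK;
    v |= inhibit_bits & (HPMEVENT_MINH | HPMEVENT_SINH | HPMEVENT_UINH);
    if (!arm_overflow_irq)
        v |= HPMEVENT_OF;
    return v;
}
```

A tool that wants to count a hypothetical event 0x10 in U-mode only, with the overflow interrupt armed, would then write make_mhpmevent(0x10, HPMEVENT_MINH | HPMEVENT_SINH, 1) to the event register in one step.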
Is there a reason there is no mcountovf? It would simplify the software for an M-mode tool, and for cores that don't have an S-mode.
How is overflow defined for an implementation that implements n bits in the counter registers, where 32 < n < 64? Although the registers are architecturally 64 bits, an implementation may not want to support all of them. Mandating full 64-bit counters may make an implementation area-prohibitive for the smallest of perfmon-enabled embedded cores. I think this could be specified like this: "An implementation may implement fewer than 64 bits for the hpmcounter CSRs. On such an implementation, software can query the bit width of the hpmcounter registers by taking advantage of the WARL behavior: writing all 1's and reading back to see which bits retained the set value. Also, on such implementations, overflow is defined to occur when the highest implemented bit transitions from 1 to 0." Given that, software can do the right thing regardless of implemented bit width.
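As a rough sketch of what software would do under that wording, the following models a counter with only the low `width` bits writable, probes the width via the write-all-1's readback, and checks for the 1-to-0 transition of the highest implemented bit. The sim_hpmcounter type and all function names are hypothetical; a real tool would replace sim_read()/sim_write() with CSR accesses. It also assumes the implemented bits are contiguous from bit 0, which the suggested text does not strictly require.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated WARL counter standing in for an hpmcounter CSR. */
typedef struct {
    uint64_t value;
    unsigned width;  /* number of implemented bits, 1..64 */
} sim_hpmcounter;

static uint64_t width_mask(unsigned width)
{
    assert(width >= 1 && width <= 64);
    return width == 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* WARL write: unimplemented upper bits silently drop the written value. */
static void sim_write(sim_hpmcounter *c, uint64_t v)
{
    c->value = v & width_mask(c->width);
}

static uint64_t sim_read(const sim_hpmcounter *c)
{
    return c->value;
}

/* Write all 1's, read back, count the bits that stuck, restore the old value. */
static unsigned probe_width(sim_hpmcounter *c)
{
    uint64_t saved = sim_read(c);
    sim_write(c, UINT64_MAX);
    uint64_t readback = sim_read(c);
    sim_write(c, saved);

    unsigned n = 0;
    while (readback & 1) {
        n++;
        readback >>= 1;
    }
    return n;
}

/* Overflow per the suggested wording: top implemented bit goes from 1 to 0. */
static int overflowed(uint64_t before, uint64_t after, unsigned width)
{
    assert(width >= 1 && width <= 64);
    uint64_t top = UINT64_C(1) << (width - 1);
    return (before & top) != 0 && (after & top) == 0;
}
```

With the width known, the same overflowed() check works unchanged for a 40-bit, 48-bit, or full 64-bit counter, which is the point of defining overflow relative to the highest implemented bit.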