Re: Proposal: SBI PMU Extension


Greg Favor
 

Anup,

Just to clarify a couple of things as to where I was coming from (which I think would mirror other people that also want to have more than just an event selector field in the hpmevent CSR's):

We want to avoid and plan on avoiding having any perf mon related custom (aka implementation-specific) CSR's.  The existing hpmeventX CSR's, especially in RV64, each provide a lot of space for putting whatever custom per-counter controls/etc. that one might want to have.  Hence no need to have separate custom CSR's.

Our thought would be for a 16-bit event_idx.code field to go into hpmeventX[15:0] and the 3-bit event_idx.type field to go into hpmeventX[18:16].  About as straightforward and simple as one could have.

Then hpmeventX[63:19] are free for whatever other custom stuff that an implementation might want to have.  In our case we prefer to keep everything within the low 32 CSR bits - which still leaves 13 bits for most all of what we want to have (and what others might want to have).

It sounds like you aren't trying to say anything in the API about where event_idx ends up within an hpmeventX CSR, but it would be good to keep the size of the event_idx value bounded - like you originally did (but to 19 bits instead of 15 bits).  And have event_info[44:0] (or just provide a full 64-bit event_info parameter).

Then event_idx and event_info can be concatenated - as {event_info[44:0], event_idx[18:0]} - to create the 64-bit value to write into an hpmeventX CSR.  (Or {event_info[12:0], event_idx[18:0]} - to create the 32-bit value for an RV32 hpmeventX CSR.)  Not that you would require that packing arrangement (although it might encourage a standard hpmeventX format for future implementations).  And this allows full flexibility for people to stick whatever implementation-specific stuff they want into the upper hpmeventX bits.
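
A minimal sketch of the packing described above, assuming a 19-bit event_idx (3-bit type above a 16-bit code) in the low bits and event_info in the remaining upper bits. The helper names are hypothetical and not part of the proposal; an implementation is free to use a different layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout from this mail: event_idx = {type[2:0], code[15:0]}, 19 bits. */
#define EVENT_IDX_BITS 19u
#define EVENT_IDX_MASK ((UINT64_C(1) << EVENT_IDX_BITS) - 1)

/* Hypothetical helper: build event_idx from a 3-bit type and a 16-bit code. */
static inline uint32_t make_event_idx(uint32_t type, uint32_t code)
{
    assert(type < 8u && code <= 0xFFFFu);
    return (type << 16) | code;
}

/* RV64: {event_info[44:0], event_idx[18:0]} */
static inline uint64_t pack_hpmevent_rv64(uint64_t event_info, uint32_t event_idx)
{
    uint64_t info = event_info & ((UINT64_C(1) << 45) - 1);
    return (info << EVENT_IDX_BITS) | (event_idx & EVENT_IDX_MASK);
}

/* RV32: {event_info[12:0], event_idx[18:0]} */
static inline uint32_t pack_hpmevent_rv32(uint32_t event_info, uint32_t event_idx)
{
    uint32_t info = event_info & 0x1FFFu;
    return (info << EVENT_IDX_BITS) | (event_idx & (uint32_t)EVENT_IDX_MASK);
}
```

Bits of event_info beyond the available width are silently dropped here; a real SBI implementation might prefer to reject such values.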

Keeping event_idx to 16 bits would also contain the max size of the list/array that  SBI_PMU_COUNTER_DESCRIBE would use.  Or you can go back to bitmaps for each event_idx.type.  type==2 would require up to 4096 bytes.  type==0 and type==1 would only require small bitmaps.

Greg

On Wed, Jul 1, 2020 at 9:54 PM Anup Patel <Anup.Patel@...> wrote:

Hi Greg,

 

Thanks for the feedback and requests.

 

I had kept event_idx to just 15 bits so that all possible events can be represented by a bitmap of 2048 bytes. I certainly see the advantage of allowing bigger values for event_idx. I will update the SBI_PMU_COUNTER_DESCRIBE call to use a list/array instead of a bitmap, which will allow us to have a 64-bit event_idx.

 

Regarding the event_info suggestion, I will extend SBI_PMU_COUNTER_START call to have one more parameter for event_info. Previously, I did not consider RISC-V implementation having HW filtering of events but this is a good feature to have. Thanks for suggesting.

 

Also, over here we are only proposing SBI calls which M-mode software (OpenSBI) and Hypervisors (KVM/Xvisor) can implement. Your RISC-V implementation can have implementation specific CSRs (hpmevent or hpmfilter, etc) for configuring HPMCOUNTER.

 

Regards,

Anup

 

From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Greg Favor
Sent: 02 July 2020 00:04
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

Anup,

 

This is great to see - as part of standardizing how RISC-V HPM counters are configured and controlled by software.

 

I have a modest but important request:  Increase the size of the event_idx 'code' field from event_idx[11:0] to event_idx[15:0].  This is for two reasons:

 

- As with the size of the event_idx 'type' field, this allows a good amount of space for future growth, especially as more architecture extensions come along and motivate having additional events (starting with the vector, hypervisor, and bitmanip extensions).

 

- This allows space and flexibility for things like having "structured" events - meaning events with event-specific filter bits.  This would be applicable when event_idx.type==2 (aka RAW type events).  In our implementation, for example, part of the 'code' field would specify a particular type of event and another part of the 'code' field would specify filter bits to provide the flexibility in only counting selected sub-categories of that type of event.

 

Secondly, this proposal seems to only provide event_idx as information to be written into an hpmevent CSR (in the case of hardware counters)?  It would be desirable to have another parameter (e.g. event_info) that can be passed through this API to the eventual hpmevent CSR write.  One could imagine event_idx and event_info being concatenated to create what is written into a 32-bit or 64-bit hpmeventX CSR.

 

For example, in RV64, this could result in writing the 64-bit value {event_info[43:0], event_idx[19:0]} into hpmeventX  (assuming the above increase in event_idx size).  This provides a standard way for software to configure an entire hpmevent CSR.

 

Greg

 

On Wed, Jul 1, 2020 at 5:26 AM Anup Patel <anup.patel@...> wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have HW performance
counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V
CPU can allow monitoring HW events using a few HPMCOUNTER CSRs. The M-mode
software can also inhibit unused performance counters to save energy.

In addition to HW performance counters, an SBI implementation (e.g. OpenSBI,
Xvisor, KVM, etc) can provide software counters for interesting events
such as number of RFENCEs, number of IPIs, number of misaligned load/store
instructions, number of illegal instructions, etc.

We propose an SBI PMU extension which tries to cover CYCLE CSR, INSTRET CSR,
HPMCOUNTER CSRs and software counters of the SBI implementation.

To define the SBI PMU extension, we first define counter_idx, which is a unique
number assigned to a counter, and event_idx, which is an encoded number
representing the event to be monitored.

The SBI PMU event_idx is a 15-bit number encoded as follows:
event_idx[14:12] = type
event_idx[11:0] = code
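
As a rough illustration of this encoding, the helpers below build and split a 15-bit event_idx. The macro and function names are hypothetical; only the bit positions come from the definition above.

```c
#include <assert.h>
#include <stdint.h>

#define SBI_PMU_EVENT_CODE_BITS 12u
#define SBI_PMU_EVENT_CODE_MASK 0xFFFu /* event_idx[11:0]  */
#define SBI_PMU_EVENT_TYPE_MASK 0x7u   /* event_idx[14:12] */

/* Combine a 3-bit type and a 12-bit code into event_idx. */
static inline uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= SBI_PMU_EVENT_TYPE_MASK && code <= SBI_PMU_EVENT_CODE_MASK);
    return (type << SBI_PMU_EVENT_CODE_BITS) | code;
}

/* Extract event_idx.type. */
static inline uint32_t sbi_pmu_event_type(uint32_t event_idx)
{
    return (event_idx >> SBI_PMU_EVENT_CODE_BITS) & SBI_PMU_EVENT_TYPE_MASK;
}

/* Extract event_idx.code. */
static inline uint32_t sbi_pmu_event_code(uint32_t event_idx)
{
    return event_idx & SBI_PMU_EVENT_CODE_MASK;
}
```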

If event_idx.type == 0 then it is a HARDWARE event and event_idx.code can
be one of the following:
enum sbi_pmu_hw_id {
    /*
     * Common hardware events, generalized by the kernel:
     */
    PERF_COUNT_HW_CPU_CYCLES              = 0,
    PERF_COUNT_HW_INSTRUCTIONS            = 1,
    PERF_COUNT_HW_CACHE_REFERENCES        = 2,
    PERF_COUNT_HW_CACHE_MISSES            = 3,
    PERF_COUNT_HW_BRANCH_INSTRUCTIONS     = 4,
    PERF_COUNT_HW_BRANCH_MISSES           = 5,
    PERF_COUNT_HW_BUS_CYCLES              = 6,
    PERF_COUNT_HW_STALLED_CYCLES_FRONTEND = 7,
    PERF_COUNT_HW_STALLED_CYCLES_BACKEND  = 8,
    PERF_COUNT_HW_REF_CPU_CYCLES          = 9,

    PERF_COUNT_HW_MAX,                    /* non-ABI */
};
(NOTE: Same as described in <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 1 then it is a CACHE event and event_idx.code is encoded
as follows:
event_idx.code[11:4] = cache_id
event_idx.code[3:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    PERF_COUNT_HW_CACHE_L1D  = 0,
    PERF_COUNT_HW_CACHE_L1I  = 1,
    PERF_COUNT_HW_CACHE_LL   = 2,
    PERF_COUNT_HW_CACHE_DTLB = 3,
    PERF_COUNT_HW_CACHE_ITLB = 4,
    PERF_COUNT_HW_CACHE_BPU  = 5,
    PERF_COUNT_HW_CACHE_NODE = 6,

    PERF_COUNT_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    PERF_COUNT_HW_CACHE_OP_READ     = 0,
    PERF_COUNT_HW_CACHE_OP_WRITE    = 1,
    PERF_COUNT_HW_CACHE_OP_PREFETCH = 2,

    PERF_COUNT_HW_CACHE_OP_MAX,     /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0,
    PERF_COUNT_HW_CACHE_RESULT_MISS   = 1,

    PERF_COUNT_HW_CACHE_RESULT_MAX,   /* non-ABI */
};
(NOTE: Same as described in <linux_source>/include/uapi/linux/perf_event.h)
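
The CACHE code layout above can be sketched as follows, assuming the 15-bit event_idx format with type in bits [14:12]. The function names are hypothetical; the numeric IDs in the usage note come from the enums above.

```c
#include <assert.h>
#include <stdint.h>

#define SBI_PMU_EVENT_TYPE_CACHE 1u

/* code[11:4] = cache_id, code[3:1] = op_id, code[0] = result_id */
static inline uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    assert(cache_id <= 0xFFu && op_id <= 0x7u && result_id <= 0x1u);
    return (cache_id << 4) | (op_id << 1) | result_id;
}

/* Full event_idx for a CACHE event: type 1 in bits [14:12]. */
static inline uint32_t sbi_pmu_cache_event_idx(uint32_t cache_id, uint32_t op_id,
                                               uint32_t result_id)
{
    return (SBI_PMU_EVENT_TYPE_CACHE << 12) |
           sbi_pmu_cache_code(cache_id, op_id, result_id);
}
```

For instance, an L1D (0) write (1) miss (1) has code 0x003 and event_idx 0x1003, and a DTLB (3) read (0) miss (1) has code 0x031 and event_idx 0x1031.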

If event_idx.type == 2 then it is a RAW event and event_idx.code is just
a RAW event number.

In the future, more event_idx values can be defined without breaking ABI compatibility
of SBI calls.

Based on the above definitions of counter_idx and event_idx, we can potentially have
the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_DESCRIBE
   This call takes two parameters: 1) counter_idx 2) physical address of 4k page
   It will write the description of SBI PMU counter at specified physical
   address. The details of the SBI PMU counter written at specified physical
   address are as follows:
   1. Name (64 bytes)
   2. CSR_Offset (4 bytes)
      (E.g. CSR_Offset == 0x2 implies CSR 0xC02)
      (E.g. CSR_Offset == 0xffffffff means it is SBI implementation counter)
   3. CSR_Width (4 bytes)
      (Number of CSR bits implemented in HW)
   4. Event bitmap (2048 bytes) (i.e. 1-bit for each possible event_idx)
      (If the bit corresponding to an event_idx is 1 then event_idx is supported
       by the counter)
   5. Anything else ??
3. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters: 1) counter_idx 2) physical address
   It will set the physical address where SBI implementation will write
   the software counter. This SBI call is only for counters not mapped
   to any CSR (i.e. only for counters with CSR_Offset == 0xffffffff).
4. SBI_PMU_COUNTER_START
   This call takes two parameters: 1) counter_idx 2) event_idx
   It will inform SBI implementation to configure and start/enable specified
   counter on the calling HART to monitor specific event. This SBI call will
   fail for counters which are not present.
5. SBI_PMU_COUNTER_STOP
   This call takes one parameter: 1) counter_idx
   It will inform SBI implementation to stop/disable specified counters on the
   calling HART. This SBI call will fail for counters which are not present.
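
One possible in-memory view of the description written by SBI_PMU_COUNTER_DESCRIBE is sketched below. The struct and helper names are hypothetical, and two details are assumptions not stated above: the fields are packed back to back in the listed order, and bit N of the bitmap lives in byte N/8 at bit position N%8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SBI_PMU_CSR_OFFSET_NONE 0xFFFFFFFFu /* SBI implementation counter */
#define SBI_PMU_CSR_BASE        0xC00u      /* CSR_Offset 0x2 -> CSR 0xC02 */
#define SBI_PMU_BITMAP_BYTES    2048u       /* 16384 bits: event_idx 0..16383 */

struct sbi_pmu_counter_desc {
    char     name[64];
    uint32_t csr_offset;
    uint32_t csr_width;                     /* CSR bits implemented in HW */
    uint8_t  event_bitmap[SBI_PMU_BITMAP_BYTES];
};

/* True if the counter is kept by the SBI implementation, not a CSR. */
static inline bool desc_is_sbi_counter(const struct sbi_pmu_counter_desc *d)
{
    return d->csr_offset == SBI_PMU_CSR_OFFSET_NONE;
}

/* CSR number of a hardware counter. */
static inline uint32_t desc_csr_number(const struct sbi_pmu_counter_desc *d)
{
    assert(!desc_is_sbi_counter(d));
    return SBI_PMU_CSR_BASE + d->csr_offset;
}

/* True if the counter can monitor event_idx (assumed LSB-first bit order). */
static inline bool desc_supports_event(const struct sbi_pmu_counter_desc *d,
                                       uint32_t event_idx)
{
    assert(event_idx < SBI_PMU_BITMAP_BYTES * 8u);
    return (d->event_bitmap[event_idx / 8u] >> (event_idx % 8u)) & 1u;
}
```

With this layout the description occupies 2120 bytes, so it fits comfortably in the 4k page passed to the call.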

From the above, the RISC-V PMU driver will use most of the SBI calls at boot time.
Only SBI_PMU_COUNTER_START needs to be called once before using a counter. A
counter is read by reading its CSR (for CSR_Offset != 0xffffffff) or by reading
its memory location (for CSR_Offset == 0xffffffff). Counter overflow handling
will have to be done in software by the Linux kernel.
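
One common way for software to cope with a counter narrower than 64 bits is to accumulate deltas between successive raw reads, masked to CSR_Width. The sketch below shows only that arithmetic; the function names are hypothetical, the proposal does not prescribe this method, and it assumes the counter wraps at most once between two reads.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the implemented counter bits (CSR_Width from the description). */
static inline uint64_t counter_mask(uint32_t csr_width)
{
    assert(csr_width >= 1u && csr_width <= 64u);
    return (csr_width == 64u) ? UINT64_MAX : ((UINT64_C(1) << csr_width) - 1);
}

/* Events counted between two raw reads, tolerating a single wraparound. */
static inline uint64_t counter_delta(uint64_t prev, uint64_t now,
                                     uint32_t csr_width)
{
    return (now - prev) & counter_mask(csr_width);
}
```

A driver would add each delta to a 64-bit software total, reading often enough that a second wrap cannot occur between reads.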

The information returned by SBI_PMU_NUM_COUNTERS and SBI_PMU_COUNTER_DESCRIBE
can be passed via DT/ACPI but it will be difficult to maintain because we
have hardware counters and SBI implementation counters both provided by SBI
PMU extension. The SBI implementation counters are specific to underlying
SBI implementation so we will have to keep counters/events described in
DT/ACPI in-sync with underlying SBI implementation.

Regards,
Anup

