Re: Proposal: SBI PMU Extension

Anup Patel
 

Hi Greg,

 

Thanks for the feedback and requests.

 

I had kept event_idx to just 15 bits so that all possible events can be represented by a bitmap of 2048 bytes. I certainly see the advantage of allowing bigger values for event_idx. I will update the SBI_PMU_COUNTER_DESCRIBE call to use a list/array instead of a bitmap, which will allow us to have a 64-bit event_idx.

 

Regarding the event_info suggestion, I will extend the SBI_PMU_COUNTER_START call to have one more parameter for event_info. Previously, I did not consider RISC-V implementations having HW filtering of events, but this is a good feature to have. Thanks for the suggestion.

 

Also, here we are only proposing SBI calls which M-mode software (OpenSBI) and hypervisors (KVM/Xvisor) can implement. Your RISC-V implementation can have implementation-specific CSRs (hpmevent or hpmfilter, etc.) for configuring HPMCOUNTER.

 

Regards,

Anup

 

From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Greg Favor
Sent: 02 July 2020 00:04
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

Anup,

 

This is great to see - as part of standardizing how RISC-V HPM counters are configured and controlled by software.

 

I have a modest but important request:  Increase the size of the event_idx 'code' field from event_idx[11:0] to event_idx[15:0].  This is for two reasons:

 

- As with the size of the event_idx 'type' field, this allows a good amount of space for future growth, especially as more architecture extensions come along and motivate having additional events (starting with the vector, hypervisor, and bitmanip extensions).

 

- This allows space and flexibility for things like having "structured" events - meaning events with event-specific filter bits.  This would be applicable when event_idx.type==2 (aka RAW type events).  In our implementation, for example, part of the 'code' field would specify a particular type of event and another part of the 'code' field would specify filter bits to provide the flexibility in only counting selected sub-categories of that type of event.
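
As a rough illustration of such a "structured" RAW event, the sketch below splits a 16-bit code into an event class and filter bits. The 8/8 split and all names are hypothetical; the thread does not specify how an implementation would divide the field.

```c
#include <stdint.h>

/* Hypothetical layout (not from the thread):
 * code[15:8] = event class, code[7:0] = filter bits. */
#define RAW_CLASS_SHIFT 8u
#define RAW_FILTER_MASK 0xFFu

/* Build a structured RAW code from an event class and its filter bits. */
static uint16_t raw_code_make(uint8_t event_class, uint8_t filter_bits)
{
    return (uint16_t)(((uint16_t)event_class << RAW_CLASS_SHIFT) | filter_bits);
}

/* Extract the event class from a structured RAW code. */
static uint8_t raw_code_class(uint16_t code)
{
    return (uint8_t)(code >> RAW_CLASS_SHIFT);
}

/* Extract the filter bits from a structured RAW code. */
static uint8_t raw_code_filter(uint16_t code)
{
    return (uint8_t)(code & RAW_FILTER_MASK);
}
```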

 

Secondly, this proposal seems to only provide event_idx as information to be written into an hpmevent CSR (in the case of hardware counters)?  It would be desirable to have another parameter (e.g. event_info) that can be passed through this API to the eventual hpmevent CSR write.  One could imagine event_idx and event_info being concatenated to create what is written into a 32-bit or 64-bit hpmeventX CSR.

 

For example, in RV64, this could result in writing the 64-bit value {event_info[43:0], event_idx[19:0]} into hpmeventX  (assuming the above increase in event_idx size).  This provides a standard way for software to configure an entire hpmevent CSR.

 

Greg

 

On Wed, Jul 1, 2020 at 5:26 AM Anup Patel <anup.patel@...> wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension, but we do have HW performance
counters such as the CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V
CPU can allow monitoring of HW events using a few HPMCOUNTER CSRs. M-mode
software can also inhibit unused performance counters to save energy.

In addition to HW performance counters, an SBI implementation (e.g. OpenSBI,
Xvisor, KVM, etc.) can provide software counters for interesting events
such as number of RFENCEs, number of IPIs, number of misaligned load/store
instructions, number of illegal instructions, etc.

We propose an SBI PMU extension which tries to cover the CYCLE CSR, INSTRET CSR,
HPMCOUNTER CSRs, and software counters of the SBI implementation.

To define the SBI PMU extension, we first define counter_idx, which is a unique
number assigned to a counter, and event_idx, which is an encoded number
representing the event to be monitored.

The SBI PMU event_idx is a 15-bit number encoded as follows:
event_idx[14:12] = type
event_idx[11:0] = code
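
The 15-bit encoding above can be sketched with a few helpers. The function and macro names are illustrative assumptions, not part of the proposal.

```c
#include <stdint.h>

/* Field layout from the proposal: event_idx[14:12] = type, event_idx[11:0] = code. */
#define SBI_PMU_EVENT_CODE_BITS 12u
#define SBI_PMU_EVENT_CODE_MASK 0xFFFu
#define SBI_PMU_EVENT_TYPE_MASK 0x7u

/* Pack a type and a code into a 15-bit event_idx (hypothetical helper). */
static uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    return ((type & SBI_PMU_EVENT_TYPE_MASK) << SBI_PMU_EVENT_CODE_BITS) |
           (code & SBI_PMU_EVENT_CODE_MASK);
}

/* Extract the 3-bit type field. */
static uint32_t sbi_pmu_event_type(uint32_t event_idx)
{
    return (event_idx >> SBI_PMU_EVENT_CODE_BITS) & SBI_PMU_EVENT_TYPE_MASK;
}

/* Extract the 12-bit code field. */
static uint32_t sbi_pmu_event_code(uint32_t event_idx)
{
    return event_idx & SBI_PMU_EVENT_CODE_MASK;
}
```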

If event_idx.type == 0 then it is a HARDWARE event and event_idx.code can
be one of the following:
enum sbi_pmu_hw_id {
    /*
     * Common hardware events, generalized by the kernel:
     */
    PERF_COUNT_HW_CPU_CYCLES              = 0,
    PERF_COUNT_HW_INSTRUCTIONS            = 1,
    PERF_COUNT_HW_CACHE_REFERENCES        = 2,
    PERF_COUNT_HW_CACHE_MISSES            = 3,
    PERF_COUNT_HW_BRANCH_INSTRUCTIONS     = 4,
    PERF_COUNT_HW_BRANCH_MISSES           = 5,
    PERF_COUNT_HW_BUS_CYCLES              = 6,
    PERF_COUNT_HW_STALLED_CYCLES_FRONTEND = 7,
    PERF_COUNT_HW_STALLED_CYCLES_BACKEND  = 8,
    PERF_COUNT_HW_REF_CPU_CYCLES          = 9,

    PERF_COUNT_HW_MAX,                    /* non-ABI */
};
(NOTE: Same as described in <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 1 then it is a CACHE event and event_idx.code is encoded
as follows:
event_idx.code[11:4] = cache_id
event_idx.code[3:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    PERF_COUNT_HW_CACHE_L1D  = 0,
    PERF_COUNT_HW_CACHE_L1I  = 1,
    PERF_COUNT_HW_CACHE_LL   = 2,
    PERF_COUNT_HW_CACHE_DTLB = 3,
    PERF_COUNT_HW_CACHE_ITLB = 4,
    PERF_COUNT_HW_CACHE_BPU  = 5,
    PERF_COUNT_HW_CACHE_NODE = 6,

    PERF_COUNT_HW_CACHE_MAX,              /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    PERF_COUNT_HW_CACHE_OP_READ     = 0,
    PERF_COUNT_HW_CACHE_OP_WRITE    = 1,
    PERF_COUNT_HW_CACHE_OP_PREFETCH = 2,

    PERF_COUNT_HW_CACHE_OP_MAX,           /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0,
    PERF_COUNT_HW_CACHE_RESULT_MISS   = 1,

    PERF_COUNT_HW_CACHE_RESULT_MAX,       /* non-ABI */
};
(NOTE: Same as described in <linux_source>/include/uapi/linux/perf_event.h)
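
A minimal sketch of packing the CACHE event code from its three fields, using the bit positions given above. The helper name is an assumption made for illustration.

```c
#include <stdint.h>

/* Pack code[11:4] = cache_id, code[3:1] = op_id, code[0] = result_id.
 * Hypothetical helper; out-of-range inputs are masked to the field width. */
static uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                   uint32_t result_id)
{
    return ((cache_id & 0xFFu) << 4) | ((op_id & 0x7u) << 1) |
           (result_id & 0x1u);
}
```

For instance, a DTLB (3) write (1) miss (1) packs to code 0x33, which becomes event_idx 0x1033 once type 1 is placed in bits [14:12].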

If event_idx.type == 2 then it is a RAW event and event_idx.code is just
a RAW event number.

In the future, more event_idx types can be defined without breaking ABI
compatibility of the SBI calls.

Based on the above definitions of counter_idx and event_idx, we can potentially
have the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_DESCRIBE
   This call takes two parameters: 1) counter_idx 2) physical address of 4k page
   It will write the description of SBI PMU counter at specified physical
   address. The details of the SBI PMU counter written at specified physical
   address are as follows:
   1. Name (64 bytes)
   2. CSR_Offset (4 bytes)
      (E.g. CSR_Offset == 0x2 implies CSR 0xC02)
      (E.g. CSR_Offset == 0xffffffff means it is an SBI implementation counter)
   3. CSR_Width (4 bytes)
      (Number of CSR bits implemented in HW)
   4. Event bitmap (2048 bytes) (i.e. 1-bit for each possible event_idx)
      (If the bit corresponding to an event_idx is 1 then that event_idx is
       supported by the counter)
   5. Anything else ??
3. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters: 1) counter_idx 2) physical address
   It will set the physical address where SBI implementation will write
   the software counter. This SBI call is only for counters not mapped
   to any CSR (i.e. only for counters with CSR_Offset == 0xffffffff).
4. SBI_PMU_COUNTER_START
   This call takes two parameters: 1) counter_idx 2) event_idx
   It will inform SBI implementation to configure and start/enable specified
   counter on the calling HART to monitor specific event. This SBI call will
   fail for counters which are not present.
5. SBI_PMU_COUNTER_STOP
   This call takes one parameter: 1) counter_idx
   It will inform SBI implementation to stop/disable specified counters on the
   calling HART. This SBI call will fail for counters which are not present.
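
The counter description written by SBI_PMU_COUNTER_DESCRIBE could be laid out roughly as below. The struct, field, and helper names are assumptions; only the field sizes, the 0xffffffff marker, and the 0xC00 CSR base (from "CSR_Offset == 0x2 implies CSR 0xC02") come from the text above.

```c
#include <stdint.h>

/* Marks a counter that is not backed by any CSR (SBI implementation counter). */
#define SBI_PMU_CSR_OFFSET_NONE 0xffffffffu

/* Hypothetical in-memory layout of one counter description. */
struct sbi_pmu_counter_desc {
    char     name[64];            /* 1. Name (64 bytes) */
    uint32_t csr_offset;          /* 2. CSR_Offset (4 bytes) */
    uint32_t csr_width;           /* 3. CSR_Width (4 bytes) */
    uint8_t  event_bitmap[2048];  /* 4. Event bitmap (2048 bytes) */
};

/* True when the counter is provided by the SBI implementation in software. */
static int sbi_pmu_counter_is_software(const struct sbi_pmu_counter_desc *d)
{
    return d->csr_offset == SBI_PMU_CSR_OFFSET_NONE;
}

/* CSR number for a hardware counter; only meaningful when not software. */
static uint32_t sbi_pmu_counter_csr(const struct sbi_pmu_counter_desc *d)
{
    return 0xC00u + d->csr_offset;
}
```

With these field sizes the description occupies 2120 bytes, so it fits in the 4k page passed to the call.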

Given the above, the RISC-V PMU driver will use most of the SBI calls at boot
time. Only SBI_PMU_COUNTER_START needs to be called once before a counter is
used. A counter is read either by reading its CSR (for CSR_Offset != 0xffffffff)
or by reading a memory location (for CSR_Offset == 0xffffffff). Counter overflow
handling will have to be done in software by the Linux kernel.
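
One common way to handle overflow in software is to accumulate deltas between successive raw reads, masked to the implemented counter width. The sketch below assumes the counter wraps at most once between reads; the function name is hypothetical and this is not taken from any existing driver.

```c
#include <stdint.h>

/* Events counted between two raw reads of a counter that implements
 * csr_width bits, assuming at most one wrap-around in between. */
static uint64_t pmu_counter_delta(uint64_t prev, uint64_t now,
                                  unsigned csr_width)
{
    uint64_t mask = (csr_width >= 64) ? UINT64_MAX
                                      : ((UINT64_C(1) << csr_width) - 1);

    /* Unsigned subtraction wraps modulo 2^64; masking reduces it to the
     * counter's own modulus. */
    return (now - prev) & mask;
}
```

A driver would add each delta to a 64-bit software total, reading often enough that the single-wrap assumption holds.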

The information returned by SBI_PMU_NUM_COUNTERS and SBI_PMU_COUNTER_DESCRIBE
can be passed via DT/ACPI but it will be difficult to maintain because we
have hardware counters and SBI implementation counters both provided by SBI
PMU extension. The SBI implementation counters are specific to underlying
SBI implementation so we will have to keep counters/events described in
DT/ACPI in-sync with underlying SBI implementation.

Regards,
Anup



Re: Proposal: SBI PMU Extension

Zong Li
 

On Thu, Jul 2, 2020 at 12:38 PM Anup Patel <anup.patel@...> wrote:



-----Original Message-----
From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Zong Li
Sent: 01 July 2020 20:32
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

On Wed, Jul 1, 2020 at 8:26 PM Anup Patel <anup.patel@...> wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have HW
performance counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER
CSRs. A RISC-V CPU can allow monitoring HW events using few HPMCOUNTER
CSRs. The M-mode software can also inhibit unused performance counters
to save energy.

Do we need SBI calls to set mcounteren and mcountinhibit (an optional
CSR)?
Or should these two CSRs not be changed at runtime from S-mode?
The SBI_PMU_COUNTER_START call will set/clear appropriate bits in
MCOUNTEREN and MCOUNTINHIBIT CSRs. The SBI_PMU_COUNTER_STOP
will do the reverse for MCOUNTEREN and MCOUNTINHIBIT CSRs.

It is also possible that a RISC-V implementation has only a few HPMCOUNTER
CSRs but a lot of HW events to be monitored. In this case, the RISC-V
implementation will have an implementation-specific CSR to select a
particular HW event to be monitored in HPMCOUNTER. The SBI implementation
(i.e. OpenSBI) will provide optional platform hooks which will be
called for SBI_PMU_COUNTER_START and SBI_PMU_COUNTER_STOP calls.
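
Optional platform hooks of this kind are often expressed as a table of function pointers that may be left NULL. The sketch below is only an illustration of that pattern; the struct and function names are invented here and are not OpenSBI's actual API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook table; either pointer may be NULL if the platform
 * needs no extra work for that call. */
struct pmu_platform_hooks {
    int (*counter_start)(uint32_t counter_idx, uint64_t event_idx);
    int (*counter_stop)(uint32_t counter_idx);
};

/* Invoke the optional start hook; a missing hook counts as success. */
static int pmu_platform_start(const struct pmu_platform_hooks *hooks,
                              uint32_t counter_idx, uint64_t event_idx)
{
    if (hooks == NULL || hooks->counter_start == NULL)
        return 0;
    return hooks->counter_start(counter_idx, event_idx);
}

/* Example hook for an imaginary platform: it just remembers the last
 * event selected instead of writing an event-select CSR. */
static uint64_t demo_selected_event;

static int demo_counter_start(uint32_t counter_idx, uint64_t event_idx)
{
    (void)counter_idx;
    demo_selected_event = event_idx;
    return 0;
}
```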



In addition to HW performance counters, a SBI implementation (e.g.
OpenSBI, Xvisor, KVM, etc) can provide software counters for
interesting events such as number of RFENCEs, number of IPIs, number
of misaligned load/store instructions, number of illegal instructions, etc.
I'm not sure whether I have misunderstood the usage of software counters, but I
don't see event_idx values for these software counter events. Maybe we could
define event_idx values for them in this proposal. For example, if
event_idx.type == 3, then it is a SOFTWARE event, and event_idx.code is used
to specify the monitored event, such as number of RFENCEs, number of IPIs,
and so on.
My bad, I forgot to define event_idx for SBI implementation events. I will
update this in the next version. Thanks for catching it.
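
A sketch of what the suggested SOFTWARE event type might look like. The type value 3 is the one suggested above; the enum name and the individual code values are purely hypothetical, since the thread leaves them to the next version of the proposal.

```c
#include <stdint.h>

/* Type value suggested in the discussion above; not final. */
#define SBI_PMU_EVENT_TYPE_SOFTWARE 3u

/* Hypothetical codes for the software events named in the proposal. */
enum sbi_pmu_sw_id {
    SBI_PMU_SW_RFENCE          = 0,
    SBI_PMU_SW_IPI             = 1,
    SBI_PMU_SW_MISALIGNED_LDST = 2,
    SBI_PMU_SW_ILLEGAL_INSN    = 3,
};

/* Build the 15-bit event_idx for a software event (type in bits [14:12]). */
static uint32_t sbi_pmu_sw_event_idx(enum sbi_pmu_sw_id code)
{
    return (SBI_PMU_EVENT_TYPE_SOFTWARE << 12) | ((uint32_t)code & 0xFFFu);
}
```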



We propose SBI PMU extension which tries to cover CYCLE CSR, INSTRET
CSR, HPMCOUNTER CSRs and software counters of the SBI
implementation.

To define SBI PMU extension, we first define counter_idx which is a
unique number assigned to a counter and event_idx which is an encoded
number representing event to be monitored.

The SBI PMU event_idx is 15bit number encoded as follows:
event_idx[14:12] = type
event_idx[11:0] = code

If event_idx.type == 0 then it is HARDWARE event and event_idx.code
can be one of the following:
enum sbi_pmu_hw_id {
/*
* Common hardware events, generalized by the kernel:
*/
PERF_COUNT_HW_CPU_CYCLES = 0,
PERF_COUNT_HW_INSTRUCTIONS = 1,
PERF_COUNT_HW_CACHE_REFERENCES = 2,
PERF_COUNT_HW_CACHE_MISSES = 3,
PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4,
PERF_COUNT_HW_BRANCH_MISSES = 5,
PERF_COUNT_HW_BUS_CYCLES = 6,
PERF_COUNT_HW_STALLED_CYCLES_FRONTEND = 7,
PERF_COUNT_HW_STALLED_CYCLES_BACKEND = 8,
PERF_COUNT_HW_REF_CPU_CYCLES = 9,

PERF_COUNT_HW_MAX, /* non-ABI */
};
(NOTE: Same as described in
<linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 1 then it is CACHE event and event_idx.code is
encoded as follows:
event_idx.code[11:4] = cache_id
event_idx.code[3:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
PERF_COUNT_HW_CACHE_L1D = 0,
PERF_COUNT_HW_CACHE_L1I = 1,
PERF_COUNT_HW_CACHE_LL = 2,
PERF_COUNT_HW_CACHE_DTLB = 3,
PERF_COUNT_HW_CACHE_ITLB = 4,
PERF_COUNT_HW_CACHE_BPU = 5,
PERF_COUNT_HW_CACHE_NODE = 6,

PERF_COUNT_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
PERF_COUNT_HW_CACHE_OP_READ = 0,
PERF_COUNT_HW_CACHE_OP_WRITE = 1,
PERF_COUNT_HW_CACHE_OP_PREFETCH = 2,

PERF_COUNT_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0,
PERF_COUNT_HW_CACHE_RESULT_MISS = 1,

PERF_COUNT_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as described in
<linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 2 then it is RAW event and event_idx.code is just
a RAW event number.

In future, more event_idx can be defined without breaking ABI
compatibility of SBI calls.

Based on above definition of counter_idx definition, we can
potentially have the following SBI calls:
I don't see the definition of counter_idx, but I remember that you had talked
about that in other places as follows:
1. counter_idx = 0 to 2 are for CYCLE, TIME, and INSTRET
2. counter_idx = 3 to 31 are for HPMCOUNTERs
3. counter_idx >= 32 are for software counters
Initially, I had tied counter_idx to CSR numbers, but then the TIME CSR would
need to be handled as a special case.

It's better to treat counter_idx as a logical index of the available counters.
This also helps reduce SBI calls.
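
For comparison, the earlier CSR-tied numbering that Zong recalls could be sketched as below. This only illustrates that fixed scheme (which the reply above moves away from in favor of a logical index); the enum and function names are assumptions.

```c
#include <stdint.h>

enum pmu_counter_kind {
    PMU_COUNTER_FIXED,     /* CYCLE, TIME, INSTRET */
    PMU_COUNTER_HPM,       /* HPMCOUNTER3..HPMCOUNTER31 */
    PMU_COUNTER_SOFTWARE,  /* provided by the SBI implementation */
};

/* Classify a counter_idx under the fixed 0-2 / 3-31 / >=32 numbering. */
static enum pmu_counter_kind pmu_counter_kind_of(uint32_t counter_idx)
{
    if (counter_idx <= 2)
        return PMU_COUNTER_FIXED;
    if (counter_idx <= 31)
        return PMU_COUNTER_HPM;
    return PMU_COUNTER_SOFTWARE;
}
```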



1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_DESCRIBE
This call takes two parameters: 1) counter_idx 2) physical address of 4k page
It will write the description of SBI PMU counter at specified physical
address. The details of the SBI PMU counter written at specified physical
address are as follows:
1. Name (64 bytes)
2. CSR_Offset (4 bytes)
(E.g. CSR_Offset == 0x2 imply CSR 0xC02)
(E.g. CSR_Offset == 0xffffffff means it is SBI implementation counter)
Maybe it would be clearer if we used counter_idx instead of CSR_Offset?
See my previous comment.

I agree CSR_offset is a little confusing. Let's have CSR_number instead
of CSR_offset. This way we can even use a RISC-V implementation-specific
CSR (i.e. a non-HPMCOUNTER CSR) as a counter. All counters with
CSR_number > 0xfff will be treated as SBI implementation counters.
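
The CSR_number rule just described amounts to a single range check, since RISC-V CSR addresses are 12 bits wide. A minimal sketch, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* CSR addresses occupy 12 bits (0x000..0xfff), so any larger CSR_number
 * cannot name a real CSR and marks an SBI implementation counter. */
static bool pmu_counter_is_sbi_impl(uint32_t csr_number)
{
    return csr_number > 0xfffu;
}
```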


3. CSR_Width (4 bytes)
(Number of CSR bits implemented in HW)
4. Event bitmap (2048 bytes) (i.e. 1-bit for each possible event_idx)
(If bit corresponding to a event_idx is 1 then event_idx is supported
by the counter)
Is there more detail about which bit corresponds to which event? For example,
does bit 0 correspond to event_idx 0x0, and bit 10 correspond to event_idx
0x1000?
It's a bitmap representing all possible event_idx values. If bit X is set then
it means event_idx = X can be monitored by this counter.
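
Testing bit X of the event bitmap can be sketched as follows. The bit order within each byte (least significant bit first) is an assumption, as the proposal does not state it. The bitmap length is taken as a parameter because one bit for every 15-bit event_idx value would need 4096 bytes, so indices beyond a 2048-byte bitmap are reported as unsupported here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if bit event_idx is set in the bitmap.
 * Assumes bit 0 of byte 0 corresponds to event_idx 0 (LSB-first). */
static bool pmu_event_supported(const uint8_t *bitmap, size_t bitmap_bytes,
                                uint32_t event_idx)
{
    size_t byte = event_idx / 8;

    if (byte >= bitmap_bytes)
        return false;  /* index falls outside the bitmap */
    return (bitmap[byte] >> (event_idx % 8)) & 1u;
}
```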


5. Anything else ??
3. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters: 1) counter_idx 2) physical address
It will set the physical address where SBI implementation will write
the software counter. This SBI call is only for counters not mapped
to any CSR (i.e. only for counters with CSR_Offset == 0xffffffff).
4. SBI_PMU_COUNTER_START
This call takes two parameters: 1) counter_idx 2) event_idx
It will inform SBI implementation to configure and start/enable specified
counter on the calling HART to monitor specific event. This SBI call will
fail for counters which are not present.
Just want to confirm: is SBI_PMU_COUNTER_START for both hardware
and software counters? If so, we should define the event_idx for software
counters as I mentioned above.
Yes, it's for both HW and SW counters.

For HW counters (i.e. HPMCOUNTER CSRs), the SBI_PMU_COUNTER_START call will:
1. Enable access to the CSR using the MCOUNTEREN CSR (or the HCOUNTEREN CSR for a hypervisor)
2. Clear the inhibit bit using the MCOUNTINHIBIT CSR
3. Do any platform-specific event selection for the specified HW counter
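
Steps 1 and 2 are plain bit operations, since bit N of both MCOUNTEREN and MCOUNTINHIBIT corresponds to HPMCOUNTER N. The sketch below applies them to a simulated copy of the CSRs so it can run anywhere; real firmware would use CSR read/write instructions instead, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Simulated copy of the two M-mode CSRs involved. */
struct pmu_csr_state {
    uint32_t mcounteren;
    uint32_t mcountinhibit;
};

/* Start: allow lower-privilege access and let the counter increment. */
static void pmu_hw_counter_start(struct pmu_csr_state *s, unsigned hpm_index)
{
    uint32_t bit = UINT32_C(1) << hpm_index;

    s->mcounteren    |= bit;   /* step 1: enable access to the counter CSR */
    s->mcountinhibit &= ~bit;  /* step 2: stop inhibiting the counter */
    /* step 3 (event selection) is platform specific and omitted here */
}

/* Stop: the reverse of start. */
static void pmu_hw_counter_stop(struct pmu_csr_state *s, unsigned hpm_index)
{
    uint32_t bit = UINT32_C(1) << hpm_index;

    s->mcounteren    &= ~bit;
    s->mcountinhibit |= bit;
}
```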


5. SBI_PMU_COUNTER_STOP
This call takes one parameter: 1) counter_idx
It will inform SBI implementation to stop/disable specified counters on the
calling HART. This SBI call will fail for counters which are not present.

From above, the RISC-V PMU driver will use most of the SBI calls at boot time.
Only SBI_PMU_COUNTER_START to be used once before using the counter. The reading
the counter is by reading CSR (for CSR_Offset != 0xffffffff) OR by reading
memory location (for CSR_Offset == 0xffffffff). The counter overflow handling
will have to be done in software by Linux kernel.

The information returned by SBI_PMU_NUM_COUNTERS and
SBI_PMU_COUNTER_DESCRIBE can be passed via DT/ACPI but it will be
difficult to maintain because we have hardware counters and SBI
implementation counters both provided by SBI PMU extension. The SBI
implementation counters are specific to underlying SBI implementation
so we will have to keep counters/events described in DT/ACPI in-sync with
underlying SBI implementation.

I have a proposal for the DT format of the PMU. It seems to me that we need to
add some information for software counters, such as the number of software
counters and their events. Are there any ideas?
With the SBI_PMU_COUNTER_DESCRIBE call, we don't need to pass this
information in DT.

I think the DT format you had proposed seems to have the following limitations:
1. It maps each counter to a particular HW event. In reality, we will have few
HW counters and lots of HW events, and a HW counter can be configured to
monitor a particular event from a set of HW events. In other words, the relation
between HW counter and HW event is one-to-many and not one-to-one.
2. It does not deal with implementation-specific HW events.
Sorry for the unclear description in the DT binding file. The event mapping
property maps the kernel's generic HW events to particular HW events of the
platform; it doesn't map them to specific counters, so it doesn't have the
limitations above.

I thought that OpenSBI would fill in the content of SBI_PMU_COUNTER_DESCRIBE
by parsing the DT file. So is the information actually defined statically by
each firmware in OpenSBI?

Both of the above limitations have been taken care of in the
SBI_PMU_COUNTER_DESCRIBE call.

Regards,
Anup



Re: Proposal: SBI PMU Extension

Greg Favor
 

Anup,

Just to clarify a couple of things as to where I was coming from (which I think would mirror other people that also want to have more than just an event selector field in the hpmevent CSR's):

We want to avoid and plan on avoiding having any perf mon related custom (aka implementation-specific) CSR's.  The existing hpmeventX CSR's, especially in RV64, each provide a lot of space for putting whatever custom per-counter controls/etc. that one might want to have.  Hence no need to have separate custom CSR's.

Our thought would be for a 16-bit event_idx.code field to go into hpmeventX[15:0] and the 3-bit event_idx.type field to go into hpmeventX[18:16].  About as straightforward and simple as one could have.

Then hpmeventX[63:19] are free for whatever other custom stuff that an implementation might want to have.  In our case we prefer to keep everything within the low 32 CSR bits - which still leaves 13 bits for most all of what we want to have (and what others might want to have).

It sounds like you aren't trying to say anything in the API about where event_idx ends up within an hpmeventX CSR, but it would be good to keep the size of the event_idx value bounded - like you originally did (but to 19 bits instead of 15 bits).  And have event_info[44:0] (or just provide a full 64-bit event_info parameter).

Then event_idx and event_info can be concatenated - as {event_info[44:0], event_idx[18:0]} - to create the 64-bit value to write into an hpmeventX CSR.  (Or {event_info[12:0], event_idx[18:0]} - to create the 32-bit value for an RV32 hpmeventX CSR.)  Not that you would require that packing arrangement (although it might encourage a standard hpmeventX format for future implementations).  And this allows full flexibility for people to stick whatever implementation-specific stuff they want into the upper hpmeventX bits.
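
The packing described above can be sketched as follows, with a 19-bit event_idx built from a 3-bit type in bits [18:16] and a 16-bit code in bits [15:0]. The function names are assumptions, and as the message says, nothing here is a required arrangement.

```c
#include <stdint.h>

/* 19-bit event_idx: type in bits [18:16], code in bits [15:0]. */
static uint32_t event_idx19(uint32_t type, uint32_t code)
{
    return ((type & 0x7u) << 16) | (code & 0xFFFFu);
}

/* RV64: {event_info[44:0], event_idx[18:0]} as a 64-bit hpmeventX value. */
static uint64_t hpmevent_rv64(uint64_t event_info, uint32_t event_idx)
{
    uint64_t info_mask = (UINT64_C(1) << 45) - 1;

    return ((event_info & info_mask) << 19) | (event_idx & 0x7FFFFu);
}

/* RV32: {event_info[12:0], event_idx[18:0]} as a 32-bit hpmeventX value. */
static uint32_t hpmevent_rv32(uint32_t event_info, uint32_t event_idx)
{
    return ((event_info & 0x1FFFu) << 19) | (event_idx & 0x7FFFFu);
}
```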

Keeping event_idx to 16 bits would also contain the max size of the list/array that  SBI_PMU_COUNTER_DESCRIBE would use.  Or you can go back to bitmaps for each event_idx.type.  type==2 would require up to 4096 bytes.  type==0 and type==1 would only require small bitmaps.

Greg




Re: Proposal: SBI PMU Extension

Greg Favor
 

On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:
It is also possible that a RISC-V implementation has few HPMCOUNTER
CSRs but lot of HW events to be monitored. In this case, the RISC-V
implementation will have implementation specific CSR to select a
particular HW event to be monitored in HPMCOUNTER.

But that is the point of the standard HPMEVENT CSRs - each HPMCOUNTER CSR has an associated HPMEVENT CSR.  With the expectation that the HPMEVENT CSR, in some manner, selects the event to be counted by the associated HPMCOUNTER CSR.

The existence of these HPMEVENT CSRs in the architecture underlies the comments in my preceding emails (starting with the event_idx field going into the HPMEVENT CSR for a counter).

Lastly, maybe some implementations will also have additional implementation-specific CSRs, but the HPMEVENT CSRs provide a lot of space for custom stuff and I would imagine it would be rare for there to be additional custom per-HPMCOUNTER CSRs.  Hence the following hooks would probably never be used.

The SBI implementation (i.e. OpenSBI) will provide optional platform hooks which will be
called for SBI_PMU_COUNTER_START and SBI_PMU_COUNTER_STOP calls.  

Greg

 


Re: Proposal: SBI PMU Extension

Greg Favor
 

Why have CSR_number be a 32-bit value?  Even a 16-bit value would be way, way more than enough.  One could argue that even a byte - providing for up to 256 counters - is more than enough.

With a 16-bit CSR_number, you could use the lower half of this number space for hardware counters (up to 32K counters), and use the upper half for software counters (up to 32K counters).  Ditto for an 8-bit CSR_number.
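
Splitting the number space in half means the top bit of CSR_number selects hardware or software. A small sketch of that idea for any width from 1 to 32 bits; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counters available in each half of a width_bits-wide number space
 * (width_bits must be between 1 and 32). */
static uint32_t counters_per_half(unsigned width_bits)
{
    return UINT32_C(1) << (width_bits - 1);
}

/* Upper half of the number space is treated as software counters. */
static bool is_software_counter(uint32_t csr_number, unsigned width_bits)
{
    return csr_number >= counters_per_half(width_bits);
}
```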

Greg

On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:
I agree CSR_offset is little confusing. Let's have CSR_number instead
of CSR_offset. This way we can even use RISC-V implementation specific
CSR (i.e. non-HPMCOUNTER CSR) as counter. All counters with
CSR_number > 0xfff will be treated as SBI implementation counter.


Re: Proposal: SBI PMU Extension

alankao
 

Hi Greg,
We want to avoid and plan on avoiding having any perf mon related custom (aka implementation-specific) CSR's.  The existing hpmeventX CSR's, especially in RV64, each provide a lot of space for putting whatever custom per-counter controls/etc. that one might want to have.  Hence no need to have separate custom CSR's.

I disagree.  Are you proposing a way for software to overload some CSRs that were specifically named HPMEVENT?  No matter what ...

Please check my comment on the Linux mailing list: https://lkml.org/lkml/2020/6/30/1327  We have implemented the proposal, and our HW/SW solution has been in use for two years.  In the proposal, we added a few CSRs to support `perf stat` and `perf record` while ensuring that porting perf_event for RISC-V can be done without too many tricks on a single-Linux machine.

I strongly disagree with this statement.  It limits the development of RISC-V hardware and software implementation at this stage.


Re: Proposal: SBI PMU Extension

Greg Favor
 

In the following step #3, I think in most implementations the event selection would be a matter of writing the event_idx information into the hpmeventX CSR associated with hpmcounterX.  The platform-specific aspect would be reformatting the event_idx information into the appropriate format for writing into the HPMEVENT CSR (along with any event_info information).

Greg


On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:
For HW counters (i.e. HPMCOUNTER CSRs), the SBI_PMU_COUNTER_START call will:
1. Enable access to the CSR using MCOUNTEREN CSR (or HCOUNTEREN CSR for hypervisor)
2. Disable inhibit using MCOUNTINHIBIT CSR
3. Do any platform specific event selection for the specified HW counter
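
The three steps above can be sketched in C as follows. This is only an illustration, not OpenSBI code: the struct hart_pmu_csrs and the function names are hypothetical, and plain variables stand in for the real CSRs so the sketch can run on any host. The bit positions follow the privileged spec (bit 0 = cycle, bit 2 = instret, bits 3..31 = hpmcounter3..31; bit 1 is TIME, which has no inhibit bit). Programming the event selector before clearing the inhibit bit is a choice of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for one hart's PMU CSRs so the sketch runs anywhere;
 * real M-mode firmware would use CSR read/write instructions instead. */
struct hart_pmu_csrs {
    uint32_t mcounteren;     /* bit n set   => lower modes may read counter n */
    uint32_t mcountinhibit;  /* bit n clear => counter n increments */
    uint64_t mhpmevent[32];  /* only indices 3..31 exist in hardware */
};

/* Returns 0 on success, -1 if n is not a startable HW counter. */
static int pmu_hw_counter_start(struct hart_pmu_csrs *csrs, unsigned int n,
                                uint64_t event_sel)
{
    /* Bit 1 is TIME (not inhibitable); indices above 31 do not exist. */
    if (n == 1u || n > 31u)
        return -1;
    if (n >= 3u)
        csrs->mhpmevent[n] = event_sel;          /* step 3: event selection */
    csrs->mcounteren |= (UINT32_C(1) << n);      /* step 1: enable access */
    csrs->mcountinhibit &= ~(UINT32_C(1) << n);  /* step 2: clear inhibit */
    return 0;
}

/* SBI_PMU_COUNTER_STOP does the reverse for both CSRs. */
static int pmu_hw_counter_stop(struct hart_pmu_csrs *csrs, unsigned int n)
{
    if (n == 1u || n > 31u)
        return -1;
    csrs->mcountinhibit |= (UINT32_C(1) << n);
    csrs->mcounteren &= ~(UINT32_C(1) << n);
    return 0;
}
```

For a hypervisor, the same shape would apply with HCOUNTEREN in place of MCOUNTEREN.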


Re: Proposal: SBI PMU Extension

Zong Li
 

On Thu, Jul 2, 2020 at 4:00 PM Greg Favor <gfavor@...> wrote:

In the following step #3, I think in most implementations the event selection would be a matter of writing the event_idx information into the hpmeventX CSR associated with hpmcounterX. The platform-specific aspect would be reformatting the event_idx information into the appropriate format for writing into the HPMEVENT CSR (along with any event_info information).
It seems to me that we should figure out the usage of event_idx: who
is responsible for translating the event_idx to the real HW event of each
platform, and how it is translated.

1. Monitor generic hardware event (including generic hardware cache event)
If we want to count PERF_COUNT_HW_BRANCH_MISSES, we have to pass
event_idx == 0x0005
from S-mode, then each OpenSBI platform has to translate the
event_idx.code to its own encoding
of HW events (for example, 0x1234), and set mhpmeventN to 0x1234.

The problem is how each platform knows the mapping between generic HW
events and its own HW events.
One way is to get the mapping from DT (see the proposal
of DT for PMU); another is to define
the mapping statically in the implementation of each platform.

2. Monitor raw event
If we want to count a raw event (not going through a generic HW event), we
have to pass event_idx == 0x2xxx
from S-mode, then each OpenSBI platform has to extract the
event_idx.code and write it to mhpmeventN.

The problem is that event_idx.code is not long enough when the
encoding of raw events is defined
with more than 15 bits on some platform. The spec doesn't limit the encoding of
mhpmeventN, so a platform can use
the full 64 bits for it.
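
To make the two cases concrete, here is a minimal C sketch of the statically defined mapping option. Everything in it is hypothetical: the table contents, the struct and function names, and the raw encodings (0x1234 is just the example value used above). It assumes the 15-bit event_idx layout from the proposal (type in bits [14:12], code in bits [11:0]).

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a per-platform table: generic event code -> raw encoding. */
struct pmu_event_map {
    uint32_t generic_code;  /* event_idx.code when event_idx.type == 0 */
    uint64_t raw_event;     /* value the platform writes to mhpmeventN */
};

/* Made-up table for an imaginary platform. */
static const struct pmu_event_map example_platform_map[] = {
    { 4, 0x1111 },  /* PERF_COUNT_HW_BRANCH_INSTRUCTIONS */
    { 5, 0x1234 },  /* PERF_COUNT_HW_BRANCH_MISSES */
};

/* Translate event_idx into an mhpmeventN value.
 * Returns 0 on success, -1 if the platform does not support the event. */
static int pmu_event_to_mhpmevent(uint32_t event_idx, uint64_t *out)
{
    uint32_t type = (event_idx >> 12) & 0x7u;
    uint32_t code = event_idx & 0xfffu;
    size_t n = sizeof(example_platform_map) / sizeof(example_platform_map[0]);

    if (type == 2u) {        /* RAW: the code is passed through unchanged */
        *out = code;
        return 0;
    }
    if (type == 0u) {        /* generic HARDWARE event: table lookup */
        for (size_t i = 0; i < n; i++) {
            if (example_platform_map[i].generic_code == code) {
                *out = example_platform_map[i].raw_event;
                return 0;
            }
        }
    }
    return -1;               /* CACHE events etc. are not handled here */
}
```

A DT-based variant would fill the same table at boot instead of compiling it in.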

Please correct me if I have any misunderstandings.

Greg

On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:

For HW counters (i.e. HPMCOUNTER CSRs), the SBI_PMU_COUNTER_START call will:
1. Enable access to the CSR using MCOUNTEREN CSR (or HCOUNTEREN CSR for hypervisor)
2. Disable inhibit using MCOUNTINHIBIT CSR
3. Do any platform specific event selection for the specified HW counter


Re: Unix platform working group future agenda (to be discussed in next meeting (06/09 8AM PST))

Josh Scheid
 


On Wed, Jun 3, 2020 at 4:46 PM Tommy Thorn <tommy.thorn@...> wrote:
The Unix spec has so far (AFAIUI) not required anything about DRAM starting address as long as it's stated in the DT.
Esperanto Tech's Maxion cannot have DRAM below 4 GB as that memory space is reserved.
We don't seem to have run into assumptions in the Linux kernel nor openSBI.

It was a big deal in the last decade.  There was concern for support for boot devices like non-64-bit addressing capable USB, 32-bit BARs in PCI and possibly other such older protocols.  Without requiring IOMMUs to be able to redirect those devices, or platform-specific PA-PCI address fixed offset functionality, things get difficult.  A solution is to enable allocating both some system memory and some IO (e.g., PCI space) in the low 4 GiB. 

This isn't a problem for systems that control all attached devices, but for systems that want to have robust support for plug-in stuff (USB, PCI, SATA?), it's a consideration.

The platform spec can not address this, but the group should do so consciously.

-Josh


Re: Proposal: SBI PMU Extension

Anup Patel
 

-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 02 July 2020 11:41
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman
<andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

On Thu, Jul 2, 2020 at 12:38 PM Anup Patel <anup.patel@...> wrote:



-----Original Message-----
From: tech-unixplatformspec@... <tech-
unixplatformspec@...> On Behalf Of Zong Li
Sent: 01 July 2020 20:32
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman
<andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU
Extension

On Wed, Jul 1, 2020 at 8:26 PM Anup Patel <anup.patel@...>
wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have HW
performance counters such as CYCLE CSR, INSTRET CSR, and
HPMCOUNTER CSRs. A RISC-V CPU can allow monitoring HW events
using few HPMCOUNTER CSRs. The M-mode software can also inhibit
unused performance counters to save energy.

Do we need the SBI calls to set the mcounteren and mcountinhibit
(optional CSR)?
Or should these two CSRs not be changed at runtime from S-mode?
The SBI_PMU_COUNTER_START call will set/clear appropriate bits in
MCOUNTEREN and MCOUNTINHIBIT CSRs. The SBI_PMU_COUNTER_STOP
will do
the reverse for MCOUNTEREN and MCOUNTINHIBIT CSRs.

It is also possible that a RISC-V implementation has few HPMCOUNTER
CSRs but lot of HW events to be monitored. In this case, the RISC-V
implementation will have implementation specific CSR to select a
particular HW event to be monitored in HPMCOUNTER. The SBI
implementation (i.e. OpenSBI) will provide optional platform hooks
which will be called for SBI_PMU_COUNTER_START and
SBI_PMU_COUNTER_STOP calls.



In addition to HW performance counters, a SBI implementation (e.g.
OpenSBI, Xvisor, KVM, etc) can provide software counters for
interesting events such as number of RFENCEs, number of IPIs,
number of misaligned load/store instructions, number of illegal
instructions, etc.

I'm not sure whether I misunderstood the usage of software counter,
I don't see the event_idxs of these events for software counters,
maybe we could define the event_idxs for them in this proposal, for
example, if event_idx.type == 3, then it is SOFTWARE event, and
event_idx.code is used to specify the monitoring events such as
number of RFENCEs, number of IPIs and so on.
My bad, I forgot to define event_idx for SBI implementation events. I will
update this in the next version. Thanks for catching it.



We propose SBI PMU extension which tries to cover CYCLE CSR,
INSTRET CSR, HPMCOUNTER CSRs and software counters of the SBI
implementation.

To define SBI PMU extension, we first define counter_idx which is
a unique number assigned to a counter and event_idx which is an
encoded number representing event to be monitored.

The SBI PMU event_idx is 15bit number encoded as follows:
event_idx[14:12] = type
event_idx[11:0] = code

If event_idx.type == 0 then it is HARDWARE event and
event_idx.code can be one of the following:
enum sbi_pmu_hw_id {
/*
* Common hardware events, generalized by the kernel:
*/
PERF_COUNT_HW_CPU_CYCLES = 0,
PERF_COUNT_HW_INSTRUCTIONS = 1,
PERF_COUNT_HW_CACHE_REFERENCES = 2,
PERF_COUNT_HW_CACHE_MISSES = 3,
PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4,
PERF_COUNT_HW_BRANCH_MISSES = 5,
PERF_COUNT_HW_BUS_CYCLES = 6,
PERF_COUNT_HW_STALLED_CYCLES_FRONTEND = 7,
PERF_COUNT_HW_STALLED_CYCLES_BACKEND = 8,
PERF_COUNT_HW_REF_CPU_CYCLES = 9,

PERF_COUNT_HW_MAX, /* non-ABI */
};
(NOTE: Same as described in
<linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 1 then it is CACHE event and event_idx.code
is encoded as follows:
event_idx.code[11:4] = cache_id
event_idx.code[3:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
PERF_COUNT_HW_CACHE_L1D = 0,
PERF_COUNT_HW_CACHE_L1I = 1,
PERF_COUNT_HW_CACHE_LL = 2,
PERF_COUNT_HW_CACHE_DTLB = 3,
PERF_COUNT_HW_CACHE_ITLB = 4,
PERF_COUNT_HW_CACHE_BPU = 5,
PERF_COUNT_HW_CACHE_NODE = 6,

PERF_COUNT_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
PERF_COUNT_HW_CACHE_OP_READ = 0,
PERF_COUNT_HW_CACHE_OP_WRITE = 1,
PERF_COUNT_HW_CACHE_OP_PREFETCH = 2,

PERF_COUNT_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0,
PERF_COUNT_HW_CACHE_RESULT_MISS = 1,

PERF_COUNT_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as described in
<linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 2 then it is RAW event and event_idx.code is
just a RAW event number.

In future, more event_idx can be defined without breaking ABI
compatibility of SBI calls.

Based on above definition of counter_idx definition, we can
potentially have the following SBI calls:
I don't see the definition of counter_idx, but I remember that you
had talked about that in other places as follows:
1. counter_idx = 0 to 2 are for CYCLE, TIME, and INSTRET
2. counter_idx = 3 to 31 are for HPMCOUNTERs
3. counter_idx >= 32 are for software counters
Initially, I had tied counter_idx with CSR numbers but TIME CSR will
need to be handled as special case.

It's better to treat counter_idx as logical index of available counter
this also helps reduce SBI calls.



1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_DESCRIBE
This call takes two parameters: 1) counter_idx 2) physical
address of 4k page
It will write the description of SBI PMU counter at specified physical
address. The details of the SBI PMU counter written at specified
physical
address are as follows:
1. Name (64 bytes)
2. CSR_Offset (4 bytes)
(E.g. CSR_Offset == 0x2 implies CSR 0xC02)
(E.g. CSR_Offset == 0xffffffff means it is SBI
implementation
counter)
Maybe it would be more clear if we use counter_idx instead of
CSR_Offset?

See my previous comment.

I agree CSR_offset is a little confusing. Let's have CSR_number instead
of CSR_offset. This way we can even use RISC-V implementation specific
CSR (i.e. non-HPMCOUNTER CSR) as counter. All counters with
CSR_number > 0xfff will be treated as SBI implementation counter.

3. CSR_Width (4 bytes)
(Number of CSR bits implemented in HW)
4. Event bitmap (2048 bytes) (i.e. 1-bit for each possible event_idx)
(If bit corresponding to a event_idx is 1 then event_idx is supported
by the counter)
Is there more detail about the corresponding bit of events? for
example, the bit 0 corresponds to event_idx 0x0, and bit 10
corresponds to event_idx 0x1000.
It's a bitmap representing all possible event_idx values. If bit X is
set then it means event_idx = X can be monitored by this counter.


5. Anything else ??
3. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters: 1) counter_idx 2) physical address
It will set the physical address where SBI implementation will write
the software counter. This SBI call is only for counters not mapped
to any CSR (i.e. only for counters with CSR_Offset == 0xffffffff).
4. SBI_PMU_COUNTER_START
This call takes two parameters: 1) counter_idx 2) event_idx
It will inform SBI implementation to configure and start/enable
specified
counter on the calling HART to monitor specific event. This SBI call will
fail for counters which are not present.
Just want to make sure whether SBI_PMU_COUNTER_START is for
hardware
and software counters? if so, we should define the event_idx for
software counters as I mentioned above.
Yes, it's for both HW and SW counters.

For HW counters (i.e. HPMCOUNTER CSRs), the
SBI_PMU_COUNTER_START call will:
1. Enable access to the CSR using MCOUNTEREN CSR (or HCOUNTEREN
CSR for hypervisor)
2. Disable inhibit using MCOUNTINHIBIT CSR
3. Do any platform specific event selection for the specified HW
counter


5. SBI_PMU_COUNTER_STOP
This call takes one parameter: 1) counter_idx
It will inform SBI implementation to stop/disable specified
counters on
the
calling HART. This SBI call will fail for counters which are not present.

From above, the RISC-V PMU driver will use most of the SBI calls
at boot time.
Only SBI_PMU_COUNTER_START needs to be called once before using the
counter.
The counter is read by reading the CSR (for CSR_Offset !=
0xffffffff) or by reading a memory location (for CSR_Offset ==
0xffffffff). The counter overflow handling will have to be done in
software by the Linux kernel.

The information returned by SBI_PMU_NUM_COUNTERS and
SBI_PMU_COUNTER_DESCRIBE can be passed via DT/ACPI but it will be
difficult to maintain because we have hardware counters and SBI
implementation counters both provided by SBI PMU extension. The
SBI implementation counters are specific to underlying SBI
implementation so we will have to keep counters/events described
in DT/ACPI in-sync with
underlying SBI implementation.

I have a proposal for DT format of PMU, it seems to me that we need
to add some information for software counters, such as the number of
software counters and it's events, are there any ideas?
With the SBI_PMU_COUNTER_DESCRIBE call, we don't need to pass this
information in DT.

I think the DT format you had proposed seems to have following
limitations:
1. It maps each counter to a particular HW event. In reality, we will
have few HW counters and lots of HW events and a HW counter can be
configured to monitor a particular event from a set of HW events. In
other words, relation between HW counter and HW event is one-to-many
and not one-to-one.
2. It does not deal with implementation specific HW events.
Sorry for unclear description in DT binding file. The event mapping of
property maps generic HW events of kernel to particular HW events of
platform, doesn't map to specific counters, so it doesn't have the limitations
above.
In addition to mapping generic events to HW events, we also need to know
which set of generic events is supported by a given counter CSR. A RISC-V
implementation can have a different set of events supported on different
counters.

For example, CYCLE counter only counts CPU cycles whereas INSTRET counter
only counts instructions. Both CYCLE and INSTRET cannot capture any other
HW event.
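
As a rough illustration of per-counter event sets, the sketch below keeps a list of supported event_idx values for each counter and picks the first counter that can monitor a requested event. The tables and names are invented for the example; only the CYCLE and INSTRET restrictions come from the text above, and the events listed for the third counter are made up.

```c
#include <stddef.h>
#include <stdint.h>

struct pmu_counter_events {
    const uint32_t *events;  /* event_idx values this counter can monitor */
    size_t num_events;
};

static const uint32_t cycle_events[]   = { 0x0000 };  /* CPU cycles only */
static const uint32_t instret_events[] = { 0x0001 };  /* instructions only */
static const uint32_t hpm3_events[]    = { 0x0004, 0x0005 };  /* made-up set */

/* Logical counter_idx 0, 1, 2 of an imaginary platform. */
static const struct pmu_counter_events example_counters[] = {
    { cycle_events, 1 },
    { instret_events, 1 },
    { hpm3_events, 2 },
};

/* Returns the first counter index supporting event_idx, or -1 if none does. */
static int pmu_find_counter(const struct pmu_counter_events *counters,
                            size_t num_counters, uint32_t event_idx)
{
    for (size_t i = 0; i < num_counters; i++)
        for (size_t j = 0; j < counters[i].num_events; j++)
            if (counters[i].events[j] == event_idx)
                return (int)i;
    return -1;
}
```

A real driver would also have to skip counters that are already in use; that bookkeeping is left out here.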


I thought that OpenSBI would fill the content of
SBI_PMU_COUNTER_DESCRIBE by parsing the DT file. So actually, is the
information defined statically by each firmware in OpenSBI?
The OpenSBI platform code will have choice to either parse DT or
write C source for :
1. Mapping generic events to HW events
2. Mapping set of events to a counter

We will certainly prefer parsing DT for OpenSBI generic platform.

The Linux kernel SBI PMU driver (S-mode software) should always deal
with generic events so that there is no SOC specific stuff in the Linux
driver.

I will send v2 proposal soon and we can continue discussion over there.

Regards,
Anup


Re: Proposal: SBI PMU Extension

Anup Patel
 

Hi Greg,

 

I support your suggestion for re-using hpmeventX CSRs for both selecting and filtering events but I am worried about RV32 because hpmeventX CSRs will be 32bit on RV32. Due to this reason, I have kept event_idx and event_info as separate parameters in SBI_PMU_COUNTER_START call. It will be RISC-V implementation choice whether to pack both event_idx and event_info into one hpmeventX CSR or have additional implementation specific CSRs.

 

Your suggestion of 16bit event_idx.code looks good to me. I think event_idx can be 20 bits for future expansion of new event types. The event_idx[19:16] can be type and event_idx[15:0] can be code. We will have event_idx[63:20] (44bits) on RV64 and event_idx[31:20] (12bits) on RV32 for systems that want to pack both event_idx and event_info in hpmeventX CSR. Are you okay with this??

 

The event_info parameter will be opaque for SBI_PMU_COUNTER_START call but RV32 implementations will have to be careful because we won’t have much space to fit event_info in hpmeventX CSR on RV32.

 

Regarding bitmap vs list/array for SBI_PMU_COUNTER_DESCRIBE, I am leaning more towards list/array because bitmaps will tend to take more memory and will also force us to keep event_idx small.

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 02 July 2020 12:59
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

Anup,

 

Just to clarify a couple of things as to where I was coming from (which I think would mirror other people that also want to have more than just an event selector field in the hpmevent CSR's):

 

We want to avoid and plan on avoiding having any perf mon related custom (aka implementation-specific) CSR's.  The existing hpmeventX CSR's, especially in RV64, each provide a lot of space for putting whatever custom per-counter controls/etc. that one might want to have.  Hence no need to have separate custom CSR's.

 

Our thought would be for a 16-bit event_idx.code field to go into hpmeventX[15:0] and the 3-bit event_idx.type field to go into hpmeventX[18:16].  About as straightforward and simple as one could have.

 

Then hpmeventX[63:19] are free for whatever other custom stuff that an implementation might want to have.  In our case we prefer to keep everything within the low 32 CSR bits - which still leaves 13 bits for most all of what we want to have (and what others might want to have).

 

It sounds like you aren't trying to say anything in the API about where event_idx ends up within an hpmeventX CSR, but it would be good to keep the size of the event_idx value bounded - like you originally did (but to 19 bits instead of 15 bits).  And have event_info[44:0] (or just provide a full 64-bit event_info parameter).

 

Then event_idx and event_info can be concatenated - as {event_info[44:0], event_idx[18:0]} - to create the 64-bit value to write into an hpmeventX CSR.  (Or {event_info[12:0], event_idx[18:0]} - to create the 32-bit value for an RV32 hpmeventX CSR.)  Not that you would require that packing arrangement (although it might encourage a standard hpmeventX format for future implementations).  And this allows full flexibility for people to stick whatever implementation-specific stuff they want into the upper hpmeventX bits.
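
The concatenation described here could look like the following C sketch. It assumes a 19-bit event_idx (3-bit type in [18:16], 16-bit code in [15:0]) and simply truncates event_info to the bits that fit; the function names are hypothetical and the packing is only the arrangement suggested in this mail, not a requirement of the proposal.

```c
#include <stdint.h>

#define EVENT_IDX_BITS  19u
#define EVENT_IDX_MASK  ((UINT32_C(1) << EVENT_IDX_BITS) - 1u)  /* 0x7ffff */

/* RV64: {event_info[44:0], event_idx[18:0]} -> 64-bit hpmeventX value. */
static inline uint64_t hpmevent_pack_rv64(uint64_t event_info,
                                          uint32_t event_idx)
{
    uint64_t info = event_info & ((UINT64_C(1) << 45) - 1u);
    return (info << EVENT_IDX_BITS) | (event_idx & EVENT_IDX_MASK);
}

/* RV32: {event_info[12:0], event_idx[18:0]} -> 32-bit hpmeventX value. */
static inline uint32_t hpmevent_pack_rv32(uint32_t event_info,
                                          uint32_t event_idx)
{
    uint32_t info = event_info & 0x1fffu;
    return (info << EVENT_IDX_BITS) | (event_idx & EVENT_IDX_MASK);
}
```

For instance, a RAW event (type 2) with code 5 is event_idx 0x20005, and with event_info 1 the packed RV64 value is 0xA0005.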

 

Keeping event_idx to 16 bits would also contain the max size of the list/array that  SBI_PMU_COUNTER_DESCRIBE would use.  Or you can go back to bitmaps for each event_idx.type.  type==2 would require up to 4096 bytes.  type==0 and type==1 would only require small bitmaps.

 

Greg

 

On Wed, Jul 1, 2020 at 9:54 PM Anup Patel <Anup.Patel@...> wrote:

Hi Greg,

 

Thanks for the feedbacks and requests.

 

I had kept event_idx to be just 15bits so that all possible events can be represented by a bitmap of 2048 bytes. I certainly see the advantage of allowing bigger values for event_idx. I will update SBI_PMU_COUNTER_DESCRIBE call to use a list/array instead of bitmap which will allow us to have 64bit event_idx.

 

Regarding the event_info suggestion, I will extend SBI_PMU_COUNTER_START call to have one more parameter for event_info. Previously, I did not consider RISC-V implementation having HW filtering of events but this is a good feature to have. Thanks for suggesting.

 

Also, over here we are only proposing SBI calls which M-mode software (OpenSBI) and Hypervisors (KVM/Xvisor) can implement. Your RISC-V implementation can have implementation specific CSRs (hpmevent or hpmfilter, etc) for configuring HPMCOUNTER.

 

Regards,

Anup

 

From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Greg Favor
Sent: 02 July 2020 00:04
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

Anup,

 

This is great to see - as part of standardizing how RISC-V HPM counters are configured and controlled by software.

 

I have a modest but important request:  Increase the size of the event_idx 'code' field from event_idx[11:0] to event_idx[15:0].  This is for two reasons:

 

- As with the size of the event_idx 'type' field, this allows a good amount of space for future growth, especially as more architecture extensions come along and motivate having additional events (starting with the vector, hypervisor, and bitmanip extensions).

 

- This allows space and flexibility for things like having "structured" events - meaning events with event-specific filter bits.  This would be applicable when event_idx.type==2 (aka RAW type events).  In our implementation, for example, part of the 'code' field would specify a particular type of event and another part of the 'code' field would specify filter bits to provide the flexibility in only counting selected sub-categories of that type of event.

 

Secondly, this proposal seems to only provide event_idx as information to be written into an hpmevent CSR (in the case of hardware counters)?  It would be desirable to have another parameter (e.g. event_info) that can be passed through this API to the eventual hpmevent CSR write.  One could imagine event_idx and event_info being concatenated to create what is written into a 32b or 64-bit hpmeventX CSR.

 

For example, in RV64, this could result in writing the 64-bit value {event_info[43:0], event_idx[19:0]} into hpmeventX  (assuming the above increase in event_idx size).  This provides a standard way for software to configure an entire hpmevent CSR.

 

Greg

 

On Wed, Jul 1, 2020 at 5:26 AM Anup Patel <anup.patel@...> wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have HW performance
counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V
CPU can allow monitoring HW events using few HPMCOUNTER CSRs. The M-mode
software can also inhibit unused performance counters to save energy.

In addition to HW performance counters, a SBI implementation (e.g. OpenSBI,
Xvisor, KVM, etc) can provide software counters for interesting events
such as number of RFENCEs, number of IPIs, number of misaligned load/store
instructions, number of illegal instructions, etc.

We propose SBI PMU extension which tries to cover CYCLE CSR, INSTRET CSR,
HPMCOUNTER CSRs and software counters of the SBI implementation.

To define SBI PMU extension, we first define counter_idx which is a unique
number assigned to a counter and event_idx which is an encoded number
representing event to be monitored.

The SBI PMU event_idx is 15bit number encoded as follows:
event_idx[14:12] = type
event_idx[11:0] = code
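
As a small illustration of this 15-bit encoding, the following C helpers build and split an event_idx. The macro, enum, and function names are hypothetical; only the field positions and the type values 0, 1, and 2 come from the proposal.

```c
#include <stdint.h>

/* Field layout: event_idx[14:12] = type, event_idx[11:0] = code. */
#define SBI_PMU_EVENT_CODE_BITS  12u
#define SBI_PMU_EVENT_CODE_MASK  ((UINT32_C(1) << SBI_PMU_EVENT_CODE_BITS) - 1u)
#define SBI_PMU_EVENT_TYPE_MASK  0x7u

/* Type values named in the proposal; the enum name itself is made up. */
enum sbi_pmu_event_type {
    SBI_PMU_EVENT_TYPE_HW    = 0,
    SBI_PMU_EVENT_TYPE_CACHE = 1,
    SBI_PMU_EVENT_TYPE_RAW   = 2,
};

static inline uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    return ((type & SBI_PMU_EVENT_TYPE_MASK) << SBI_PMU_EVENT_CODE_BITS) |
           (code & SBI_PMU_EVENT_CODE_MASK);
}

static inline uint32_t sbi_pmu_event_type(uint32_t event_idx)
{
    return (event_idx >> SBI_PMU_EVENT_CODE_BITS) & SBI_PMU_EVENT_TYPE_MASK;
}

static inline uint32_t sbi_pmu_event_code(uint32_t event_idx)
{
    return event_idx & SBI_PMU_EVENT_CODE_MASK;
}
```

With this layout, a HARDWARE event with code 5 encodes as 0x0005 and a RAW event with code 0x123 encodes as 0x2123.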

If event_idx.type == 0 then it is HARDWARE event and event_idx.code can
be one of the following:
enum sbi_pmu_hw_id {
    /*
     * Common hardware events, generalized by the kernel:
     */
    PERF_COUNT_HW_CPU_CYCLES        = 0,
    PERF_COUNT_HW_INSTRUCTIONS        = 1,
    PERF_COUNT_HW_CACHE_REFERENCES        = 2,
    PERF_COUNT_HW_CACHE_MISSES        = 3,
    PERF_COUNT_HW_BRANCH_INSTRUCTIONS    = 4,
    PERF_COUNT_HW_BRANCH_MISSES        = 5,
    PERF_COUNT_HW_BUS_CYCLES        = 6,
    PERF_COUNT_HW_STALLED_CYCLES_FRONTEND    = 7,
    PERF_COUNT_HW_STALLED_CYCLES_BACKEND    = 8,
    PERF_COUNT_HW_REF_CPU_CYCLES        = 9,

    PERF_COUNT_HW_MAX,            /* non-ABI */
};
(NOTE: Same as described in <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 1 then it is CACHE event and event_idx.code is encoded
as follows:
event_idx.code[11:4] = cache_id
event_idx.code[3:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    PERF_COUNT_HW_CACHE_L1D            = 0,
    PERF_COUNT_HW_CACHE_L1I            = 1,
    PERF_COUNT_HW_CACHE_LL            = 2,
    PERF_COUNT_HW_CACHE_DTLB        = 3,
    PERF_COUNT_HW_CACHE_ITLB        = 4,
    PERF_COUNT_HW_CACHE_BPU            = 5,
    PERF_COUNT_HW_CACHE_NODE        = 6,

    PERF_COUNT_HW_CACHE_MAX,        /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    PERF_COUNT_HW_CACHE_OP_READ        = 0,
    PERF_COUNT_HW_CACHE_OP_WRITE        = 1,
    PERF_COUNT_HW_CACHE_OP_PREFETCH        = 2,

    PERF_COUNT_HW_CACHE_OP_MAX,        /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    PERF_COUNT_HW_CACHE_RESULT_ACCESS    = 0,
    PERF_COUNT_HW_CACHE_RESULT_MISS        = 1,

    PERF_COUNT_HW_CACHE_RESULT_MAX,        /* non-ABI */
};
(NOTE: Same as described in <linux_source>/include/uapi/linux/perf_event.h)
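
For illustration only, the CACHE code field could be assembled as in the C sketch below. The function name is hypothetical; the field positions are the ones listed above (cache_id in [11:4], op_id in [3:1], result_id in [0]).

```c
#include <stdint.h>

/* Build event_idx.code for a CACHE event (event_idx.type == 1). */
static inline uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    return ((cache_id & 0xffu) << 4) |   /* code[11:4] = cache_id  */
           ((op_id & 0x7u) << 1) |       /* code[3:1]  = op_id     */
           (result_id & 0x1u);           /* code[0]    = result_id */
}
```

For example, a DTLB (3) read (0) miss (1) gives code 0x31, so the full 15-bit event_idx with type 1 would be 0x1031.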

If event_idx.type == 2 then it is RAW event and event_idx.code is just
a RAW event number.

In future, more event_idx can be defined without breaking ABI compatibility
of SBI calls.

Based on above definition of counter_idx definition, we can potentially have
the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_DESCRIBE
   This call takes two parameters: 1) counter_idx 2) physical address of 4k page
   It will write the description of SBI PMU counter at specified physical
   address. The details of the SBI PMU counter written at specified physical
   address are as follows:
   1. Name (64 bytes)
   2. CSR_Offset (4 bytes)
      (E.g. CSR_Offset == 0x2 implies CSR 0xC02)
      (E.g. CSR_Offset == 0xffffffff means it is SBI implementation counter)
   3. CSR_Width (4 bytes)
      (Number of CSR bits implemented in HW)
   4. Event bitmap (2048 bytes) (i.e. 1-bit for each possible event_idx)
      (If bit corresponding to a event_idx is 1 then event_idx is supported
       by the counter)
   5. Anything else ??
3. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters: 1) counter_idx 2) physical address
   It will set the physical address where SBI implementation will write
   the software counter. This SBI call is only for counters not mapped
   to any CSR (i.e. only for counters with CSR_Offset == 0xffffffff).
4. SBI_PMU_COUNTER_START
   This call takes two parameters: 1) counter_idx 2) event_idx
   It will inform SBI implementation to configure and start/enable specified
   counter on the calling HART to monitor specific event. This SBI call will
   fail for counters which are not present.
5. SBI_PMU_COUNTER_STOP
   This call takes one parameter: 1) counter_idx
   It will inform SBI implementation to stop/disable specified counters on the
   calling HART. This SBI call will fail for counters which are not present.
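
One possible C view of the description that SBI_PMU_COUNTER_DESCRIBE writes into the 4k page is sketched below. The struct and helper names are hypothetical, and the bit order inside the bitmap (bit X stored in byte X/8, least significant bit first) is an assumption, since the proposal does not specify it. Note that 2048 bytes hold 16384 bits, so the lookup bounds-checks event_idx instead of assuming every value fits.

```c
#include <stdbool.h>
#include <stdint.h>

#define SBI_PMU_NAME_LEN       64
#define SBI_PMU_BITMAP_BYTES   2048
#define SBI_PMU_CSR_OFFSET_SW  0xffffffffu  /* SBI implementation counter */

struct sbi_pmu_counter_desc {
    char     name[SBI_PMU_NAME_LEN];              /* 1. Name (64 bytes)   */
    uint32_t csr_offset;                          /* 2. CSR_Offset        */
    uint32_t csr_width;                           /* 3. CSR_Width         */
    uint8_t  event_bitmap[SBI_PMU_BITMAP_BYTES];  /* 4. Event bitmap      */
};

static inline bool sbi_pmu_desc_is_sw(const struct sbi_pmu_counter_desc *d)
{
    return d->csr_offset == SBI_PMU_CSR_OFFSET_SW;
}

/* CSR_Offset is relative to 0xC00, e.g. offset 0x2 means CSR 0xC02. */
static inline uint32_t sbi_pmu_desc_csr(const struct sbi_pmu_counter_desc *d)
{
    return 0xC00u + d->csr_offset;
}

/* True if the bit for event_idx is set (assumed LSB-first within a byte). */
static inline bool sbi_pmu_desc_supports(const struct sbi_pmu_counter_desc *d,
                                         uint32_t event_idx)
{
    if (event_idx / 8u >= SBI_PMU_BITMAP_BYTES)
        return false;
    return (d->event_bitmap[event_idx / 8u] >> (event_idx % 8u)) & 1u;
}
```

The four fields add up to 2120 bytes, which fits comfortably in the 4k page.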

From above, the RISC-V PMU driver will use most of the SBI calls at boot time.
Only SBI_PMU_COUNTER_START needs to be called once before using the counter. The
counter is read by reading the CSR (for CSR_Offset != 0xffffffff) or by reading a
memory location (for CSR_Offset == 0xffffffff). The counter overflow handling
will have to be done in software by the Linux kernel.
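
As one way to picture the software overflow handling, the sketch below computes the increment between two reads of a counter that implements only CSR_Width bits. The function names are hypothetical, and the result is only correct if the counter wrapped at most once between the two reads, which the driver would have to ensure by reading often enough.

```c
#include <stdint.h>

/* Mask covering the implemented counter bits (CSR_Width from DESCRIBE). */
static inline uint64_t sbi_pmu_counter_mask(uint32_t csr_width)
{
    return (csr_width >= 64u) ? UINT64_MAX
                              : ((UINT64_C(1) << csr_width) - 1u);
}

/* Increment between two raw reads, tolerating a single wrap-around. */
static inline uint64_t sbi_pmu_counter_delta(uint64_t prev, uint64_t now,
                                             uint32_t csr_width)
{
    /* Unsigned subtraction wraps modulo 2^64; masking reduces the result
     * to the counter's own width. */
    return (now - prev) & sbi_pmu_counter_mask(csr_width);
}
```

For a 32-bit counter that went from 0xFFFFFFF0 to 0x10, this yields a delta of 0x20.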

The information returned by SBI_PMU_NUM_COUNTERS and SBI_PMU_COUNTER_DESCRIBE
can be passed via DT/ACPI but it will be difficult to maintain because we
have hardware counters and SBI implementation counters both provided by SBI
PMU extension. The SBI implementation counters are specific to underlying
SBI implementation so we will have to keep counters/events described in
DT/ACPI in-sync with underlying SBI implementation.

Regards,
Anup


Re: Proposal: SBI PMU Extension

Anup Patel
 

I agree HPMEVENT CSRs are sufficient for most RISC-V systems, assuming RV64 will be more common than RV32.

 

The M-mode runtime firmware (OpenSBI) will have to deal with implementation specific CSRs in platform specific way.

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 02 July 2020 13:14
To: Anup Patel <Anup.Patel@...>
Cc: Zong Li <zong.li@...>; tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:

It is also possible that a RISC-V implementation has few HPMCOUNTER
CSRs but lot of HW events to be monitored. In this case, the RISC-V
implementation will have implementation specific CSR to select a
particular HW event to be monitored in HPMCOUNTER.

 

But that is the point of the standard HPMEVENT CSRs - each HPMCOUNTER CSR has an associated HPMEVENT CSR.  With the expectation that the HPMEVENT CSR, in some manner, selects the event to be counted by the associated HPMCOUNTER CSR.

 

The existence of these HPMEVENT CSRs in the architecture underlies the comments in my preceding emails (starting with the event_idx field going into the HPMEVENT CSR for a counter)

 

Lastly, maybe some implementations will also have additional implementation-specific CSRs, but the HPMEVENT CSRs provide a lot of space for custom stuff and I would imagine it would be rare for there to be additional custom per-HPMCOUNTER CSRs.  Hence the following hooks would probably never be used.

 

The SBI implementation (i.e. OpenSBI) will provide optional platform hooks which will be
called for SBI_PMU_COUNTER_START and SBI_PMU_COUNTER_STOP calls.  

 

Greg

 

 


Re: Proposal: SBI PMU Extension

Anup Patel
 

You are correct, 16bit CSR_number is sufficient.

 

I am thinking of using lower 12bit of CSR_number as absolute RISC-V CSR number. This way we can point to any (even implementation specific) CSR as counter CSR (although this will generally point to hpmcounterX CSRs). If CSR_number > 0xfff then we treat the counter as software counter provided by SBI implementation.
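
A minimal C sketch of this CSR_number convention, with hypothetical helper names: values up to 0xfff are absolute RISC-V CSR numbers, and anything above 0xfff marks a software counter provided by the SBI implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* CSR_number > 0xfff means a software counter of the SBI implementation. */
static inline bool pmu_counter_is_sw(uint16_t csr_number)
{
    return csr_number > 0xfffu;
}

/* For a hardware counter, the lower 12 bits are the absolute CSR number. */
static inline uint16_t pmu_counter_csr(uint16_t csr_number)
{
    return csr_number & 0xfffu;
}
```

For example, CSR_number 0xC03 would name the hpmcounter3 CSR, while 0x1000 would be the first software counter value.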

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 02 July 2020 13:24
To: Anup Patel <Anup.Patel@...>
Cc: Zong Li <zong.li@...>; tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

Why have CSR_number be a 32-bit value?  Even a 16-bit value would be way, way more than enough.  One could argue that even a byte - providing for up to 256 counters - is more than enough.

 

With a 16-bit CSR_number, you could use the lower half of this number space for hardware counters (up to 32K counters), and use the upper half for software counters (up to 32K counters).  Ditto for an 8-bit CSR_number.

 

Greg

 

On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:

I agree CSR_offset is a little confusing. Let's have CSR_number instead
of CSR_offset. This way we can even use RISC-V implementation specific
CSR (i.e. non-HPMCOUNTER CSR) as counter. All counters with
CSR_number > 0xfff will be treated as SBI implementation counter.


Re: Proposal: SBI PMU Extension

Anup Patel
 

Yes, I think this should be 4-steps for HW counters.

 

The last step#3 will be broken down into programming hpmeventX CSR and optional platform specific programming.

 

Regards,

Anup

 

From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Greg Favor
Sent: 02 July 2020 13:31
To: Anup Patel <Anup.Patel@...>
Cc: Zong Li <zong.li@...>; tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

In the following step #3, I think in most implementations the event selection would be a matter of writing the event_idx information into the hpmeventX CSR associated with hpmcounterX.  The platform-specific aspect would be reformatting the event_idx information into the appropriate format for writing into the HPMEVENT CSR (along with any event_info information).

 

Greg

 

On Wed, Jul 1, 2020 at 9:38 PM Anup Patel <anup.patel@...> wrote:

For HW counters (i.e. HPMCOUNTER CSRs), the SBI_PMU_COUNTER_START call will:
1. Enable access to the CSR using MCOUNTEREN CSR (or HCOUNTEREN CSR for hypervisor)
2. Disable inhibit using MCOUNTINHIBIT CSR
3. Do any platform specific event selection for the specified HW counter


Re: Proposal: SBI PMU Extension

Greg Favor
 

On Sat, Jul 4, 2020 at 1:17 AM Anup Patel <Anup.Patel@...> wrote:

Hi Greg,

 

I support your suggestion for re-using hpmeventX CSRs for both selecting and filtering events but I am worried about RV32 because hpmeventX CSRs will be 32bit on RV32. Due to this reason, I have kept event_idx and event_info as separate parameters in SBI_PMU_COUNTER_START call. It will be RISC-V implementation choice whether to pack both event_idx and event_info into one hpmeventX CSR or have additional implementation specific CSRs.

 

Your suggestion of 16bit event_idx.code looks good to me. I think event_idx can be 20 bits for future expansion of new event types. The event_idx[19:16] can be type and event_idx[15:0] can be code. We will have event_idx[63:20] (44bits) on RV64 and event_idx[31:20] (12bits) on RV32 for systems that want to pack both event_idx and event_info in hpmeventX CSR. Are you okay with this??


Essentially, yes.  This is a little different than what I had described, but this is equally fine (if not a little simpler).  One can eliminate the separate event_info parameter and just have one 64-bit parameter for both RV64 and RV32 (maybe this is what you intend?):

event_idx.info is event_idx[63:20]
event_idx.type is event_idx[19:16]
event_idx.code is event_idx[15:0]

The SBI code can simply write event_idx[63:0] or event_idx[31:0] into mhpmevent (for RV64 and RV32 respectively).  On an RV32 implementation, implementation-specific code could choose to also take all or part of event_idx[63:32] and write that into an implementation-specific CSR.  (Even on an RV64 implementation, implementation-specific code could choose to take part of event_idx and write it into an implementation-specific CSR.)

Or one could have two 32-bit parameters (event_idx and event_info).  Then SBI code in RV64 writes {event_info[31:0], event_idx[19:0]}, or in RV32 writes {event_info[11:0], event_idx[19:0]} to mhpmevent.  On RV32, implementation-specific code could choose to take the rest of event_info and write that into an implementation-specific CSR.
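
As a rough illustration of the single 64-bit parameter option described above, the field split and the RV32 low/high halves could be written as follows. The helper names are hypothetical; only the bit ranges come from the discussion.

```c
#include <stdint.h>

/* Field layout discussed above: info = [63:20], type = [19:16], code = [15:0]. */
static inline uint64_t ev_info(uint64_t event_idx)
{
    return event_idx >> 20;
}

static inline unsigned int ev_type(uint64_t event_idx)
{
    return (unsigned int)((event_idx >> 16) & 0xF);
}

static inline unsigned int ev_code(uint64_t event_idx)
{
    return (unsigned int)(event_idx & 0xFFFF);
}

/* RV32: the low half is what fits into a 32-bit mhpmevent CSR. */
static inline uint32_t ev_rv32_low(uint64_t event_idx)
{
    return (uint32_t)event_idx;
}

/* RV32: the high half, which implementation-specific code may choose to
 * place in an implementation-specific CSR. */
static inline uint32_t ev_rv32_high(uint64_t event_idx)
{
    return (uint32_t)(event_idx >> 32);
}
```

On RV64 the whole 64-bit value would simply be written to mhpmevent unchanged.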

 

The event_info parameter will be opaque for SBI_PMU_COUNTER_START call but RV32 implementations will have to be careful because we won’t have much space to fit event_info in hpmeventX CSR on RV32.


It seems like the preceding could apply here (and anywhere that an event_idx parameter is used to write an mhpmevent CSR).
 
Greg

 

Regarding bitmap vs list/array for SBI_PMU_COUNTER_DESCRIBE, I am leaning more towards list/array because bitmaps will tend to take more memory and will also force us to keep event_idx small.

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 02 July 2020 12:59
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 

Anup,

 

Just to clarify a couple of things as to where I was coming from (which I think would mirror other people that also want to have more than just an event selector field in the hpmevent CSR's):

 

We want to avoid and plan on avoiding having any perf mon related custom (aka implementation-specific) CSR's.  The existing hpmeventX CSR's, especially in RV64, each provide a lot of space for putting whatever custom per-counter controls/etc. that one might want to have.  Hence no need to have separate custom CSR's.

 

Our thought would be for a 16-bit event_idx.code field to go into hpmeventX[15:0] and the 3-bit event_idx.type field to go into hpmeventX[18:16].  About as straightforward and simple as one could have.

 

Then hpmeventX[63:19] are free for whatever other custom stuff that an implementation might want to have.  In our case we prefer to keep everything within the low 32 CSR bits - which still leaves 13 bits for most all of what we want to have (and what others might want to have).

 

It sounds like you aren't trying to say anything in the API about where event_idx ends up within an hpmeventX CSR, but it would be good to keep the size of the event_idx value bounded - like you originally did (but to 19 bits instead of 15 bits).  And have event_info[44:0] (or just provide a full 64-bit event_info parameter).

 

Then event_idx and event_info can be concatenated - as {event_info[44:0], event_idx[18:0]} - to create the 64-bit value to write into an hpmeventX CSR.  (Or {event_info[12:0], event_idx[18:0]} - to create the 32-bit value for an RV32 hpmeventX CSR.)  Not that you would require that packing arrangement (although it might encourage a standard hpmeventX format for future implementations).  And this allows full flexibility for people to stick whatever implementation-specific stuff they want into the upper hpmeventX bits.
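
The concatenation described above might look like this in C. The function names and the `EVENT_IDX_BITS` constant are made up for illustration; the widths (19-bit event_idx, 45-bit or 13-bit event_info) are the ones stated in the paragraph. Bits that do not fit are masked off here, which is an assumption; an implementation could equally reject such values.

```c
#include <stdint.h>

#define EVENT_IDX_BITS 19  /* 3-bit type + 16-bit code */

/* RV64: {event_info[44:0], event_idx[18:0]} */
uint64_t hpmevent_pack_rv64(uint64_t event_info, uint32_t event_idx)
{
    uint64_t idx = event_idx & ((UINT32_C(1) << EVENT_IDX_BITS) - 1);
    uint64_t info = event_info & ((UINT64_C(1) << 45) - 1);

    return (info << EVENT_IDX_BITS) | idx;
}

/* RV32: {event_info[12:0], event_idx[18:0]} */
uint32_t hpmevent_pack_rv32(uint32_t event_info, uint32_t event_idx)
{
    uint32_t idx = event_idx & ((UINT32_C(1) << EVENT_IDX_BITS) - 1);
    uint32_t info = event_info & ((UINT32_C(1) << 13) - 1);

    return (info << EVENT_IDX_BITS) | idx;
}
```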

 

Keeping event_idx to 16 bits would also contain the max size of the list/array that  SBI_PMU_COUNTER_DESCRIBE would use.  Or you can go back to bitmaps for each event_idx.type.  type==2 would require up to 4096 bytes.  type==0 and type==1 would only require small bitmaps.

 

Greg



Re: Proposal: SBI PMU Extension

Anup Patel
 

Hi Greg,

 

How about having event_idx as XLEN bits wide??

 

It can be encoded as follows (similar to what you suggested):

event_idx[XLEN-1:16] = info

event_idx[15:12] = type

event_idx[11:0] = code
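
A minimal sketch of this XLEN-wide encoding, assuming `unsigned long` is XLEN bits wide (true for the usual RISC-V ILP32 and LP64 ABIs). The helper names are hypothetical.

```c
/* event_idx[XLEN-1:16] = info, event_idx[15:12] = type, event_idx[11:0] = code */
static inline unsigned long sbi_pmu_make_event_idx(unsigned long info,
                                                   unsigned long type,
                                                   unsigned long code)
{
    return (info << 16) | ((type & 0xFUL) << 12) | (code & 0xFFFUL);
}

static inline unsigned long sbi_pmu_event_info(unsigned long event_idx)
{
    return event_idx >> 16;
}

static inline unsigned long sbi_pmu_event_type(unsigned long event_idx)
{
    return (event_idx >> 12) & 0xFUL;
}

static inline unsigned long sbi_pmu_event_code(unsigned long event_idx)
{
    return event_idx & 0xFFFUL;
}
```

Because info occupies all bits above bit 15, the same helpers serve both RV32 (16 info bits) and RV64 (48 info bits).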

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 05 July 2020 04:15
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman <andrew@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal: SBI PMU Extension

 


 


Re: Proposal: SBI PMU Extension

Greg Favor
 

That sounds good.

Greg


On Sun, Jul 5, 2020 at 2:21 AM Anup Patel <Anup.Patel@...> wrote:

Hi Greg,

 

How about having event_idx as XLEN bits wide??

 

It can be encoded as follows (similar to what you suggested):

event_idx[XLEN-1:16] = info

event_idx[15:12] = type

event_idx[11:0] = code

 

Regards,

Anup

 


 


Proposal v2: SBI PMU Extension

Anup Patel
 

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have HARDWARE
performance counters such as the CYCLE CSR, INSTRET CSR, and HPMCOUNTER
CSRs. A RISC-V implementation can support monitoring various HARDWARE
events using a limited number of HPMCOUNTER CSRs.

In addition to HARDWARE performance counters, an SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc.) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.

We propose an SBI PMU extension which tries to cover the CYCLE CSR, INSTRET CSR,
HPMCOUNTER CSRs and SOFTWARE counters provided by the SBI implementation.

To define the SBI PMU extension, we first define counter_idx, which is a
logical number assigned to a counter, and event_idx, which is an encoded
number representing the HARDWARE/SOFTWARE event to be monitored.

The SBI PMU event_idx is an XLEN-bit wide number encoded as follows:
event_idx[XLEN-1:16] = info
event_idx[15:12] = type
event_idx[11:0] = code

If event_idx.type == 0x0 then it is a HARDWARE event. For a HARDWARE event,
event_idx.info is optional and can be passed as zero, whereas
event_idx.code can be one of the following values:
enum sbi_pmu_hw_id {
    SBI_PMU_HW_CPU_CYCLES = 0,
    SBI_PMU_HW_INSTRUCTIONS = 1,
    SBI_PMU_HW_CACHE_REFERENCES = 2,
    SBI_PMU_HW_CACHE_MISSES = 3,
    SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
    SBI_PMU_HW_BRANCH_MISSES = 5,
    SBI_PMU_HW_BUS_CYCLES = 6,
    SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
    SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
    SBI_PMU_HW_REF_CPU_CYCLES = 9,
    SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is a HARDWARE CACHE event. For a HARDWARE
CACHE event, event_idx.info is optional and can be passed as zero,
whereas event_idx.code is encoded as follows:
event_idx.code[11:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    SBI_PMU_HW_CACHE_L1D = 0,
    SBI_PMU_HW_CACHE_L1I = 1,
    SBI_PMU_HW_CACHE_LL = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU = 5,
    SBI_PMU_HW_CACHE_NODE = 6,
    SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    SBI_PMU_HW_CACHE_OP_READ = 0,
    SBI_PMU_HW_CACHE_OP_WRITE = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
    SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS = 1,
    SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
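
To make the HARDWARE CACHE encoding concrete, here is a small sketch that builds event_idx.code from its three sub-fields and then the full event_idx with type 0x1 and info 0. The function names are hypothetical; the bit positions are the ones given above.

```c
/* event_idx.code for a HARDWARE CACHE event:
 * code[11:3] = cache_id, code[2:1] = op_id, code[0] = result_id */
static inline unsigned int sbi_pmu_hw_cache_code(unsigned int cache_id,
                                                 unsigned int op_id,
                                                 unsigned int result_id)
{
    return ((cache_id & 0x1FF) << 3) | ((op_id & 0x3) << 1) | (result_id & 0x1);
}

/* Full event_idx with type == 0x1 (HARDWARE CACHE) and info == 0. */
static inline unsigned long sbi_pmu_hw_cache_event_idx(unsigned int cache_id,
                                                       unsigned int op_id,
                                                       unsigned int result_id)
{
    return (0x1UL << 12) | sbi_pmu_hw_cache_code(cache_id, op_id, result_id);
}
```

For example, a DTLB (3) write (1) miss (1) gives code 27 (0x01B), so the event_idx would be 0x101B.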

If event_idx.type == 0x2 then it is a HARDWARE RAW event. For a HARDWARE RAW
event, both event_idx.info and event_idx.code are platform dependent.

If event_idx.type == 0xf then it is a SOFTWARE event. For a SOFTWARE event,
event_idx.info is SBI implementation specific and event_idx.code can be
one of the following:
enum sbi_pmu_sw_id {
    SBI_PMU_SW_MISALIGNED_LOAD = 0,
    SBI_PMU_SW_MISALIGNED_STORE = 1,
    SBI_PMU_SW_ILLEGAL_INSN = 2,
    SBI_PMU_SW_LOCAL_SET_TIMER = 3,
    SBI_PMU_SW_LOCAL_IPI = 4,
    SBI_PMU_SW_LOCAL_FENCE_I = 5,
    SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
    SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
    SBI_PMU_SW_MAX, /* non-ABI */
};
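
Putting the four event types together, an SBI implementation might classify and sanity-check an incoming event_idx roughly as follows. This is only a sketch: the `pmu_event_class` enum and the function name are invented here, and the numeric limits mirror the *_MAX values of the enums above (10 hardware events, 7 cache ids, 3 cache ops, 12 software events).

```c
enum pmu_event_class {
    PMU_EVENT_HW,
    PMU_EVENT_HW_CACHE,
    PMU_EVENT_HW_RAW,
    PMU_EVENT_SW,
    PMU_EVENT_INVALID
};

enum pmu_event_class pmu_classify_event(unsigned long event_idx)
{
    unsigned int type = (unsigned int)((event_idx >> 12) & 0xF);
    unsigned int code = (unsigned int)(event_idx & 0xFFF);

    switch (type) {
    case 0x0: /* HARDWARE: code < SBI_PMU_HW_MAX */
        return code < 10 ? PMU_EVENT_HW : PMU_EVENT_INVALID;
    case 0x1: { /* HARDWARE CACHE: check cache_id and op_id sub-fields */
        unsigned int cache_id = code >> 3;
        unsigned int op_id = (code >> 1) & 0x3;
        /* result_id is a single bit, so it is always below RESULT_MAX */
        return (cache_id < 7 && op_id < 3) ? PMU_EVENT_HW_CACHE
                                           : PMU_EVENT_INVALID;
    }
    case 0x2: /* HARDWARE RAW: code and info are platform dependent */
        return PMU_EVENT_HW_RAW;
    case 0xF: /* SOFTWARE: code < SBI_PMU_SW_MAX */
        return code < 12 ? PMU_EVENT_SW : PMU_EVENT_INVALID;
    default:  /* types 0x3..0xe are not defined by this proposal */
        return PMU_EVENT_INVALID;
    }
}
```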

In the future, more events can be defined without breaking the ABI
compatibility of SBI calls.

Using the definitions of counter_idx and event_idx, we can potentially have
the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   This call will return the number of counters.
2. SBI_PMU_COUNTER_DESCRIBE
   This call takes two parameters: 1) counter_idx 2) physical address
   It will write the description of the SBI PMU counter at the specified
   physical address. The details of the SBI PMU counter written at the
   specified physical address are as follows:
   1. Name (64 bytes)
   2. CSR_Number (2 bytes)
      (CSR_Number <= 0xfff means the counter is a RISC-V CSR)
      (CSR_Number > 0xfff means the counter is an SBI implementation counter)
      (E.g. CSR_Number == 0xC03 implies the HPMCOUNTER3 CSR)
   3. CSR_Width (2 bytes)
      (Number of CSR bits implemented in HW)
   4. Event_Count (2 bytes)
      (Number of events in the Event_List array)
   5. Event_List (2 * Event_Count bytes)
      (This is an array of 16-bit values where each 16-bit value is a
      supported event_idx.type and event_idx.code combination)
3. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters: 1) counter_idx 2) physical address
   It will set the physical address of the memory location where the SBI
   implementation will write the 64-bit SOFTWARE counter. This SBI call
   is only for counters not mapped to any CSR (i.e. only for counters
   with CSR_Number > 0xfff).
4. SBI_PMU_COUNTER_START
   This call takes two parameters: 1) counter_idx 2) event_idx
   It will inform the SBI implementation to configure and start/enable
   the specified counter on the calling HART to monitor a specific event.
   This SBI call will fail for counters which are not present, or when
   the specified event_idx is not supported by the counter.
5. SBI_PMU_COUNTER_STOP
   This call takes one parameter: 1) counter_idx
   It will inform the SBI implementation to stop/disable the specified
   counter on the calling HART. This SBI call will fail for counters
   which are not present.
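
The counter description written by SBI_PMU_COUNTER_DESCRIBE could be parsed on the S-mode side along these lines. The proposal gives field sizes only, so this sketch assumes the fields are packed back to back with no padding and stored little-endian; the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pmu_counter_desc {
    char name[65];             /* 64-byte Name plus a terminating NUL */
    uint16_t csr_number;
    uint16_t csr_width;
    uint16_t event_count;
    const uint8_t *event_list; /* points into the caller's buffer */
};

static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short for the description. */
int pmu_parse_counter_desc(const uint8_t *buf, size_t len,
                           struct pmu_counter_desc *out)
{
    if (len < 70)              /* 64 + 2 + 2 + 2 fixed bytes */
        return -1;
    memcpy(out->name, buf, 64);
    out->name[64] = '\0';
    out->csr_number = rd16le(buf + 64);
    out->csr_width = rd16le(buf + 66);
    out->event_count = rd16le(buf + 68);
    if (len < 70 + 2 * (size_t)out->event_count)
        return -1;
    out->event_list = buf + 70;
    return 0;
}

/* CSR_Number <= 0xfff: the counter is a RISC-V CSR; otherwise it is an
 * SBI implementation (SOFTWARE) counter. */
int pmu_counter_is_csr(const struct pmu_counter_desc *d)
{
    return d->csr_number <= 0xFFF;
}

/* i-th supported {type, code} combination; the caller keeps i < event_count. */
uint16_t pmu_counter_event(const struct pmu_counter_desc *d, unsigned int i)
{
    return rd16le(d->event_list + 2 * (size_t)i);
}
```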

From the above, the RISC-V PMU driver will use most of the SBI calls at boot
time. Only SBI_PMU_COUNTER_START needs to be called once before using a counter.
A counter is read by reading the CSR (for CSR_Number <= 0xfff) or by
reading the memory location (for CSR_Number > 0xfff). Counter overflow
handling will have to be done in software by the Linux kernel.
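
The proposal does not say how software overflow handling should work. One common approach, sketched here with hypothetical names, is to keep a 64-bit running total and add the wrapped difference between successive raw reads, using CSR_Width as the counter width. This assumes the counter is sampled often enough that it wraps at most once between reads.

```c
#include <stdint.h>

struct pmu_soft_counter {
    uint64_t total;     /* accumulated 64-bit count */
    uint64_t prev_raw;  /* last raw value read from the CSR or memory */
    unsigned int width; /* CSR_Width: number of implemented counter bits */
};

static uint64_t pmu_width_mask(unsigned int width)
{
    return width >= 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
}

/* Fold a new raw reading into the running total, tolerating one wrap. */
void pmu_counter_update(struct pmu_soft_counter *c, uint64_t raw)
{
    uint64_t mask = pmu_width_mask(c->width);
    uint64_t delta = (raw - c->prev_raw) & mask;

    c->total += delta;
    c->prev_raw = raw & mask;
}
```

With an 8-bit counter that was last read at 250 and now reads 4, the masked difference is 10, which is the number of increments across the wrap.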

Using the SBI PMU extension, the M-mode runtime firmware (or Hypervisors)
can provide a standardized view of HARDWARE/SOFTWARE counters and events
to S-mode (or VS-mode) software.

The M-mode runtime firmware (OpenSBI) will need to know the following
platform-dependent information:
1. Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. HPMCOUNTER)
2. Mapping of event_idx for HARDWARE event to HPMEVENT CSR value
3. Mapping of event_idx for HARDWARE CACHE event to HPMEVENT CSR value
4. Mapping of event_idx for HARDWARE RAW event to HPMEVENT CSR value
5. Additional platform-specific programming required by any event_idx

All the platform-dependent information mentioned above can be obtained
by the M-mode runtime firmware (OpenSBI) from platform-specific code. The
DT/ACPI can also be used to describe 1), 2), 3), and 4) mentioned
above, but 5) will always require platform-specific code.
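
Items 1) to 4) amount to a lookup table from event_idx to an HPMEVENT CSR value plus the set of counters that can monitor the event. A sketch of such a table follows; the struct, the function, and every table value are invented for illustration and do not describe any real platform or the OpenSBI API.

```c
#include <stddef.h>
#include <stdint.h>

struct pmu_event_map {
    uint16_t event;        /* event_idx.type and event_idx.code, bits [15:0] */
    uint32_t counter_mask; /* bit N set: hpmcounterN can monitor this event */
    uint64_t hpmevent;     /* value to program into the HPMEVENT CSR */
};

/* Made-up example entries. */
static const struct pmu_event_map example_platform_map[] = {
    { 0x0003, 0x00000018u, 0x102 }, /* HW cache misses, counters 3 and 4 */
    { 0x0005, 0x00000008u, 0x205 }, /* HW branch misses, counter 3 only */
    { 0x101B, 0x00000010u, 0x31B }, /* DTLB write miss, counter 4 only */
};

/* Look up the HPMEVENT value for an event on a given counter.
 * Returns 0 and fills *hpmevent_out on success, -1 if unsupported. */
int pmu_lookup_hpmevent(const struct pmu_event_map *map, size_t n,
                        unsigned int counter, uint16_t event,
                        uint64_t *hpmevent_out)
{
    if (counter >= 32)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (map[i].event == event && ((map[i].counter_mask >> counter) & 1)) {
            *hpmevent_out = map[i].hpmevent;
            return 0;
        }
    }
    return -1;
}
```

The same table could be populated from DT/ACPI instead of being compiled in; item 5) would remain a platform-specific callback.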

Regards,
Anup


Re: Proposal v2: SBI PMU Extension

Zong Li
 

On Mon, Jul 6, 2020 at 12:35 AM Anup Patel <anup.patel@...> wrote:

To define SBI PMU extension, we first define counter_idx which is a
logical number assigned to a counter and event_idx which is an encoded
number representing the HARDWARE/SOFTWARE event to be monitored.

Is there more detail about counter_idx? I was wondering:
1. What is the ordering of logical numbers for HW and SW counters? I
think that the logical numbers are assigned by OpenSBI.
2. How do we know the logical number (counter_idx) of each HW and SW
counter from S-mode? I guess that we need to know the logical numbers
of all counters before we invoke an SBI call.


Using definition of counter_idx and event_idx, we can potentially have
the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs
Is it for the SW counters only, with the number of HW counters obtained from DT?
Or does it return the number of both HW and SW counters? If so, how do we
distinguish the number of HW counters from the number of SW counters?
The other question is whether the number of SW counters is defined by the
core of OpenSBI or is platform-dependent.

2. SBI_PMU_COUNTER_DESCRIBE
This call takes two parameters: 1) counter_idx 2) physical address
It will write the description of SBI PMU counter at specified physical
address. The details of the SBI PMU counter written at specified
physical address are as follows:
1. Name (64 bytes)
2. CSR_Number (2 bytes)
(CSR_Number <= 0xfff means counter is a RISC-V CSR)
(CSR_Number > 0xfff means counter is a SBI implementation counter)
(E.g. CSR_Number == 0xC03 implies the HPMCOUNTER3 CSR)
3. CSR_Width (2 bytes)
(Number of CSR bits implemented in HW)
4. Event_Count (2 bytes)
(Number of events in Event_List array)
5. Event_List (2 * Event_Count bytes)
(This is an array of 16bit values where each 16bit value is the
supported event_idx.type and event_idx.code combination)
What size should we allocate at this physical address? In my
understanding, we need to allocate the pages in S-mode first, then pass
the address of the pages as the second parameter, but we don't know
Event_Count before we allocate the space for it, so the write might
cross the boundary of the allocation if Event_Count is very big.
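
For reference, the size in question follows directly from the field sizes listed above: 64 + 2 + 2 + 2 fixed bytes plus 2 bytes per Event_List entry. A small sketch of the arithmetic, with hypothetical function names:

```c
#include <stddef.h>

/* Fixed part of the description: Name + CSR_Number + CSR_Width + Event_Count. */
#define PMU_DESC_FIXED_BYTES (64 + 2 + 2 + 2)

/* Total bytes written for a counter with event_count Event_List entries. */
size_t pmu_desc_bytes(unsigned int event_count)
{
    return PMU_DESC_FIXED_BYTES + 2 * (size_t)event_count;
}

/* Number of pages of page_size bytes needed to hold the description. */
size_t pmu_desc_pages(unsigned int event_count, size_t page_size)
{
    return (pmu_desc_bytes(event_count) + page_size - 1) / page_size;
}
```

Since Event_Count is a 2-byte field, the worst case is 70 + 2 * 65535 = 131140 bytes, i.e. 33 pages of 4 KiB, so a caller that cannot learn Event_Count in advance would have to size its buffer for that bound.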


The M-mode runtime firmware (OpenSBI) will need to know the following
platform-dependent information:
1. Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. HPMCOUNTER)
2. Mapping of event_idx for HARDWARE event to HPMEVENT CSR value
3. Mapping of event_idx for HARDWARE CACHE event to HPMEVENT CSR value
4. Mapping of event_idx for HARDWARE RAW event to HPMEVENT CSR value
5. Additional platform-specific programming required by any event_idx

All the platform-dependent information mentioned above can be obtained
by the M-mode runtime firmware (OpenSBI) from platform-specific code. The
DT/ACPI can also be used to describe 1), 2), 3), and 4) mentioned
above, but 5) will always require platform-specific code.

I will update the next version of the DT file to describe points
1) to 4). Thanks.
As you mentioned before, it would be hard to keep the platform-specific
code in sync with the DT that is actually in use.
I prefer to get 1), 2), 3) and 4) from DT first on each platform, and
use platform-specific code if DT is unavailable (the generic platform
certainly uses DT);
then we could minimize the inconsistency.


Regards,
Anup



Re: Proposal v2: SBI PMU Extension

Anup Patel
 

-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 06 July 2020 13:59
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Andrew Waterman
<andrew@...>; Greg Favor <gfavor@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v2: SBI PMU
Extension

On Mon, Jul 6, 2020 at 12:35 AM Anup Patel <anup.patel@...> wrote:


To define SBI PMU extension, we first define counter_idx which is a
logical number assigned to a counter and event_idx which is an encoded
number representing the HARDWARE/SOFTWARE event to be monitored.

Is there more detail about counter_idx? I was wondering:
1. What is the ordering of logical numbers for HW and SW counters? I
think that the logical numbers are assigned by OpenSBI.

As mentioned here, counter_idx is a logical index for all available counters
(i.e. HARDWARE and SOFTWARE). The SBI implementation (i.e. OpenSBI,
Xvisor RISC-V, or KVM RISC-V) can assign counter_idx to HARDWARE and
SOFTWARE counters in any order it likes.

2. How do we know the logical number (counter_idx) of each HW and SW
counter from S-mode? I guess that we need to know the logical numbers of
all counters before we invoke an SBI call.

The SBI_PMU_COUNTER_DESCRIBE call mentioned below will tell us
whether a given counter_idx maps to a HARDWARE counter or a SOFTWARE
counter, based on the CSR_Number info it returns.

The SBI PMU event_idx is a XLEN bits wide number encoded as follows:
event_idx[XLEN-1:16] = info
event_idx[15:12] = type
event_idx[11:0] = code
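
For illustration, the layout above could be packed and unpacked roughly as
follows. This is only a sketch: the macro and function names are hypothetical
and not part of the proposal, and it assumes unsigned long is XLEN bits wide
(as it is on the usual RISC-V ILP32/LP64 ABIs).

```c
#include <assert.h>

/* Field positions follow the proposed layout; all names are illustrative. */
#define EVENT_IDX_CODE_BITS   12u
#define EVENT_IDX_TYPE_BITS   4u
#define EVENT_IDX_TYPE_SHIFT  EVENT_IDX_CODE_BITS
#define EVENT_IDX_INFO_SHIFT  (EVENT_IDX_CODE_BITS + EVENT_IDX_TYPE_BITS)
#define EVENT_IDX_CODE_MASK   ((1ul << EVENT_IDX_CODE_BITS) - 1)
#define EVENT_IDX_TYPE_MASK   ((1ul << EVENT_IDX_TYPE_BITS) - 1)

/* Pack info/type/code into one XLEN-wide event_idx value. */
static unsigned long event_idx_encode(unsigned long info,
                                      unsigned long type,
                                      unsigned long code)
{
    assert(type <= EVENT_IDX_TYPE_MASK);
    assert(code <= EVENT_IDX_CODE_MASK);
    return (info << EVENT_IDX_INFO_SHIFT) |
           (type << EVENT_IDX_TYPE_SHIFT) |
           code;
}

/* Extract the individual fields again. */
static unsigned long event_idx_info(unsigned long idx)
{
    return idx >> EVENT_IDX_INFO_SHIFT;
}

static unsigned long event_idx_type(unsigned long idx)
{
    return (idx >> EVENT_IDX_TYPE_SHIFT) & EVENT_IDX_TYPE_MASK;
}

static unsigned long event_idx_code(unsigned long idx)
{
    return idx & EVENT_IDX_CODE_MASK;
}
```

With these helpers, a SOFTWARE event (type 0xf) with code 4 and info 0 would
be encoded as 0xf004.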

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE
event, the event_idx.info is optional and can be passed zero whereas
the event_idx.code can be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_idx.info is optional and can be passed zero
whereas the event_idx.code is encoded as follows:
event_idx.code[11:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
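
As a rough sketch of how the three cache sub-fields combine into the 12-bit
event_idx.code, assuming the bit positions given above (the function names
are made up for this example and are not part of the proposal):

```c
#include <assert.h>

/* event_idx.code[11:3] = cache_id, [2:1] = op_id, [0] = result_id */
static unsigned cache_event_code(unsigned cache_id, unsigned op_id,
                                 unsigned result_id)
{
    assert(cache_id < (1u << 9));   /* 9 bits available for cache_id */
    assert(op_id < (1u << 2));      /* 2 bits available for op_id */
    assert(result_id < (1u << 1));  /* 1 bit available for result_id */
    return (cache_id << 3) | (op_id << 1) | result_id;
}

/* Decoders for the same three sub-fields. */
static unsigned cache_event_cache_id(unsigned code)
{
    return (code >> 3) & 0x1ffu;
}

static unsigned cache_event_op_id(unsigned code)
{
    return (code >> 1) & 0x3u;
}

static unsigned cache_event_result_id(unsigned code)
{
    return code & 0x1u;
}
```

For example, a DTLB (3) read (0) miss (1) gives code 0x19, so the full
event_idx with type 0x1 and info 0 would be 0x1019.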

If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, both event_idx.info and event_idx.code are platform
dependent.

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, event_idx.info is SBI implementation specific and
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};

In future, more events can be defined without breaking ABI
compatibility of SBI calls.

Using definition of counter_idx and event_idx, we can potentially have
the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs
Is it for the SW counters and we get the number of HW counters by DT?
Or does it return the number of HW and SW counters both? If so, how to
distinguish the number of HW and SW?
This call returns total number of counters (i.e. HARDWARE and SOFTWARE both)

The other question: is the number of SW counters defined by the core
of OpenSBI, or is it platform-dependent?
The number of SW counters is defined by the SBI implementation (i.e. OpenSBI,
Xvisor RISC-V, and KVM RISC-V). Most likely SW counters will not include
any platform-dependent SW counters, although this is a design choice of
the SBI implementation.


2. SBI_PMU_COUNTER_DESCRIBE
This call takes two parameters: 1) counter_idx 2) physical address
It will write the description of SBI PMU counter at specified physical
address. The details of the SBI PMU counter written at specified
physical address are as follows:
1. Name (64 bytes)
2. CSR_Number (2 bytes)
(CSR_Number <= 0xfff means counter is a RISC-V CSR)
(CSR_Number > 0xfff means counter is a SBI implementation counter)
(E.g. CSR_Number == 0xC02 implies HPMCOUNTER2 CSR)
3. CSR_Width (2 bytes)
(Number of CSR bits implemented in HW)
4. Event_Count (2 bytes)
(Number of events in Event_List array)
5. Event_List (2 * Event_Count bytes)
(This is an array of 16bit values where each 16bit value is the
supported event_idx.type and event_idx.code combination)
What is the size we should allocate for this physical address? In my
understanding, we need to allocate the pages in S-mode first, then pass the
address of the pages as the second parameter, but we don't know
Event_Count before we allocate the space for it, so the description might
cross the boundary if Event_Count is very big.
Theoretically, Event_Count cannot be more than 65535.

I think we should have an SBI_PMU_NUM_EVENTS call which will
return the number of events supported by a given counter_idx. This will
help S-mode software to determine amount of memory to allocate
for SBI_PMU_COUNTER_DESCRIBE.
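
To make the sizing concrete, here is a small sketch of the buffer size
computation based on the field sizes listed for SBI_PMU_COUNTER_DESCRIBE.
The struct and function names are hypothetical, and the exact in-memory
layout (field order without padding, byte order) is an assumption since the
proposal does not spell it out.

```c
#include <stddef.h>
#include <stdint.h>

#define PMU_DESC_NAME_LEN 64u

/* Fixed-size part of the description (fields 1 to 4 above). */
struct pmu_counter_desc_hdr {
    char     name[PMU_DESC_NAME_LEN];  /* 1. Name (64 bytes) */
    uint16_t csr_number;               /* 2. CSR_Number (2 bytes) */
    uint16_t csr_width;                /* 3. CSR_Width (2 bytes) */
    uint16_t event_count;              /* 4. Event_Count (2 bytes) */
    /* 5. Event_List: event_count entries of 16 bits each follow */
};

/* 64 + 2 + 2 + 2 bytes, with no padding expected for this layout. */
_Static_assert(sizeof(struct pmu_counter_desc_hdr) == 70,
               "unexpected padding in description header");

/* Bytes S-mode software would need to allocate for one description. */
static size_t pmu_desc_size(uint16_t event_count)
{
    return sizeof(struct pmu_counter_desc_hdr) +
           (size_t)event_count * sizeof(uint16_t);
}
```

Under these assumptions the worst case (Event_Count == 65535) is 131140
bytes, i.e. 33 pages of 4 KiB, while a counter with 10 events needs only
90 bytes.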


3. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters: 1) counter_idx 2) physical address
It will set the physical address of memory location where the SBI
implementation will write the 64bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number > 0xfff).
4. SBI_PMU_COUNTER_START
This call takes two parameters: 1) counter_idx 2) event_idx
It will inform SBI implementation to configure and start/enable
specified counter on the calling HART to monitor specific event.
This SBI call will fail for counters which are not present or when the
specified event_idx is not supported by the counter.
5. SBI_PMU_COUNTER_STOP
This call takes one parameter: 1) counter_idx
It will inform SBI implementation to stop/disable specified counters
on the calling HART. This SBI call will fail for counters which are
not present.

From above, the RISC-V PMU driver will use most of the SBI calls at
boot time. Only SBI_PMU_COUNTER_START needs to be called once before
using a counter.
The reading of counter is by reading CSR (for CSR_Number <= 0xfff) OR
by reading memory location (for CSR_Number > 0xfff). The counter
overflow handling will have to be done in software by Linux kernel.
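
A minimal sketch of what software overflow handling could look like,
assuming the kernel samples the counter often enough that it wraps at most
once between two reads. The struct and function names are hypothetical; only
CSR_Width and the 0xfff boundary come from the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number <= 0xfff: read a CSR; otherwise read the memory location. */
static bool pmu_counter_is_csr(uint16_t csr_number)
{
    return csr_number <= 0xfff;
}

struct pmu_counter_state {
    unsigned width;  /* CSR_Width reported by SBI_PMU_COUNTER_DESCRIBE */
    uint64_t prev;   /* last raw value read from the counter */
    uint64_t total;  /* accumulated 64-bit event count */
};

/* Mask covering the implemented counter bits. */
static uint64_t pmu_width_mask(unsigned width)
{
    assert(width >= 1 && width <= 64);
    return width == 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* Fold a new raw reading into the 64-bit total, tolerating one wrap. */
static uint64_t pmu_counter_update(struct pmu_counter_state *s, uint64_t raw)
{
    uint64_t mask = pmu_width_mask(s->width);
    uint64_t delta = (raw - s->prev) & mask;  /* modular difference */

    s->prev = raw & mask;
    s->total += delta;
    return s->total;
}
```

For an 8-bit counter last read at 250, a new reading of 5 is treated as 11
events (the counter wrapped past 255), not as a negative difference.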

Using the SBI PMU extension, the M-mode runtime firmware (or
Hypervisors) can provide a standardized view of HARDWARE/SOFTWARE
counters and events to S-mode (or VS-mode) software.

The M-mode runtime firmware (OpenSBI) will need to know following
platform dependent information:
1. Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. HPMCOUNTER)
2. Mapping of event_idx for HARDWARE event to HPMEVENT CSR value
3. Mapping of event_idx for HARDWARE CACHE event to HPMEVENT CSR
value
4. Mapping of event_idx for HARDWARE RAW event to HPMEVENT CSR value
5. Additional platform-specific programming required by any event_idx

All platform dependent information mentioned above can be obtained by
M-mode runtime firmware (OpenSBI) from platform specific code. The
DT/ACPI can also be used to describe 1), 2), 3), and 4) mentioned
above but 5) will always require platform specific code.
I would update the next version of DT file to describe the points from
1) to 4). Thanks.
As you mentioned before, it would be hard to keep the platform specific code
in sync with the DT actually in use.
I prefer to get 1), 2), 3) and 4) from DT first on each platform, and use
platform specific code if DT is unavailable (the generic platform certainly
uses DT); then we could maximally reduce the inconsistency.
It should be the platform's choice how it wants to describe HARDWARE
events and HARDWARE counters. The OpenSBI generic platform will
tend to use DT based parsing of HARDWARE events and HARDWARE
counters but other platforms can do things differently.

The S-mode software (i.e. Linux) should not get HARDWARE events
and HARDWARE counters from DT because DT describes HARDWARE
and DT will not include SOFTWARE events and SOFTWARE counters.
Also, SOFTWARE events and SOFTWARE counters will change for given
platform as OpenSBI continues to improve so it will be hard to keep
the DT in sync.

The best thing for S-mode software would be to depend on one
method of discovering all counters and supported events, which
is the SBI_PMU_COUNTER_DESCRIBE call. In other words, there is no
need for a platform driver for the Linux RISC-V PMU driver; instead it
can depend only on sbi_probe_extension() to detect the SBI PMU extension.

Regards,
Anup
