Re: Proposal v5: SBI PMU Extension


Anup Patel
 

Hi All,

Small correction required here...

MHPMEVENT == zero means "No Event" as per the RISC-V privileged spec, so
event_idx.type == 0 and event_idx.code == 0 should be treated as "No Event".

Regards,
Anup

-----Original Message-----
From: tech-unixplatformspec@... <tech-
unixplatformspec@...> On Behalf Of Anup Patel
Sent: 14 August 2020 12:56
To: tech-unixplatformspec@...
Cc: Atish Patra <Atish.Patra@...>; Andrew Waterman
<andrew@...>; Greg Favor <gfavor@...>; Jonathan
Behrens <behrensj@...>
Subject: [RISC-V] [tech-unixplatformspec] Proposal v5: SBI PMU Extension

Hi All,

We don't have a dedicated RISC-V PMU extension for all privilege modes, but
we do have M-mode HARDWARE performance counters such as the MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs, which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various HARDWARE
events using a limited number of HARDWARE performance counters.

In addition to HARDWARE performance counters, an SBI implementation (e.g.
OpenSBI, Xvisor, KVM, etc.) can provide XLEN-wide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.

We propose the SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU)
HARDWARE/SOFTWARE counters.

Using the SBI PMU extension, an SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.

Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents
a HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.

The event_idx is a 20-bit wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
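
As a minimal sketch of the encoding above, the helpers below pack and
unpack event_idx. The macro and function names are hypothetical (they are
not part of the proposal); only the bit positions come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout from the proposal: event_idx[19:16] = type, event_idx[15:0] = code. */
#define SBI_PMU_EVENT_IDX_CODE_MASK  0xffffu
#define SBI_PMU_EVENT_IDX_TYPE_MASK  0xfu
#define SBI_PMU_EVENT_IDX_TYPE_SHIFT 16

/* Hypothetical helper: build a 20-bit event_idx from its type and code. */
static inline uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= SBI_PMU_EVENT_IDX_TYPE_MASK);
    assert(code <= SBI_PMU_EVENT_IDX_CODE_MASK);
    return (type << SBI_PMU_EVENT_IDX_TYPE_SHIFT) | code;
}

/* Hypothetical helper: extract event_idx.type (bits 19:16). */
static inline uint32_t sbi_pmu_event_idx_type(uint32_t event_idx)
{
    return (event_idx >> SBI_PMU_EVENT_IDX_TYPE_SHIFT) & SBI_PMU_EVENT_IDX_TYPE_MASK;
}

/* Hypothetical helper: extract event_idx.code (bits 15:0). */
static inline uint32_t sbi_pmu_event_idx_code(uint32_t event_idx)
{
    return event_idx & SBI_PMU_EVENT_IDX_CODE_MASK;
}
```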

If event_idx.type == 0x0 then it is a HARDWARE event. For a HARDWARE event,
non-zero values of event_info are reserved for future use whereas the
event_idx.code can be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is a HARDWARE CACHE event. For a HARDWARE
CACHE event, non-zero values of event_info are reserved for future use
whereas the event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
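
The HARDWARE CACHE encoding above could be packed as in the following
sketch. The function name is hypothetical; the field positions
(cache_id in code[15:3], op_id in code[2:1], result_id in code[0:0]) and
the type value 0x1 are taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the event_idx of a HARDWARE CACHE event.
 * The arguments are values from enum sbi_pmu_hw_cache_id,
 * enum sbi_pmu_hw_cache_op_id and enum sbi_pmu_hw_cache_op_result_id.
 */
static inline uint32_t sbi_pmu_hw_cache_event_idx(uint32_t cache_id,
                                                  uint32_t op_id,
                                                  uint32_t result_id)
{
    assert(cache_id <= 0x1fff); /* 13 bits: code[15:3] */
    assert(op_id <= 0x3);       /*  2 bits: code[2:1]  */
    assert(result_id <= 0x1);   /*  1 bit:  code[0:0]  */

    uint32_t code = (cache_id << 3) | (op_id << 1) | result_id;

    return (0x1u << 16) | code; /* event_idx.type == 0x1 */
}
```

For example, a DTLB write miss (cache_id = 3, op_id = 1, result_id = 1)
yields code 0x1b and therefore event_idx 0x1001b.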

If event_idx.type == 0x2 then it is a HARDWARE RAW event. For a HARDWARE
RAW event, the event_idx.code should be zero and the event_info parameter
passed to the SBI_PMU_COUNTER_CONFIG_MATCHING call (described below) will
have the RAW event value to be programmed in the MHPMEVENT CSR.

If event_idx.type == 0xf then it is a SOFTWARE event. For a SOFTWARE event,
the event_info is optional and can be zero whereas the event_idx.code can
be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};

In the future, more events can be defined without breaking SBI call
backward compatibility.

Using the above definitions of counter_idx, event_idx, and event_info, we
can potentially have the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
Return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
Provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
return_value[XLEN-1:20] = Reserved
If CSR_Number == 0xfff then it is a SOFTWARE counter with CSR_Width
fixed to XLEN; otherwise it is a HARDWARE counter with CSR_Width <= 64.
This SBI call will fail for counters which are not present.

3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
Find and configure a counter from a set of counters which can monitor
specified event. The counter_idx_base and counter_idx_mask parameters
represent the set of counters whereas the event_idx and event_info
represent the event to monitor. Upon success the SBI call will return
the counter_idx of the counter which has been configured to monitor
specified event. This SBI call will fail if it is unable to find a
counter which can monitor specified event or the set of counters
specified via counter_idx_base and counter_idx_mask has an invalid
counter.

4. SBI_PMU_COUNTER_SOFT_READ
This call takes one parameter:
1) counter_idx
Read a SOFTWARE counter value. This SBI call is only for SOFTWARE
counters (i.e. only for counters with CSR_Number == 0xfff) and it
will fail for counters which are not present.

5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform the SBI implementation to start/enable the specified
counter with the specified initial value. This SBI call will fail for
counters which are not present.

6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform the SBI implementation to stop/disable the specified
counter on the calling HART. This SBI call will fail for counters which
are not present.
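
As an illustration of the SBI_PMU_COUNTER_GET_CSR return value encoding
described above, S-mode software could decode it roughly as follows. This
is only a sketch: the struct and function names are hypothetical, and the
SBI call itself is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded return value. */
struct sbi_pmu_counter_csr {
    uint16_t csr_num;     /* return_value[11:0]  */
    uint8_t  csr_width;   /* return_value[19:12], bits implemented in HW */
    bool     is_software; /* true when CSR_Number == 0xfff */
};

/* Decode the value returned by a successful SBI_PMU_COUNTER_GET_CSR call. */
static inline struct sbi_pmu_counter_csr
sbi_pmu_decode_counter_csr(unsigned long ret)
{
    struct sbi_pmu_counter_csr c;

    c.csr_num = (uint16_t)(ret & 0xfff);
    c.csr_width = (uint8_t)((ret >> 12) & 0xff);
    c.is_software = (c.csr_num == 0xfff);
    /* return_value[XLEN-1:20] is reserved and ignored here. */
    return c;
}
```

A SOFTWARE counter (is_software set) would then be read with
SBI_PMU_COUNTER_SOFT_READ, while a HARDWARE counter would be read directly
from the CSR given by csr_num.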

The OpenSBI (M-mode runtime firmware) Development Notes:

1. The OpenSBI firmware will translate event_idx and event_info into a
platform dependent MHPMEVENT CSR value before starting/enabling a
HARDWARE counter.

2. The OpenSBI firmware will need to know following platform dependent
information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. MHPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional and by default OpenSBI will write a value
<xyz> to the MHPMEVENT CSR where the lower 20 bits of <xyz> are
event_idx and the upper XLEN-20 bits of <xyz> are the lower XLEN-20
bits of event_info
C) Sanity check of event_info for HARDWARE RAW event. This is optional
and by default OpenSBI will blindly write event_info to MHPMEVENT
CSRs without any checks.
D) Additional platform-specific programming required for selecting an
event_idx + event_info combination. This is also optional for a platform.

3. All platform dependent information mentioned above, can be obtained
by OpenSBI firmware from platform specific code. The DT/ACPI can
also be used to describe 2.A and 2.B mentioned above but 2.C and
2.D will always require platform specific code.
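
The default mapping in 2.B can be sketched as below. The function name and
the explicit xlen parameter are assumptions made for illustration; a real
firmware would use its native XLEN and may apply a platform-specific
mapping instead.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of the default MHPMEVENT value for HARDWARE and
 * HARDWARE CACHE events: the lower 20 bits hold event_idx and the upper
 * XLEN-20 bits hold the lower XLEN-20 bits of event_info.
 */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info,
                                                 unsigned int xlen)
{
    /* Mask that keeps only the low xlen bits (xlen is 32 or 64). */
    uint64_t xlen_mask = (xlen >= 64) ? UINT64_MAX
                                      : ((UINT64_C(1) << xlen) - 1);
    uint64_t val = (event_info << 20) | (event_idx & 0xfffffu);

    return val & xlen_mask;
}
```

On RV32 only the lower 12 bits of event_info survive, since XLEN-20 is 12.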

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
The Linux RISC-V driver can be a platform driver with "riscv,pmu"
as DT compatible string and an optional "interrupts" DT property. The
"interrupts" DT property, if available, should specify the overflow
interrupt for each HART. When the "interrupts" DT property is present,
we might also need another DT property for mapping HARTID to entries
in the "interrupts" DT property. The platform driver probe will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. The driver can skip this in driver probe and instead do
this lazily in add() callback mentioned below.

2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id
based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by add() callback)

3. add() callback
The add() callback of Linux RISC-V PMU driver will find a free
counter on current CPU/HART such that the event_idx and event_info
combination is supported by the counter. To find-and-configure
a counter to monitor event_idx and event_info combination from
a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
call.

4. del() callback
The del() callback of Linux RISC-V PMU driver will release or
free the counter.

5. start() callback
The start() callback of Linux RISC-V PMU driver will start the
counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop the
counter using the SBI_PMU_COUNTER_STOP call.
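
The translation done by the event_init() callback (cases A to D above)
could look roughly like the sketch below. It is not the actual driver:
the function and struct names are hypothetical, the PERF_TYPE_* values and
the perf cache config layout (cache id in bits 7:0, op in bits 15:8,
result in bits 23:16) are copied from include/uapi/linux/perf_event.h so
the sketch builds standalone, and taking the sbi_pmu_sw_id from the low
16 bits of config in case D is purely an assumption, since the proposal
does not specify that mapping. The sketch follows the v5 numbering as
written and does not special-case the "No Event" value.

```c
#include <stdint.h>

/* Values copied from include/uapi/linux/perf_event.h. */
enum { PERF_TYPE_HARDWARE = 0, PERF_TYPE_HW_CACHE = 3, PERF_TYPE_RAW = 4 };

/* Limits matching the *_MAX enumerators in the proposal. */
#define SBI_PMU_HW_MAX              10
#define SBI_PMU_HW_CACHE_MAX        7
#define SBI_PMU_HW_CACHE_OP_MAX     3
#define SBI_PMU_HW_CACHE_RESULT_MAX 2
#define SBI_PMU_SW_MAX              12

struct sbi_pmu_event {          /* hypothetical result container */
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns 0 on success, -1 if the perf event cannot be translated. */
static int riscv_pmu_translate_event(uint32_t perf_type, uint64_t config,
                                     struct sbi_pmu_event *out)
{
    out->event_info = 0;

    switch (perf_type) {
    case PERF_TYPE_HARDWARE:            /* case A: type 0x0 */
        if (config >= SBI_PMU_HW_MAX)
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        return 0;
    case PERF_TYPE_HW_CACHE: {          /* case B: type 0x1 */
        uint32_t cache_id = config & 0xff;
        uint32_t op_id = (config >> 8) & 0xff;
        uint32_t result_id = (config >> 16) & 0xff;

        if ((config >> 24) || cache_id >= SBI_PMU_HW_CACHE_MAX ||
            op_id >= SBI_PMU_HW_CACHE_OP_MAX ||
            result_id >= SBI_PMU_HW_CACHE_RESULT_MAX)
            return -1;
        out->event_idx = (0x1u << 16) | (cache_id << 3) |
                         (op_id << 1) | result_id;
        return 0;
    }
    case PERF_TYPE_RAW:
        if (config >> 63) {             /* case D: type 0xf */
            uint64_t sw_id = config & 0xffff; /* assumed location */

            if (sw_id >= SBI_PMU_SW_MAX)
                return -1;
            out->event_idx = (0xfu << 16) | (uint32_t)sw_id;
        } else {                        /* case C: type 0x2 */
            out->event_idx = 0x2u << 16;
            out->event_info = config & ~(UINT64_C(1) << 63);
        }
        return 0;
    default:
        return -1;                      /* unsupported perf type */
    }
}
```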

Regards,
Anup
