Re: Proposal v3: SBI PMU Extension
On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 14 July 2020 09:02 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@... Subject: Re: Proposal v3: SBI PMU Extension
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
Hi All,
We don't have a dedicated RISC-V PMU extension, but we do have HARDWARE performance counters such as the CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V implementation can support monitoring various HARDWARE events using a limited number of HPMCOUNTER CSRs.
In addition to HARDWARE performance counters, an SBI implementation (e.g. OpenSBI, Xvisor, KVM, etc.) can provide SOFTWARE counters for events such as the number of RFENCEs, IPIs, misaligned load/store instructions, illegal instructions, etc.
We propose an SBI PMU extension, which will help S-mode (or VS-mode) software to discover and configure HARDWARE/SOFTWARE counters. The SBI PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE counters, which include the CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and SOFTWARE counters provided by the SBI implementation.
Using the SBI PMU extension, an SBI implementation (OpenSBI, KVM, or Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters and events to S-mode (or VS-mode) software.
To define the SBI PMU extension, we first define counter_idx, which is a logical number assigned to a counter, and event_idx, which is an encoded number representing the HARDWARE/SOFTWARE event to be monitored. A HARDWARE/SOFTWARE event can also have additional configuration/details referred to as event_info.
The SBI PMU event_idx is a 20-bit wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
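A minimal sketch of the event_idx layout described above, in C. The macro and helper names are hypothetical (they are not part of the proposal); only the bit positions come from the text.

```c
#include <stdint.h>

/* Bit layout from the proposal: type in [19:16], code in [15:0]. */
#define DEMO_EVENT_IDX_CODE_MASK  0xffffu
#define DEMO_EVENT_IDX_TYPE_SHIFT 16
#define DEMO_EVENT_IDX_TYPE_MASK  0xfu

/* Hypothetical helper: pack type and code into a 20-bit event_idx. */
static inline uint32_t demo_event_idx(uint32_t type, uint32_t code)
{
    return ((type & DEMO_EVENT_IDX_TYPE_MASK) << DEMO_EVENT_IDX_TYPE_SHIFT) |
           (code & DEMO_EVENT_IDX_CODE_MASK);
}

/* Hypothetical helper: extract event_idx.type. */
static inline uint32_t demo_event_type(uint32_t event_idx)
{
    return (event_idx >> DEMO_EVENT_IDX_TYPE_SHIFT) & DEMO_EVENT_IDX_TYPE_MASK;
}

/* Hypothetical helper: extract event_idx.code. */
static inline uint32_t demo_event_code(uint32_t event_idx)
{
    return event_idx & DEMO_EVENT_IDX_CODE_MASK;
}
```

For example, a SOFTWARE event (type 0xf) with code 4 packs to 0xf0004.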
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event, the event_info is not required whereas the event_idx.code can be one of the following values:
enum sbi_pmu_hw_id {
    SBI_PMU_HW_CPU_CYCLES = 0,
    SBI_PMU_HW_INSTRUCTIONS = 1,
    SBI_PMU_HW_CACHE_REFERENCES = 2,
    SBI_PMU_HW_CACHE_MISSES = 3,
    SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
    SBI_PMU_HW_BRANCH_MISSES = 5,
    SBI_PMU_HW_BUS_CYCLES = 6,
    SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
    SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
    SBI_PMU_HW_REF_CPU_CYCLES = 9,
    SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE CACHE event, the event_info is not required whereas the event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    SBI_PMU_HW_CACHE_L1D = 0,
    SBI_PMU_HW_CACHE_L1I = 1,
    SBI_PMU_HW_CACHE_LL = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU = 5,
    SBI_PMU_HW_CACHE_NODE = 6,
    SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    SBI_PMU_HW_CACHE_OP_READ = 0,
    SBI_PMU_HW_CACHE_OP_WRITE = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
    SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS = 1,
    SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
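The HARDWARE CACHE code encoding above can be sketched in C as follows. The helper name is hypothetical; the field positions (cache_id in [15:3], op_id in [2:1], result_id in [0:0]) are taken from the text.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build event_idx.code for a HARDWARE CACHE event.
 * cache_id occupies 13 bits, op_id 2 bits and result_id 1 bit.
 */
static inline uint32_t demo_hw_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    return ((cache_id & 0x1fffu) << 3) |
           ((op_id & 0x3u) << 1) |
           (result_id & 0x1u);
}

/* Hypothetical helper: full event_idx with type 0x1 (HARDWARE CACHE). */
static inline uint32_t demo_hw_cache_event_idx(uint32_t cache_id, uint32_t op_id,
                                               uint32_t result_id)
{
    return (0x1u << 16) | demo_hw_cache_code(cache_id, op_id, result_id);
}
```

With these definitions, a DTLB (3) write (1) miss (1) yields code 0x1b and event_idx 0x1001b.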
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE RAW event, the event_idx.code should be zero and the event_info parameter passed to the SBI_PMU_COUNTER_SET_EVENT call (described below) will have the RAW event value to be programmed in the MHPMEVENT CSR (i.e. the SBI implementation will not derive the MHPMEVENT CSR value from event_idx + event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE event, the event_info is not required whereas the event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
    SBI_PMU_SW_MISALIGNED_LOAD = 0,
    SBI_PMU_SW_MISALIGNED_STORE = 1,
    SBI_PMU_SW_ILLEGAL_INSN = 2,
    SBI_PMU_SW_LOCAL_SET_TIMER = 3,
    SBI_PMU_SW_LOCAL_IPI = 4,
    SBI_PMU_SW_LOCAL_FENCE_I = 5,
    SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
    SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
    SBI_PMU_SW_MAX, /* non-ABI */
};
In future, more events can be defined without breaking compatibility of existing SBI calls.
Using the definitions of counter_idx and event_idx, we can potentially have the following SBI calls:
1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
It will provide the CSR_Number and CSR_Width of underlying counter. The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is HARDWARE counter. This SBI call will fail for counters which are not present.
3. SBI_PMU_COUNTER_SET_EVENT
This call takes three parameters:
1) counter_idx
2) event_idx
3) event_info
It will select an event to be monitored by the given counter. If this SBI call is not used to select an event for a counter, then the counter will monitor the default event selected for it at boot time. This SBI call will fail for counters which are not present. It will also fail if the specified event_idx + event_info combination is not supported by the given counter.
It also seems to fail if the specified event is not supported by the given counter, right? Then the Linux driver could try to allocate the next free counter when this SBI call returns failure.
Yes, this call will fail if the event_idx + event_info combination is not supported by the given counter_idx. It is expected that the Linux driver will try another free counter if the SBI_PMU_COUNTER_SET_EVENT call fails. I have suggested a few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by looking at the CSR number assigned to the counter.
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter? Then S-mode software can conveniently find which counter is a good next candidate for the event. But there is a constraint: the maximum number of counters needs to be assumed to be less than XLEN. Do you think it is feasible?
Apart from this question above, this version of the proposal is great to me.
Cool
Regards, Anup
Thanks, Zong
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8-byte aligned physical address
It will set the physical address of the memory location where the SBI implementation will write the 64-bit SOFTWARE counter. This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform the SBI implementation to start/enable the specified counter with the specified initial value. This SBI call will fail for counters which are not present.
6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform the SBI implementation to stop/disable the specified counter on the calling HART. This SBI call will fail for counters which are not present.
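As a rough illustration of the SBI_PMU_COUNTER_GET_CSR return-value encoding listed above, the following C sketch decodes CSR_Number and CSR_Width and flags SOFTWARE counters. The struct and function names are hypothetical; only the bit fields and the 0xfff marker come from the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CSR_NUM_MASK    0xfffUL /* return_value[11:0]  */
#define DEMO_CSR_WIDTH_SHIFT 12
#define DEMO_CSR_WIDTH_MASK  0xffUL  /* return_value[19:12] */
#define DEMO_CSR_SOFTWARE    0xfffu  /* marks a SOFTWARE counter */

/* Hypothetical decoded view of one counter. */
struct demo_counter_info {
    uint16_t csr_num;
    uint8_t  width;
    bool     is_software;
};

/* Hypothetical helper: decode the value returned by SBI_PMU_COUNTER_GET_CSR. */
static inline struct demo_counter_info demo_decode_counter_csr(unsigned long ret)
{
    struct demo_counter_info info;

    info.csr_num = (uint16_t)(ret & DEMO_CSR_NUM_MASK);
    info.width = (uint8_t)((ret >> DEMO_CSR_WIDTH_SHIFT) & DEMO_CSR_WIDTH_MASK);
    info.is_software = (info.csr_num == DEMO_CSR_SOFTWARE);
    return info;
}
```

A 64-bit wide counter backed by the CYCLE CSR (0xc00) would be reported as (64 << 12) | 0xc00.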
The M-mode runtime firmware (OpenSBI) Development Notes:
1. The M-mode runtime firmware will have to translate SBI PMU event_idx and event_info into a platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The M-mode runtime firmware (OpenSBI) will need to know the following platform dependent information:
A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. HPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE events to the MHPMEVENT CSR value. This is optional for a platform. By default, OpenSBI will write a value <xyz> to the MHPMEVENT CSR where the lower 20 bits of <xyz> are event_idx and the upper XLEN-20 bits of <xyz> are the lower XLEN-20 bits of event_info
C) Additional platform-specific programming required for selecting an event_idx + event_info combination. This is also optional for a platform.
3. All platform dependent information mentioned above can be obtained by the M-mode runtime firmware (OpenSBI) from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above, but 2.C will always require platform specific code.
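The default MHPMEVENT value described in 2.B can be sketched as below. This assumes XLEN = 64 (hence uint64_t) and uses hypothetical names; the RAW branch follows the earlier statement that RAW event_info is programmed as-is rather than derived.

```c
#include <stdint.h>

#define DEMO_EVENT_IDX_BITS 20
#define DEMO_EVENT_IDX_MASK ((UINT64_C(1) << DEMO_EVENT_IDX_BITS) - 1)
#define DEMO_EVENT_TYPE_RAW 0x2u

/*
 * Hypothetical helper: default MHPMEVENT value when the platform
 * provides no event_idx mapping of its own (XLEN = 64 assumed).
 */
static inline uint64_t demo_default_mhpmevent(uint32_t event_idx,
                                              uint64_t event_info)
{
    uint32_t type = (event_idx >> 16) & 0xfu;

    /* RAW events: event_info already is the MHPMEVENT value. */
    if (type == DEMO_EVENT_TYPE_RAW)
        return event_info;

    /*
     * Lower 20 bits: event_idx. Upper XLEN-20 bits: lower XLEN-20 bits
     * of event_info (the left shift discards the rest).
     */
    return (event_info << DEMO_EVENT_IDX_BITS) |
           (event_idx & DEMO_EVENT_IDX_MASK);
}
```

A platform that needs a different mapping would replace this function with its own table lookup, which is the optional part of 2.B.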
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
The Linux RISC-V driver can be a platform driver with "riscv,sbi-pmu" as the DT compatible string and an optional "interrupts" DT property. The "interrupts" DT property, if available, should specify an edge-triggered overflow interrupt for each HART. When the "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in the "interrupts" DT property.
The platform driver probe will:
A) Ensure that the underlying SBI implementation provides the SBI PMU extension, using the sbi_probe_extension() API of arch/riscv.
B) Detect the number of counters using the SBI_PMU_NUM_COUNTERS call.
C) Get CSR details of each counter using the SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set the memory location of the counter. The driver can skip this in driver probe and instead do it lazily in the add() callback mentioned below.
2. event_init() callback
The event_init() callback will primarily translate the user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in the following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out the event_idx and event_info values corresponding to perf_event_attr)
(Note: event_init() will not assign a counter to the perf_event because that will be done by the add() callback)
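The translation rules A) to D) above could look roughly like this in C. This is a standalone sketch, not driver code: the DEMO_* names are hypothetical, the perf type values mirror Linux's uapi perf_event.h, the cache config layout (cache id, op << 8, result << 16) is the standard perf one, and taking the SOFTWARE code from the low 16 bits of config is an assumption the proposal does not spell out.

```c
#include <stdint.h>

/* Mirrors of Linux uapi perf type values, redefined to stay standalone. */
#define DEMO_PERF_TYPE_HARDWARE 0u
#define DEMO_PERF_TYPE_HW_CACHE 3u
#define DEMO_PERF_TYPE_RAW      4u

/* Upper bounds taken from the *_MAX enumerators in the proposal. */
#define DEMO_HW_MAX           10u
#define DEMO_HW_CACHE_MAX      7u
#define DEMO_HW_CACHE_OP_MAX   3u
#define DEMO_HW_CACHE_RES_MAX  2u
#define DEMO_SW_MAX           12u

struct demo_sbi_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns 0 on success, -1 if the perf attributes cannot be translated. */
static int demo_event_init(uint32_t type, uint64_t config,
                           struct demo_sbi_event *ev)
{
    ev->event_info = 0;

    switch (type) {
    case DEMO_PERF_TYPE_HARDWARE: /* rule A */
        if (config >= DEMO_HW_MAX)
            return -1;
        ev->event_idx = (0x0u << 16) | (uint32_t)config;
        return 0;
    case DEMO_PERF_TYPE_HW_CACHE: { /* rule B */
        uint32_t cache = (uint32_t)(config & 0xff);
        uint32_t op = (uint32_t)((config >> 8) & 0xff);
        uint32_t res = (uint32_t)((config >> 16) & 0xff);

        if (cache >= DEMO_HW_CACHE_MAX || op >= DEMO_HW_CACHE_OP_MAX ||
            res >= DEMO_HW_CACHE_RES_MAX)
            return -1;
        ev->event_idx = (0x1u << 16) | (cache << 3) | (op << 1) | res;
        return 0;
    }
    case DEMO_PERF_TYPE_RAW:
        if (config >> 63) { /* rule D: SOFTWARE event */
            uint32_t code = (uint32_t)(config & 0xffff); /* assumption */

            if (code >= DEMO_SW_MAX)
                return -1;
            ev->event_idx = (0xfu << 16) | code;
        } else { /* rule C: bit 63 is clear, so config equals config[62:0] */
            ev->event_idx = 0x2u << 16;
            ev->event_info = config;
        }
        return 0;
    default:
        return -1;
    }
}
```

Keeping the translation free of any SBI call matches the note that counter assignment is deferred to the add() callback.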
3. add() callback
The add() callback of the Linux RISC-V PMU driver will find a free counter on the current CPU/HART such that the perf_event's event_idx + event_info combination is supported by the counter. To check-and-set the event_idx + event_info combination for a counter, we will use the SBI_PMU_COUNTER_SET_EVENT call. The counter allocation and SBI_PMU_COUNTER_SET_EVENT call can be further optimized by looking at CSR details. For example:
A) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should prefer the counter mapped to the CYCLE CSR and skip the SBI_PMU_COUNTER_SET_EVENT call.
B) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should prefer the counter mapped to the INSTRET CSR and skip the SBI_PMU_COUNTER_SET_EVENT call.
C) For event_idx.type == 0xf, prefer only counters mapped to CSR number 0xfff (i.e. SOFTWARE counters).
4. del() callback
The del() callback of Linux RISC-V PMU driver will release or free the counter.
5. start() callback
The start() callback of Linux RISC-V PMU driver will start the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop the counter using the SBI_PMU_COUNTER_STOP call.
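The counter search done by the add() callback, including optimizations A) to C), can be sketched as follows. Everything here is hypothetical scaffolding: demo_sbi_set_event() is a mock standing in for the SBI_PMU_COUNTER_SET_EVENT call (it accepts hardware events on HPMCOUNTER3..31 CSRs and software events on 0xfff counters), and the counter table replaces what probe would have read via SBI_PMU_COUNTER_GET_CSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CSR_CYCLE    0xc00u
#define DEMO_CSR_INSTRET  0xc02u
#define DEMO_CSR_SOFTWARE 0xfffu

struct demo_counter {
    uint16_t csr_num; /* as reported by SBI_PMU_COUNTER_GET_CSR */
    bool in_use;
};

/* Mock of SBI_PMU_COUNTER_SET_EVENT: 0 on success, -1 on failure. */
static int demo_sbi_set_event(const struct demo_counter *c,
                              uint32_t event_idx, uint64_t event_info)
{
    uint32_t type = (event_idx >> 16) & 0xfu;

    (void)event_info;
    if (c->csr_num == DEMO_CSR_SOFTWARE)
        return type == 0xfu ? 0 : -1;
    if (c->csr_num >= 0xc03u && c->csr_num <= 0xc1fu) /* HPMCOUNTER3..31 */
        return type != 0xfu ? 0 : -1;
    return -1; /* CYCLE/INSTRET are fixed-function in this mock */
}

/* Returns the allocated counter_idx, or -1 if no counter fits. */
static int demo_alloc_counter(struct demo_counter *ctrs, unsigned int n,
                              uint32_t event_idx, uint64_t event_info)
{
    uint32_t type = (event_idx >> 16) & 0xfu;
    uint32_t code = event_idx & 0xffffu;
    unsigned int i;

    /* Rules A and B: prefer CYCLE/INSTRET and skip the SBI call. */
    if (type == 0x0u && code <= 1u) {
        uint16_t want = code == 0u ? DEMO_CSR_CYCLE : DEMO_CSR_INSTRET;

        for (i = 0; i < n; i++) {
            if (!ctrs[i].in_use && ctrs[i].csr_num == want) {
                ctrs[i].in_use = true;
                return (int)i;
            }
        }
    }

    for (i = 0; i < n; i++) {
        if (ctrs[i].in_use)
            continue;
        /* Rule C: software events only go to 0xfff counters. */
        if (type == 0xfu && ctrs[i].csr_num != DEMO_CSR_SOFTWARE)
            continue;
        if (demo_sbi_set_event(&ctrs[i], event_idx, event_info) == 0) {
            ctrs[i].in_use = true;
            return (int)i;
        }
    }
    return -1;
}
```

In this sketch the del() callback would simply clear in_use for the returned index.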
Regards, Anup
Re: Proposal v3: SBI PMU Extension

Anup Patel
-----Original Message----- From: Zong Li <zong.li@...> Sent: 15 July 2020 07:40 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@... Subject: Re: Proposal v3: SBI PMU Extension
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter? Then S-mode software can conveniently find which counter is a good next candidate for the event. But there is a constraint: the maximum number of counters needs to be assumed to be less than XLEN. Do you think it is feasible?
Yes, this is feasible and can further reduce SBI calls, but if we go this route then the SBI_PMU_COUNTER_SET_EVENT name is no longer appropriate.
How about this?
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter, from a set of counters, which can monitor the specified event. The counter_idx_base and counter_idx_mask parameters represent the set of counters, whereas the event_idx and event_info represent the event to monitor. Upon success, the SBI call will return the counter_idx of the counter which has been configured to monitor the specified event. This SBI call will fail if it is unable to find a counter which can monitor the specified event. It will also fail if the set of counters specified via counter_idx_base and counter_idx_mask contains an invalid counter.
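A rough firmware-side sketch of the suggested SBI_PMU_COUNTER_CONFIG_MATCHING behaviour, under stated assumptions: bit i of counter_idx_mask selects counter counter_idx_base + i (the mail does not define the exact set semantics), failures are reported as -1, and the four-counter table and demo_supports() policy are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_COUNTERS 4u

/* Hypothetical platform: CYCLE, INSTRET, HPMCOUNTER3, one SOFTWARE counter. */
static const uint16_t demo_csr[DEMO_NUM_COUNTERS] = { 0xc00, 0xc02, 0xc03, 0xfff };
static bool demo_busy[DEMO_NUM_COUNTERS];

/* Hypothetical policy: which event a given counter can monitor. */
static bool demo_supports(unsigned long idx, uint32_t event_idx)
{
    uint32_t type = (event_idx >> 16) & 0xfu;

    switch (demo_csr[idx]) {
    case 0xc00: return event_idx == 0x00000u; /* CPU cycles only */
    case 0xc02: return event_idx == 0x00001u; /* instructions only */
    case 0xfff: return type == 0xfu;          /* software events */
    default:    return type != 0xfu;          /* programmable HPM counter */
    }
}

/* Returns the configured counter_idx, or -1 on failure. */
static long demo_config_matching(unsigned long base, unsigned long mask,
                                 uint32_t event_idx, uint64_t event_info)
{
    const unsigned long nbits = sizeof(mask) * 8;
    unsigned long bit;

    (void)event_info;

    /* Fail if the requested set names a counter that does not exist. */
    for (bit = 0; bit < nbits; bit++) {
        if (!(mask & (1UL << bit)))
            continue;
        if (base >= DEMO_NUM_COUNTERS || bit >= DEMO_NUM_COUNTERS - base)
            return -1;
    }

    /* Pick the first free counter in the set that supports the event. */
    for (bit = 0; bit < nbits; bit++) {
        unsigned long idx = base + bit;

        if (!(mask & (1UL << bit)))
            continue;
        if (!demo_busy[idx] && demo_supports(idx, event_idx)) {
            demo_busy[idx] = true;
            return (long)idx;
        }
    }
    return -1;
}
```

Compared with retrying SBI_PMU_COUNTER_SET_EVENT per counter, the caller makes one call per event, which is the reduction in SBI calls discussed above.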
Re: Proposal v3: SBI PMU Extension
Yes, this is feasible and can further reduce SBI calls but if we go this
route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
Why not have SBI_PMU_COUNTER_SET_EVENT return a 32-bit hpmcounter bit mask when it fails (one that, as Zong suggested, identifies the hardware counters that do support the requested event)? Then SBI_PMU_COUNTER_SET_EVENT doesn't need to change. The caller can simply call it again with a different counter_idx that it knows should succeed.
Or have SBI_PMU_COUNTER_SET_EVENT take a boolean argument indicating whether to try and set up just the specified counter, or to examine all counters and try to pick one that supports the requested event? Then no bit mask ever needs to be returned.
Also, in any case, I assume the SBI routine needs to take the 'mcounteren' CSR into account and only set up a counter that mcounteren makes available to lower privilege modes? (And any returned bit mask would also reflect only counters that have their mcounteren bits set to '1'.)
Greg
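A small sketch of the bit-mask variant discussed in this message, assuming the returned mask uses the same bit numbering as mcounteren (bit 0 = CY, bit 1 = TM, bit 2 = IR, bits 3..31 = HPM3..HPM31). The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical firmware-side helper: restrict the set of counters that
 * support an event to those mcounteren exposes to lower privilege modes.
 */
static inline uint32_t demo_usable_counters(uint32_t supports_event,
                                            uint32_t mcounteren)
{
    return supports_event & mcounteren;
}

/*
 * Hypothetical caller-side helper: lowest counter number set in the
 * returned mask, or -1 if the mask is empty.
 */
static inline int demo_next_counter(uint32_t mask)
{
    int i;

    for (i = 0; i < 32; i++) {
        if (mask & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}
```

With supports_event = 0x38 (HPM3..HPM5) and mcounteren = 0x37, only HPM4 and HPM5 remain, so the caller would retry with counter 4.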
Re: Proposal v3: SBI PMU Extension
Anup,
What is the plan with regard to scounteren and hcounteren? Is the caller (whether an OS or a hypervisor) supposed to take into account the relevant *counteren CSRs when specifying counter_idx in the call to SBI_PMU_COUNTER_SET_EVENT? And the M-mode SBI_PMU_COUNTER_SET_EVENT routine only worries about mcounteren?
It seems like this is necessary since the M-mode SBI_PMU_COUNTER_SET_EVENT routine won't know which privilege mode was the original requester (and hence which other *counteren CSRs are relevant).
Greg
Re: Proposal v3: SBI PMU Extension
On Wed, Jul 15, 2020 at 12:32 PM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 15 July 2020 07:40 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@... Subject: Re: Proposal v3: SBI PMU Extension
On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 14 July 2020 09:02 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
Hi All,
We don't have a dedicated RISC-V PMU extension but we do have HARDWARE
performance counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V implementation can support monitoring various HARDWARE events using limited number of HPMCOUNTER
CSRs.
In addition to HARDWARE performance counters, a SBI implementation (e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for events such as number of RFENCEs, number of IPIs, number of misaligned load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode) software to discover and configure HARDWARE/SOFTWARE counters.
The
SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters which include CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and
SOFTWARE counters provided by SBI implementation.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters
and events to S-mode (or VS-mode) software.
To define SBI PMU extension, we first define counter_idx which is a logical number assigned to a counter and event_idx which is an encoded number representing the HARDWARE/SOFTWARE event to be
monitored.
A
HARDWARE/SOFTWARE event can also have additional configuration/details
referred to as event_info.
The SBI PMU event_idx is a 20bits wide number encoded as follows: event_idx[19:16] = type event_idx[15:0] = code
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event, the event_info is not required whereas the event_idx.code can be one of the following values: enum sbi_pmu_hw_id { SBI_PMU_HW_CPU_CYCLES = 0, SBI_PMU_HW_INSTRUCTIONS = 1, SBI_PMU_HW_CACHE_REFERENCES = 2, SBI_PMU_HW_CACHE_MISSES = 3, SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4, SBI_PMU_HW_BRANCH_MISSES = 5, SBI_PMU_HW_BUS_CYCLES = 6, SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7, SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8, SBI_PMU_HW_REF_CPU_CYCLES = 9, SBI_PMU_HW_MAX, /* non-ABI */ }; (NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE CACHE event, the event_info is not required whereas the event_idx.code is encoded as follows:
    event_idx.code[15:3] = cache_id
    event_idx.code[2:1] = op_id
    event_idx.code[0:0] = result_id
    enum sbi_pmu_hw_cache_id {
        SBI_PMU_HW_CACHE_L1D = 0,
        SBI_PMU_HW_CACHE_L1I = 1,
        SBI_PMU_HW_CACHE_LL = 2,
        SBI_PMU_HW_CACHE_DTLB = 3,
        SBI_PMU_HW_CACHE_ITLB = 4,
        SBI_PMU_HW_CACHE_BPU = 5,
        SBI_PMU_HW_CACHE_NODE = 6,
        SBI_PMU_HW_CACHE_MAX, /* non-ABI */
    };
    enum sbi_pmu_hw_cache_op_id {
        SBI_PMU_HW_CACHE_OP_READ = 0,
        SBI_PMU_HW_CACHE_OP_WRITE = 1,
        SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
        SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
    };
    enum sbi_pmu_hw_cache_op_result_id {
        SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
        SBI_PMU_HW_CACHE_RESULT_MISS = 1,
        SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
    };
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
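To make the cache code layout concrete, here is a small C sketch that packs cache_id, op_id and result_id into event_idx.code using the bit positions given above. The function names are hypothetical and only serve this illustration.

```c
#include <stdint.h>

/* Bit layout from the proposal:
 *   code[15:3] = cache_id, code[2:1] = op_id, code[0:0] = result_id
 * Function names are hypothetical. */
static inline uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    return ((cache_id & 0x1fffu) << 3) |   /* 13-bit cache_id  */
           ((op_id & 0x3u) << 1) |         /* 2-bit op_id      */
           (result_id & 0x1u);             /* 1-bit result_id  */
}

static inline uint32_t sbi_pmu_cache_id(uint32_t code)
{
    return (code >> 3) & 0x1fffu;
}

static inline uint32_t sbi_pmu_cache_op_id(uint32_t code)
{
    return (code >> 1) & 0x3u;
}

static inline uint32_t sbi_pmu_cache_result_id(uint32_t code)
{
    return code & 0x1u;
}
```

For example, a DTLB (3) write (1) miss (1) would give code 0x1b, so the full event_idx with type 0x1 would be 0x1001b.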
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE RAW event, the event_idx.code should be zero and the event_info parameter passed to SBI_PMU_COUNTER_SET_EVENT call (described below) will have the RAW event value to be programmed in MHPMEVENT CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value from event_idx + event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE event, the event_info is not required whereas the event_idx.code can be one of the following:
    enum sbi_pmu_sw_id {
        SBI_PMU_SW_MISALIGNED_LOAD = 0,
        SBI_PMU_SW_MISALIGNED_STORE = 1,
        SBI_PMU_SW_ILLEGAL_INSN = 2,
        SBI_PMU_SW_LOCAL_SET_TIMER = 3,
        SBI_PMU_SW_LOCAL_IPI = 4,
        SBI_PMU_SW_LOCAL_FENCE_I = 5,
        SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
        SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
        SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
        SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
        SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
        SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
        SBI_PMU_SW_MAX, /* non-ABI */
    };
In future, more events can be defined without breaking compatibility of existing SBI calls.
Using definition of counter_idx and event_idx, we can potentially have the following SBI calls:
1. SBI_PMU_NUM_COUNTERS
   This call will return the number of COUNTERs.
2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
   1) counter_idx
   It will provide the CSR_Number and CSR_Width of underlying counter. The value returned by SBI call is encoded as follows:
     return_value[11:0] = CSR_Number
     return_value[19:12] = CSR_Width (Number of bits implemented in HW)
   If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is HARDWARE counter. This SBI call will fail for counters which are not present.
3. SBI_PMU_COUNTER_SET_EVENT
   This call takes three parameters:
   1) counter_idx
   2) event_idx
   3) event_info
   It will select an event to be monitored by given counter. If this SBI call is not used for a counter to select an event then the counter will monitor default event selected for it at boot-time. This SBI call will fail for counters which are not present. It will also fail if specified event_idx + event_info combination is not supported by given counter.
It also seems to fail if the specified event is not supported by the given counter, right? Then Linux driver could try to allocate the next free counter when this SBI call returns failure.
Yes, this call will fail if event_idx + event_info combination is not supported by given counter_idx. It is expected that Linux driver will try another free counter if SBI_PMU_COUNTER_SET_EVENT call fails. I have suggested a few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by looking at CSR number assigned to counter.
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter, so that s-mode software can conveniently find which next one is a good counter for the event? But there is a constraint that the maximum number of counters needs to be assumed to be less than XLEN. Do you think it is feasible?
Yes, this is feasible and can further reduce SBI calls but if we go this route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
How about this ??
3. SBI_PMU_COUNTER_CONFIG_MATCHING
   This call takes four parameters:
   1) counter_idx_base
   2) counter_idx_mask
   3) event_idx
   4) event_info
   It will find and configure a counter from a set of counters which can monitor specified event. The counter_idx_base and counter_idx_mask parameters represent the set of counters whereas the event_idx and event_info represent the event to monitor. Upon success the SBI call will return the counter_idx of the counter which has been configured to monitor specified event. This SBI call will fail if it is unable to find a counter which can monitor specified event. It will also fail if the set of counters specified via counter_idx_base and counter_idx_mask has an invalid counter.
It looks like the m-mode firmware would be responsible for selecting a suitable counter and taking the allocation work for s-mode software? Could you elaborate or give an example to show how counter_idx_base and counter_idx_mask represent the set of counters? It seems like the set of counters can be represented by one parameter, with each bit corresponding to one counter_idx.
Apart from this question above, this version of the proposal looks great to me.
Cool
Regards, Anup
Thanks, Zong
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters:
   1) counter_idx
   2) 8-byte aligned physical address
   It will set the physical address of memory location where the SBI implementation will write the 64-bit SOFTWARE counter. This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
   This call takes two parameters:
   1) counter_idx
   2) initial_value
   It will inform SBI implementation to start/enable specified counter with specified initial value. This SBI call will fail for counters which are not present.
6. SBI_PMU_COUNTER_STOP
   This call takes one parameter:
   1) counter_idx
   It will inform SBI implementation to stop/disable specified counter on the calling HART. This SBI call will fail for counters which are not present.
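As an illustration of how S-mode software might consume the SBI_PMU_COUNTER_GET_CSR return value described in the list above, the following C sketch decodes CSR_Number and CSR_Width and checks for a SOFTWARE counter. The struct and function names are assumptions made for this example, not part of the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value that marks a SOFTWARE counter, per the proposal. */
#define SBI_PMU_CSR_NUM_SOFTWARE 0xfffu

/* Hypothetical container for the decoded return value. */
struct sbi_pmu_csr_info {
    uint16_t csr_num;   /* return_value[11:0]  */
    uint8_t  csr_width; /* return_value[19:12], bits implemented in HW */
};

/* Split the raw return value into its two fields. */
static inline struct sbi_pmu_csr_info sbi_pmu_decode_csr(unsigned long ret)
{
    struct sbi_pmu_csr_info info;

    info.csr_num = (uint16_t)(ret & 0xfffu);
    info.csr_width = (uint8_t)((ret >> 12) & 0xffu);
    return info;
}

/* A counter is a SOFTWARE counter when it is not mapped to any CSR. */
static inline bool sbi_pmu_is_sw_counter(struct sbi_pmu_csr_info info)
{
    return info.csr_num == SBI_PMU_CSR_NUM_SOFTWARE;
}
```

For instance, a return value of 0x40c03 would decode to CSR 0xc03 (HPMCOUNTER3) with 64 implemented bits.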
The M-mode runtime firmware (OpenSBI) Development Notes:
1. The M-mode runtime firmware will have to translate SBI PMU event_idx and event_info into platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The M-mode runtime firmware (OpenSBI) will need to know following platform dependent information:
   A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. HPMCOUNTER)
   B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR value. This is optional for platform. By default, OpenSBI will write a value <xyz> to MHPMEVENT CSR where lower 20 bits of <xyz> are event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
   C) Additional platform-specific programming required for selecting event_idx + event_info combination. This is also optional for platform.
3. All platform dependent information mentioned above can be obtained by M-mode runtime firmware (OpenSBI) from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above but 2.C will always require platform specific code.
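The default mapping in 2.B can be sketched in C as follows. This is only an illustration of the bit arithmetic described above: the function name and the explicit xlen parameter are assumptions for the sketch, and a HARDWARE RAW event would instead program event_info directly as stated earlier in the proposal.

```c
#include <stdint.h>

/* Sketch of the default <xyz> value from note 2.B:
 *   lower 20 bits      = event_idx
 *   upper XLEN-20 bits = lower XLEN-20 bits of event_info
 * xlen is passed explicitly (32 or 64) so the sketch works on any host. */
static inline uint64_t pmu_default_mhpmevent(uint32_t event_idx,
                                             uint64_t event_info,
                                             unsigned int xlen)
{
    uint64_t val = (event_info << 20) | (event_idx & 0xfffffu);

    /* Truncate to XLEN bits when emulating RV32. */
    if (xlen < 64)
        val &= (UINT64_C(1) << xlen) - 1;
    return val;
}
```

On RV32 only the low 12 bits of event_info survive the truncation, which matches the "lower XLEN-20 bits" wording.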
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
   The Linux RISC-V driver can be a platform driver with "riscv,sbi-pmu" as DT compatible string and optional "interrupts" DT property. The "interrupts" DT property if available should specify an edge-triggered overflow interrupt for each HART. When "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in "interrupts" DT property.
   The platform driver probe will:
   A) Need to ensure that underlying SBI implementation provides SBI PMU extension using sbi_probe_extension() API of arch/riscv.
   B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
   C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location of counter. The driver can skip this in driver probe and instead do this lazily in add() callback mentioned below.
2. event_init() callback
   The event_init() callback will primarily translate user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in following way:
   A) perf_event_attr.type == PERF_TYPE_HARDWARE
      event_idx.type = 0x0
      event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config
      event_info = 0
   B) perf_event_attr.type == PERF_TYPE_HW_CACHE
      event_idx.type = 0x1
      event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config
      event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id based on perf_event_attr.config
      event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config
      event_info = 0
   C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0
      event_idx.type = 0x2
      event_idx.code = 0x0
      event_info = perf_event_attr.config[62:0]
   D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1
      event_idx.type = 0xf
      event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config
      event_info = 0
   (Note: event_init() will fail if it is not able to figure out event_idx and event_info value corresponding to perf_event_attr)
   (Note: event_init() will not assign counter to perf_event because it will be done by add())
3. add() callback
   The add() callback of Linux RISC-V PMU driver will find a free counter on current CPU/HART such that the perf_event event_idx + event_info combination is supported by the counter. To check-and-set event_idx + event_info combination for a counter, we will use the SBI_PMU_COUNTER_SET_EVENT call. The counter allocation and SBI_PMU_COUNTER_SET_EVENT call can be further optimized by looking at CSR details. For example:
   A) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should prefer counter mapping to CYCLE CSR and skip doing SBI_PMU_COUNTER_SET_EVENT call.
   B) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should prefer counter mapping to INSTRET CSR and skip doing SBI_PMU_COUNTER_SET_EVENT call.
   C) For event_idx.type == 0xf, only prefer counters mapping to 0xfff CSR (i.e. SOFTWARE counters).
4. del() callback
   The del() callback of Linux RISC-V PMU driver will release or free the counter.
5. start() callback
   The start() callback of Linux RISC-V PMU driver will start the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
   The stop() callback of Linux RISC-V PMU driver will stop the counter using the SBI_PMU_COUNTER_STOP call.
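The event_init() translation in note 2 above could be sketched roughly as below. This is a standalone illustration, not driver code: the perf type values and the HW_CACHE config layout (cache_id | op_id << 8 | result_id << 16) follow the Linux perf uapi, while the struct, the function name, and the choice to take the SOFTWARE event id from the low 16 bits of config are assumptions, since the proposal does not spell out that last mapping.

```c
#include <stdint.h>

/* Values mirror PERF_TYPE_* in the Linux perf uapi header; redefined
 * locally so the sketch builds without kernel headers. */
enum { X_PERF_TYPE_HARDWARE = 0, X_PERF_TYPE_HW_CACHE = 3, X_PERF_TYPE_RAW = 4 };

/* Hypothetical result container. */
struct sbi_pmu_event {
    uint32_t event_idx;  /* [19:16] type, [15:0] code */
    uint64_t event_info;
};

/* Returns 0 on success, -1 if no event_idx/event_info can be derived. */
static int sbi_pmu_map_event(uint32_t type, uint64_t config,
                             struct sbi_pmu_event *out)
{
    uint32_t cache_id, op_id, result_id, code;

    out->event_info = 0;
    switch (type) {
    case X_PERF_TYPE_HARDWARE:
        if (config > 9)                 /* beyond SBI_PMU_HW_REF_CPU_CYCLES */
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        return 0;
    case X_PERF_TYPE_HW_CACHE:
        cache_id = config & 0xff;
        op_id = (config >> 8) & 0xff;
        result_id = (config >> 16) & 0xff;
        if (cache_id > 6 || op_id > 2 || result_id > 1)
            return -1;
        out->event_idx = (0x1u << 16) | (cache_id << 3) | (op_id << 1) | result_id;
        return 0;
    case X_PERF_TYPE_RAW:
        if (config >> 63) {             /* bit 63 set: SOFTWARE event */
            code = config & 0xffff;     /* assumption: id in low 16 bits */
            if (code > 11)
                return -1;
            out->event_idx = (0xfu << 16) | code;
        } else {                        /* bit 63 clear: HARDWARE RAW event */
            out->event_idx = 0x2u << 16;
            out->event_info = config & ~(UINT64_C(1) << 63);
        }
        return 0;
    default:
        return -1;
    }
}
```

Returning -1 for anything unrecognised corresponds to the note that event_init() fails when it cannot figure out event_idx and event_info.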
Regards, Anup
Re: Proposal v3: SBI PMU Extension

Anup Patel
Hi Greg,
The SBI PMU extension provider for HS-mode is M-mode runtime firmware (OpenSBI) and for VS-mode the provider is HS-mode (Hypervisor).
We will enable HARDWARE counters in HCOUNTEREN CSR when hypervisor receives SBI_PMU_COUNTER_START call from VS-mode.
The standard interface between Linux user-space and Linux kernel-space is perf SYSCALLs/IOCTLs. Despite this, we can support apps that want direct HARDWARE CSR access by enabling the HARDWARE counter in SCOUNTEREN CSR. The Linux RISC-V PMU driver can do this in the start() callback.
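For reference, the bit to set in SCOUNTEREN (or HCOUNTEREN) for a given counter follows from the CSR number: in the RISC-V privileged architecture, CYCLE is 0xc00, INSTRET is 0xc02 and HPMCOUNTER3..31 are 0xc03..0xc1f, and bit N of the counter-enable register controls CSR 0xc00 + N. A minimal C sketch, with a hypothetical helper name:

```c
#include <stdint.h>

#define CSR_CYCLE        0xc00u
#define CSR_HPMCOUNTER31 0xc1fu

/* Returns the *COUNTEREN bit mask for a counter CSR number, or 0 if the
 * number is outside the CYCLE..HPMCOUNTER31 range (e.g. 0xfff, which the
 * proposal uses for SOFTWARE counters). */
static inline uint32_t pmu_counteren_bit(uint32_t csr_num)
{
    if (csr_num < CSR_CYCLE || csr_num > CSR_HPMCOUNTER31)
        return 0;
    return UINT32_C(1) << (csr_num - CSR_CYCLE);
}
```

A driver or hypervisor would OR this mask into the relevant counter-enable CSR when starting the counter; the CSR write itself is omitted here because it needs target-specific inline assembly.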
Regards,
Anup
From: tech-unixplatformspec@... <tech-unixplatformspec@...>
On Behalf Of Greg Favor
Sent: 15 July 2020 11:08
To: Greg Favor <gfavor@...>
Cc: Anup Patel <Anup.Patel@...>; Zong Li <zong.li@...>; tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU Extension
Anup,
What is the plan with regards to scounteren and hcounteren? Is the caller (whether an OS or a hypervisor) supposed to take into account the relevant *counteren CSR's when specifying counter_idx in the call to SBI_PMU_COUNTER_SET_EVENT?
And the M-mode SBI_PMU_COUNTER_SET_EVENT routine only worries about mcounteren?
It seems like this is necessary since the M-mode SBI_PMU_COUNTER_SET_EVENT routine won't know which privilege mode was the original requester (and hence which other *counteren CSR's are relevant).
On Tue, Jul 14, 2020 at 10:31 PM Greg Favor via lists.riscv.org <gfavor=ventanamicro.com@...> wrote:
Yes, this is feasible and can further reduce SBI calls but if we go this route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
Why not have SBI_PMU_COUNTER_SET_EVENT return a 32b hpmcounter bit mask when it fails (that - as Zong suggested - identifies hardware counters that do support the requested event)? Then SBI_PMU_COUNTER_SET_EVENT doesn't need to change.
The caller can simply call it again with a different counter_idx that it knows should succeed.
Or have SBI_PMU_COUNTER_SET_EVENT take a boolean argument indicating whether to try and set up just the specified counter, or to examine all counters and try to pick one that supports the requested event? Then no bit mask ever needs to be returned.
Also, in any case, I assume the SBI routine needs to take the 'mcounteren' CSR into account and only set up a counter that mcounteren makes available to lower privilege modes? (And any returned bit mask would also reflect only counters that have their mcounteren bits set to '1'.)
Re: Proposal v3: SBI PMU Extension

Anup Patel
-----Original Message----- From: tech-unixplatformspec@... <tech- unixplatformspec@...> On Behalf Of Zong Li Sent: 15 July 2020 13:36 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@... Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU Extension
On Wed, Jul 15, 2020 at 12:32 PM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 15 July 2020 07:40 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
It looks like the m-mode firmware would be responsible for selecting a suitable counter and taking the allocation work for s-mode software?
The allocation work of counters will still be owned by S-mode software. Only selecting a counter from a set of available counters based on event to be monitored will be done by SBI_PMU_COUNTER_CONFIG_MATCHING. In fact, SBI_PMU_COUNTER_CONFIG_MATCHING is equivalent to SBI_PMU_COUNTER_SET_EVENT if we pass "counter_idx_mask = 0x1" and "counter_idx_base = counter_idx".
Could you elaborate or give an example to show how counter_idx_base and counter_idx_mask represent the set of counters? It seems like the set of counters can be represented by one parameter, with each bit corresponding to one counter_idx.
Let's say Linux RISC-V PMU driver is tracking free/available counters for each HART using a per-HART bitmap. Now let's assume that on HART X we have free counters 3,6,8,9,13,20, ... and some user-space app creates perf_event on HART X. In this case, with SBI_PMU_COUNTER_SET_EVENT call we will have to try each available counter one-by-one to find a counter that supports the required event_idx + event_info combination in add() callback of Linux RISC-V PMU driver. Instead of this, using SBI_PMU_COUNTER_CONFIG_MATCHING call we let SBI implementation select a matching counter from a set of available counters.
We have to consider both RV32 and RV64 here. On RV32, XLEN = 32 so having only one parameter will limit us to 32 counters. This is too strict for RV32 and we don't have any room for SOFTWARE counters. Let's not limit number of counters by XLEN. The counter_idx_base and counter_idx_mask approach is similar to "hart_mask_base" and "hart_mask" parameters of SBI_SEND_IPI call.
Regards, Anup
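To make the counter_idx_base + counter_idx_mask idea concrete, here is a small C model of how an SBI implementation might pick a counter. It is a sketch under assumptions: the function names, the callback type, and the demo predicate (which pretends only counters 8 and 20 support the event) are all hypothetical, and a real implementation would also program the selected counter.

```c
#include <stdint.h>

/* Hypothetical platform hook: non-zero if counter_idx can monitor the event. */
typedef int (*pmu_supports_fn)(unsigned long counter_idx, uint32_t event_idx,
                               uint64_t event_info);

/* Bit i of mask selects counter (base + i). Returns the chosen counter_idx,
 * or -1 if the set names a non-existent counter or nothing matches. */
static long pmu_config_matching(unsigned long base, unsigned long mask,
                                unsigned long num_counters,
                                uint32_t event_idx, uint64_t event_info,
                                pmu_supports_fn supports)
{
    const unsigned int bits = (unsigned int)(sizeof(mask) * 8);
    unsigned int i;

    /* First pass: fail if the set contains an invalid counter. */
    for (i = 0; i < bits; i++)
        if ((mask & (1UL << i)) && base + i >= num_counters)
            return -1;

    /* Second pass: pick the lowest-numbered matching counter. */
    for (i = 0; i < bits; i++)
        if ((mask & (1UL << i)) && supports(base + i, event_idx, event_info))
            return (long)(base + i);

    return -1;
}

/* Demo predicate for illustration only: counters 8 and 20 support any event. */
static int demo_supports(unsigned long counter_idx, uint32_t event_idx,
                         uint64_t event_info)
{
    (void)event_idx;
    (void)event_info;
    return counter_idx == 8 || counter_idx == 20;
}
```

With the free counters 3,6,8,9,13,20 from the example passed as a mask with base 0, this model would return 8 in a single call. Passing mask 0x1 with base set to one counter_idx reduces it to the SBI_PMU_COUNTER_SET_EVENT behaviour.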
Re: [RISC-V] [tech-config] Profiles and Config and Device Tree
[Adding Software and Platform]
I agree the Profiles TG will need to work closely with the Config and Platform TGs.
Mark, I think it might be helpful if you (and maybe Krste also) could create a short description of the high-level goals/use cases for Profiles for a future TSC/TC discussion. Profiles cut across many of the TGs, and I think many are still hazy on what’s involved. I expect we will have an easier time recruiting a Profiles TG Chair if the goals are clearly defined.
Best,
Arun
From: <tech-config@...> on behalf of mark <markhimelstein@...>
Reply-To: "tech-config@..." <tech-config@...>
Date: Wednesday, July 29, 2020 at 10:40 PM
To: config <tech-config@...>, RISC-V SW Dev <sw-dev@...>, Al Stone <ahs3@...>
Subject: [RISC-V] [tech-config] Profiles and Config and Device Tree
I feel like we could be more coordinated in our efforts here.
From what I know Profiles will overlap in content with Config and with Device Tree (and friends -- using Device Tree to represent the class of configuration files that drive how we build SW from OS, Tool Chain, etc.).
Both Config and Device Tree must be consistent with the appropriate Profile. Profiles will have some information that Config won't need. Profiles will also have some information that Device Tree won't need. Both Config and Device Tree will contain information that does not have to be in a Profile.
All constituents must have representation in the Profiles TG. The Profiles TG will own driving Profiles to closure.
This may be obvious to some of you but it was not to me from the email chains.
Re: [RISC-V] [tech-config] Profiles and Config and Device Tree

mark
I suggest the charter include but not be limited to:
1. define the syntactic format for the content that is human readable, including things like macros etc.
2. define the globally unique record & field (or equivalent) definitions and semantics. this must include rules for org specific fields and a clearinghouse for field and record names (or equivalent).
3. define the organization unique naming scheme for profiles themselves (e.g. RISC-V_identifier/URL/profile_name/version_number). also consider if a RISC-V_identifier (like "RISC-V_PROFILE") is part of the string and whether it has branding or trademark implications (with Kim the marketing director's help).
4. define upwards and backwards compatibility rules for different versions of a profile. for example you may end up saying that later profile versions can only be additive for a profile.
5. define the relationship between profiles and other descriptive items in the RISC-V ecosystem including but not limited to: config, ABI and other *BIs, OS config files like device tree, unpriv spec chapter 27 RV strings. provide guidance on how to generate profile relevant portions of those other descriptions from a profile (whether it is manual or generated).
6. define the name and structure of a profile file or files. for structure, for example, is it one big file or can it be a bunch of small files that might allow an umbrella file to include them (even nested) for manageability and to allow multiple profiles to share common portions even across orgs? is it tool compatible with something that will turn it into one big file like cpp? can it be built? will it be built every night and be made available? if so, will all versions of the big file be made available in combined form somewhere?
7. define where profiles are stored and accessible: github? riscv.org website? multiple places? something different? are all orgs required to post their profiles or just RISC-V? org specific profiles may affect the profile file names so that the profile file URLs have some relationship to the URL (if it is included) from the profile name above.
8. define the copyright requirements.
9. define the rules for organizations other than RISC-V to create, potentially hide, publish, and maintain their own profiles, potentially including the definition of the types of profiles they can create. for examples of types: they could have a derived profile, meaning it only adds things to a riscv.org profile, or they may create a custom profile which changes field values from some base riscv.org profile.
10. specify the relationship between profiles and app or os compatibility. for example if you adopt and comply with the riscv.org linux profile or create a derived profile from it (see sample definition of derived above) then you are guaranteed that all linux apps (and libs) will work on your product. one ring to bind them instead of having to state all the other identifiers listed above (see compliance item below).
11. what does profile compliance or compatibility mean? what is the relationship between, for example, RV or *BI compliance and profile compliance?
12. create the riscv.org base profiles: e.g. linux, rtos, bare metal. make sure they can manually or automatically generate the portions of the other descriptions (e.g. device tree) as listed above.
I am sure I missed many things and got some things wrong and I suggest we need Krste to weigh in. I derived the above from my discussions with him and my own thoughts, but it is his concept.
I also expect the TG will correct and add many things.
Is this what you were looking for?
Mark
On Thu, Jul 30, 2020 at 5:47 AM < arun@...> wrote:
[Adding Software and Platform]
I agree the Profiles TG will need to work closely with the Config and Platform TGs.
Mark, I think it might be helpful if you (and maybe Krste also) could create a short description of the high-level goals/use cases for Profiles for a future TSC/TC discussion. Profiles cut across many of the TGs, and I think many are still
hazy on what’s involved. I expect we will have an easier time recruiting a Profiles TG Chair if the goals are clearly defined.
Best,
Arun
I feel like we could be more coordinated in our efforts here.
From what I know Profiles will overlap in content with Config and with Device Tree (and friends -- using Device Tree to represent the class of configuration files that drive how we build SW from OS, Tool Chain,
etc.).
Both Config and Device Tree must be consistent with the appropriate Profile. Profiles will have some information that Config won't need. Profiles will also have some information that Device Tree won't need. Both
Config and Device Tree will contain information that does not have to be in a Profile.
All constituents must have representation in the Profiles TG. The Profiles TG will own driving Profiles to closure.
This may be obvious to some of you but it was not to me from the email chains.
Notice: This email and any attachments may contain proprietary (Draper non-public) and/or export-controlled information of Draper. If you are not the intended recipient of this email, please immediately notify the sender by replying to this email and immediately
destroy all copies of this email.
-- Mark I Himelstein CTO RISC-V International +1-408-250-6611 twitter @mark_riscv
|
|
Re: [RISC-V] [software] [RISC-V] [tech-config] Profiles and Config and Device Tree
Thanks, Mark. This is very helpful.
Krste, it would be good to get your take on Profiles as well.
Arun
From: <software@...> on behalf of mark <markhimelstein@...>
Date: Thursday, July 30, 2020 at 10:23 AM
To: "tech-config@..." <tech-config@...>
Cc: Al Stone <ahs3@...>, "software@..." <software@...>, "tech-unixplatformspec@..." <tech-unixplatformspec@...>
Subject: Re: [RISC-V] [software] [RISC-V] [tech-config] Profiles and Config and Device Tree
|
|

Palmer Dabbelt
I'm resigning from my posts at the RISC-V Foundation (vice chair of the software standing committee and chair of the UNIX platform specification working group). I know it might seem a bit sudden, but I've thought about this for a while and this just isn't my sort of thing -- I ended up in these positions as a result of working on the open source software side of things, and I think I just fit in better over there. With any luck this should let me spend more time on the software side of things, so long term I think it will be better for everyone.
We don't have any pressing business in either of the groups, so I'm just going to skip the lame duck period by making this effective immediately.
|
|

mark
We are sad to see you go and you are always welcome back.
I know I speak for everyone in thanking you for all you have contributed to RISC-V SW SC and Unix Platform TG. I know we have only worked together for a short period of time but I am grateful for your honesty and knowledgeable advice.
Warmest regards and good luck in what you are pursuing.
Thank you again, Mark
On Tue, Aug 4, 2020 at 2:50 PM Palmer Dabbelt < palmer@...> wrote: I'm resigning from my posts at the RISC-V Foundation (vice chair of the
software standing committee and chair of the UNIX platform specification
working group). I know it might seem a bit sudden, but I've thought about this
for a while and this just isn't my sort of thing -- I ended up in these
positions as a result of working on the open source software side of things,
and I think I just fit in better over there. With any luck this should let me
spend more time on the software side of things, so long term I think it will be
better for everyone.
We don't have any pressing business in either of the groups, so I'm just going
to skip the lame duck period by making this effective immediately.
-- Mark I Himelstein CTO RISC-V International +1-408-250-6611 twitter @mark_riscv
|
|
Re: [RISC-V] [TSC] [RISC-V] [tech-unixplatformspec] I'm Resigning
Stephano Cetola <scetola@...>
Indeed, it was great working with you, and you will be missed.
Thank you for all your hard work and know you’re always welcome in the RISC-V community!
All the best, Stephano -- Stephano Cetola Program Manager, The Linux Foundation
On Aug 4, 2020, at 6:33 PM, mark <markhimelstein@...> wrote:
Palmer,
We are sad to see you go and you are always welcome back.
I know I speak for everyone in thanking you for all you have contributed to RISC-V SW SC and Unix Platform TG. I know we have only worked together for a short period of time but I am grateful for your honesty and knowledgeable advice.
Warmest regards and good luck in what you are pursuing.
Thank you again, Mark
On Tue, Aug 4, 2020 at 2:50 PM Palmer Dabbelt <palmer@...> wrote: I'm resigning from my posts at the RISC-V Foundation (vice chair of the software standing committee and chair of the UNIX platform specification working group). I know it might seem a bit sudden, but I've thought about this for a while and this just isn't my sort of thing -- I ended up in these positions as a result of working on the open source software side of things, and I think I just fit in better over there. With any luck this should let me spend more time on the software side of things, so long term I think it will be better for everyone.
We don't have any pressing business in either of the groups, so I'm just going to skip the lame duck period by making this effective immediately.
-- Mark I Himelstein CTO RISC-V International +1-408-250-6611 twitter @mark_riscv
|
|
Proposal v4: SBI PMU Extension

Anup Patel
Hi All,
We don't have a dedicated RISC-V PMU extension for all privilege modes but we do have M-mode HARDWARE performance counters such as MCYCLE CSR, MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and U-mode. A RISC-V implementation can support monitoring of various HARDWARE events using limited number of HARDWARE performance counters.
In addition to HARDWARE performance counters, a SBI implementation (e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for events such as number of RFENCEs, number of IPIs, number of misaligned load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode) software to discover and configure HARDWARE/SOFTWARE counters. The SBI PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE counters.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters and events to S-mode (or VS-mode) software.
Before defining SBI PMU extension calls, we first define counter_idx, event_idx, and event_info entities. The counter_idx is a logical number assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a HARDWARE/SOFTWARE event whereas event_info represents additional configuration/parameters for the event.
The event_idx is a 20bits wide number encoded as follows:
   event_idx[19:16] = type
   event_idx[15:0] = code
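To make the bit layout concrete, here is a minimal C sketch of packing and unpacking event_idx. The helper names (sbi_pmu_event_idx_make and friends) and the SBI_PMU_EVENT_TYPE_* constants are illustrative assumptions and are not defined by this proposal; only the type values 0x0, 0x1, 0x2 and 0xf and the [19:16]/[15:0] split come from this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Event types carried in event_idx[19:16]; the constant names are hypothetical. */
enum sbi_pmu_event_type {
    SBI_PMU_EVENT_TYPE_HW       = 0x0, /* HARDWARE event */
    SBI_PMU_EVENT_TYPE_HW_CACHE = 0x1, /* HARDWARE CACHE event */
    SBI_PMU_EVENT_TYPE_HW_RAW   = 0x2, /* HARDWARE RAW event */
    SBI_PMU_EVENT_TYPE_SW       = 0xf, /* SOFTWARE event */
};

/* Hypothetical helper: pack a 4-bit type and a 16-bit code into event_idx. */
static inline uint32_t sbi_pmu_event_idx_make(uint32_t type, uint32_t code)
{
    assert(type <= 0xf && code <= 0xffff);
    return (type << 16) | code;
}

/* Hypothetical helper: extract event_idx[19:16]. */
static inline uint32_t sbi_pmu_event_idx_type(uint32_t event_idx)
{
    return (event_idx >> 16) & 0xf;
}

/* Hypothetical helper: extract event_idx[15:0]. */
static inline uint32_t sbi_pmu_event_idx_code(uint32_t event_idx)
{
    return event_idx & 0xffff;
}
```

With these helpers, a SOFTWARE event with code 3 would be written as sbi_pmu_event_idx_make(SBI_PMU_EVENT_TYPE_SW, 3), which is 0xf0003.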
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event, the event_info is optional and can be zero whereas the event_idx.code can be one of the following values:
   enum sbi_pmu_hw_id {
      SBI_PMU_HW_CPU_CYCLES = 0,
      SBI_PMU_HW_INSTRUCTIONS = 1,
      SBI_PMU_HW_CACHE_REFERENCES = 2,
      SBI_PMU_HW_CACHE_MISSES = 3,
      SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
      SBI_PMU_HW_BRANCH_MISSES = 5,
      SBI_PMU_HW_BUS_CYCLES = 6,
      SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
      SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
      SBI_PMU_HW_REF_CPU_CYCLES = 9,
      SBI_PMU_HW_MAX, /* non-ABI */
   };
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE CACHE event, the event_info is optional and can be zero whereas the event_idx.code is encoded as follows:
   event_idx.code[15:3] = cache_id
   event_idx.code[2:1] = op_id
   event_idx.code[0:0] = result_id
   enum sbi_pmu_hw_cache_id {
      SBI_PMU_HW_CACHE_L1D = 0,
      SBI_PMU_HW_CACHE_L1I = 1,
      SBI_PMU_HW_CACHE_LL = 2,
      SBI_PMU_HW_CACHE_DTLB = 3,
      SBI_PMU_HW_CACHE_ITLB = 4,
      SBI_PMU_HW_CACHE_BPU = 5,
      SBI_PMU_HW_CACHE_NODE = 6,
      SBI_PMU_HW_CACHE_MAX, /* non-ABI */
   };
   enum sbi_pmu_hw_cache_op_id {
      SBI_PMU_HW_CACHE_OP_READ = 0,
      SBI_PMU_HW_CACHE_OP_WRITE = 1,
      SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
      SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
   };
   enum sbi_pmu_hw_cache_op_result_id {
      SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
      SBI_PMU_HW_CACHE_RESULT_MISS = 1,
      SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
   };
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
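As an illustration of the HARDWARE CACHE encoding, the following C sketch builds event_idx.code from cache_id, op_id and result_id using the [15:3], [2:1] and [0:0] fields given above. The function name sbi_pmu_hw_cache_code is a hypothetical helper, not something this proposal defines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build event_idx.code for a HARDWARE CACHE event.
 *   code[15:3] = cache_id, code[2:1] = op_id, code[0:0] = result_id
 */
static inline uint16_t sbi_pmu_hw_cache_code(uint32_t cache_id,
                                             uint32_t op_id,
                                             uint32_t result_id)
{
    /* Reject values that do not fit in their 13-, 2- and 1-bit fields. */
    assert(cache_id <= 0x1fff && op_id <= 0x3 && result_id <= 0x1);
    return (uint16_t)((cache_id << 3) | (op_id << 1) | result_id);
}
```

For example, a DTLB (3) write (1) miss (1) gives code 0x1b, so with event_idx.type == 0x1 the full event_idx would be 0x1001b.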
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE RAW event, the event_idx.code should be zero and the event_info parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described below) will have the RAW event value to be programmed in MHPMEVENT CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value from event_idx and event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE event, the event_info is optional and can be zero whereas the event_idx.code can be one of the following:
   enum sbi_pmu_sw_id {
      SBI_PMU_SW_MISALIGNED_LOAD = 0,
      SBI_PMU_SW_MISALIGNED_STORE = 1,
      SBI_PMU_SW_ILLEGAL_INSN = 2,
      SBI_PMU_SW_LOCAL_SET_TIMER = 3,
      SBI_PMU_SW_LOCAL_IPI = 4,
      SBI_PMU_SW_LOCAL_FENCE_I = 5,
      SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
      SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
      SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
      SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
      SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
      SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
      SBI_PMU_SW_MAX, /* non-ABI */
   };
In future, more events can be defined without breaking SBI call backward-compatibility.
Using above definitions of counter_idx, event_idx, and event_info we can potentially have following SBI calls:
1. SBI_PMU_NUM_COUNTERS
   Return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
   1) counter_idx
   Provide the CSR_Number and CSR_Width of underlying counter. The value returned by SBI call is encoded as follows:
      return_value[11:0] = CSR_Number
      return_value[19:12] = CSR_Width (Number of bits implemented in HW)
      return_value[XLEN-1:20] = Reserved
   If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is HARDWARE counter. This SBI call will fail for counters which are not present.
3. SBI_PMU_COUNTER_CONFIG_MATCHING
   This call takes four parameters:
   1) counter_idx_base
   2) counter_idx_mask
   3) event_idx
   4) event_info
   Find and configure a counter from a set of counters which can monitor specified event. The counter_idx_base and counter_idx_mask parameters represent the set of counters whereas the event_idx and event_info represent the event to monitor. Upon success the SBI call will return the counter_idx of the counter which has been configured to monitor specified event. This SBI call will fail if it is unable to find a counter which can monitor specified event or the set of counters specified via counter_idx_base and counter_idx_mask has an invalid counter.
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters:
   1) counter_idx
   2) 8byte aligned physical address
   It will set the physical address of memory location where the SBI implementation will write the 64bit SOFTWARE counter. This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
   This call takes two parameters:
   1) counter_idx
   2) initial_value
   It will inform SBI implementation to start/enable specified counter with specified initial value. This SBI call will fail for counters which are not present.
6. SBI_PMU_COUNTER_STOP
   This call takes one parameter:
   1) counter_idx
   It will inform SBI implementation to stop/disable specified counter on the calling HART. This SBI call will fail for counters which are not present.
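A caller of the calls above needs to decode the SBI_PMU_COUNTER_GET_CSR return value and to describe a set of counters for SBI_PMU_COUNTER_CONFIG_MATCHING. The C sketch below shows one possible way to do both. All names in it are hypothetical. The proposal does not spell out how counter_idx_base and counter_idx_mask select counters, so the code assumes that bit i of the mask selects counter (counter_idx_base + i); treat that as an assumption, not as the defined behaviour.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value that marks a SOFTWARE counter in this proposal. */
#define SBI_PMU_CSR_NUMBER_SW 0xfffu

/* Hypothetical container for a decoded SBI_PMU_COUNTER_GET_CSR value. */
struct sbi_pmu_counter_csr {
    uint16_t csr_number; /* return_value[11:0] */
    uint8_t csr_width;   /* return_value[19:12] */
};

static inline struct sbi_pmu_counter_csr sbi_pmu_decode_csr(unsigned long ret)
{
    struct sbi_pmu_counter_csr c;

    c.csr_number = (uint16_t)(ret & 0xfff);
    c.csr_width = (uint8_t)((ret >> 12) & 0xff);
    return c; /* return_value[XLEN-1:20] is reserved and ignored here */
}

static inline bool sbi_pmu_counter_is_software(struct sbi_pmu_counter_csr c)
{
    return c.csr_number == SBI_PMU_CSR_NUMBER_SW;
}

/* ASSUMPTION: bit i of counter_idx_mask selects counter (counter_idx_base + i). */
static inline bool sbi_pmu_counter_in_set(unsigned long base, unsigned long mask,
                                          unsigned long counter_idx)
{
    const unsigned long nbits = sizeof(unsigned long) * CHAR_BIT;

    if (counter_idx < base || counter_idx - base >= nbits)
        return false;
    return ((mask >> (counter_idx - base)) & 1UL) != 0;
}

/* Lowest counter_idx in a non-empty set, under the same assumption. */
static inline unsigned long sbi_pmu_first_counter(unsigned long base,
                                                  unsigned long mask)
{
    unsigned long i = 0;

    assert(mask != 0);
    while (((mask >> i) & 1UL) == 0)
        i++;
    return base + i;
}
```

Under that assumption, base 3 with mask 0x5 would describe the set {3, 5}.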
The OpenSBI (M-mode runtime firmware) Development Notes:
1. The OpenSBI firmware will translate event_idx and event_info into platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The OpenSBI firmware will need to know following platform dependent information:
   A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. MHPMCOUNTER)
   B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR value. This is optional and by default OpenSBI will write a value <xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
   C) Additional platform-specific programming required for selecting event_idx + event_info combination is also optional for platform.
3. All platform dependent information mentioned above, can be obtained by OpenSBI firmware from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above but 2.C will always require platform specific code.
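The default mapping in note 2.B can be written down as a small C sketch. The function name sbi_pmu_default_mhpmevent and the explicit xlen parameter are assumptions made for illustration; real firmware would know XLEN at build time. It covers only the default for HARDWARE and HARDWARE CACHE events, since for HARDWARE RAW events the proposal says event_info is programmed as-is.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for the default in note 2.B:
 *   value[19:0]      = event_idx
 *   value[XLEN-1:20] = lower XLEN-20 bits of event_info
 */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info,
                                                 unsigned int xlen)
{
    uint64_t info_mask;

    assert(xlen == 32 || xlen == 64);
    assert(event_idx <= 0xfffff);

    /* Keep only the lower XLEN-20 bits of event_info (12 or 44 bits). */
    info_mask = (UINT64_C(1) << (xlen - 20)) - 1;
    return ((event_info & info_mask) << 20) | event_idx;
}
```

For example, with XLEN = 64, event_idx = 0x10001 and event_info = 0x3, this default would give 0x310001.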
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
   The Linux RISC-V driver can be platform driver with "riscv,pmu" as DT compatible string and optional "interrupts" DT property. The "interrupts" DT property if available should specify overflow interrupt for each HART. When "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in "interrupts" DT property. The platform driver probe will:
   A) Need to ensure that underlying SBI implementation provides SBI PMU extension using sbi_probe_extension() API of arch/riscv.
   B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
   C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location of counter. The driver can skip this in driver probe and instead do this lazily in add() callback mentioned below.
2. event_init() callback
   The event_init() callback will primarily translate user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in following way:
   A) perf_event_attr.type == PERF_TYPE_HARDWARE
      event_idx.type = 0x0
      event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config
      event_info = 0
   B) perf_event_attr.type == PERF_TYPE_HW_CACHE
      event_idx.type = 0x1
      event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config
      event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id based on perf_event_attr.config
      event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config
      event_info = 0
   C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0
      event_idx.type = 0x2
      event_idx.code = 0x0
      event_info = perf_event_attr.config[62:0]
   D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1
      event_idx.type = 0xf
      event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config
      event_info = 0
   (Note: event_init() will fail if it is not able to figure out event_idx and event_info value corresponding to perf_event_attr)
   (Note: event_init() will not assign counter to perf_event because it will be done by add() callback)
3. add() callback The add() callback of Linux RISC-V PMU driver will find a free counter on current CPU/HART such that the event_idx and event_info combination is supported by the counter. To find-and-configure a counter to monitor event_idx and event_info combination from a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING call.
4. del() callback
   The del() callback of Linux RISC-V PMU driver will release or free the counter.
5. start() callback
   The start() callback of Linux RISC-V PMU driver will start the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
   The stop() callback of Linux RISC-V PMU driver will stop the counter using the SBI_PMU_COUNTER_STOP call.
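The event_init() translation in item 2 of the driver notes can be sketched as a standalone C function. This is not kernel code: the EX_PERF_TYPE_* constants are local copies of the perf type values from include/uapi/linux/perf_event.h, and the function and struct names are hypothetical. The HW_CACHE case relies on the perf ABI layout config = cache_id | (op_id << 8) | (result_id << 16). For case D the proposal only says the code is "based on perf_event_attr.config", so the sketch assumes the software code sits in config[62:0]; that placement is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of perf type values (see perf_event.h); names are hypothetical. */
enum {
    EX_PERF_TYPE_HARDWARE = 0,
    EX_PERF_TYPE_HW_CACHE = 3,
    EX_PERF_TYPE_RAW = 4,
};

/* Hypothetical result of the translation. */
struct sbi_pmu_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns false when no event_idx/event_info can be derived, as event_init() would fail. */
static inline bool sbi_pmu_translate(uint32_t type, uint64_t config,
                                     struct sbi_pmu_event *out)
{
    assert(out != NULL);
    out->event_info = 0;

    switch (type) {
    case EX_PERF_TYPE_HARDWARE:
        if (config > 9) /* beyond SBI_PMU_HW_REF_CPU_CYCLES */
            return false;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        return true;
    case EX_PERF_TYPE_HW_CACHE: {
        uint64_t cache_id = config & 0xff;
        uint64_t op_id = (config >> 8) & 0xff;
        uint64_t result_id = (config >> 16) & 0xff;

        if (cache_id > 6 || op_id > 2 || result_id > 1 || (config >> 24) != 0)
            return false;
        out->event_idx = (0x1u << 16) |
                         (uint32_t)((cache_id << 3) | (op_id << 1) | result_id);
        return true;
    }
    case EX_PERF_TYPE_RAW:
        if ((config >> 63) != 0) {
            /* ASSUMPTION: software event code is carried in config[62:0]. */
            uint64_t code = config & ~(UINT64_C(1) << 63);

            if (code > 11) /* beyond SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID */
                return false;
            out->event_idx = (0xfu << 16) | (uint32_t)code;
        } else {
            out->event_idx = 0x2u << 16;
            out->event_info = config; /* config[62:0]; bit 63 is zero here */
        }
        return true;
    default:
        return false;
    }
}
```

Keeping the translation free of counter allocation matches the note above that event_init() does not assign a counter; that step is left to add().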
Regards, Anup
|
|
Re: Proposal v4: SBI PMU Extension
Anup,
What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?
Greg
|
|
Re: Proposal v4: SBI PMU Extension
Jonathan Behrens <behrensj@...>
I like this proposal! A couple comments...
In a couple places you say "the event_info is optional and can be zero". Does this mean that SBI providers must ignore the field, or that non-zero values are reserved (meaning software must set it to zero), or that other values have SBI implementation specific semantics? Or something else?
> The OpenSBI firmware will need to know following platform dependent information: [...]
> B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR value. This is optional and by default OpenSBI will write a value <xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
This seems to contradict the previous point. By saying how OpenSBI is going to use the event_info field, you are effectively requiring that the OS properly set it. I'd rather the SBI provider just return an error if it can't figure out the proper mappings to MHPMEVENT CSR values. That way there is no risk that the S-mode software accidentally ends up tracking the wrong performance counter because it didn't know what to set event_info to. Put another way, the SBI provider is supposed to be the one that deals with platform specific issues, so the operating system doesn't have to.
It would also be nice if possible to pin down how S-mode software can learn the meanings of raw counters. Is it sufficient to look at the tuple of (mvendorid, marchid, mimpid)? Even just some commentary text with guidance could be helpful here.
Finally, I wanted to ask about the SBI_PMU_COUNTER_SET_PHYS_ADDR function. Apologies if this has been answered already, but I think this might not work well with the enhanced PMP proposal that is designed to allow most of DRAM to be marked as S/U-mode only. The proposal allows regions to be shared between M-mode and S/U-mode but presumably an implementation would prefer to require only a single shared region with all counters instead of needing to use NUM_COUNTERS number of PMP entries. This could be enabled by making the interface be SBI_PMU_COUNTER_GET_PHYS_ADDR so the firmware gets to pick the locations. On this front, another thing to watch is the memory attributes proposals coming out of the virtual memory task group: shared mappings might have performance costs (to avoid issues with mismatches between M-mode and S-mode attributes).
Jonathan
Hi All,
We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various
HARDWARE events using limited number of HARDWARE performance counters.
In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.
Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.
The event_idx is a 20bits wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
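The encoding above can be sketched as a pair of pack/unpack helpers. This is only an illustration of the bit layout stated in the proposal; the macro and function names are hypothetical and not part of any SBI header.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout from the proposal: type in bits [19:16], code in bits [15:0]. */
#define SBI_PMU_EVENT_IDX_CODE_MASK  0xffffu
#define SBI_PMU_EVENT_IDX_TYPE_SHIFT 16
#define SBI_PMU_EVENT_IDX_TYPE_MASK  0xfu

/* Hypothetical helper: pack a 4-bit type and a 16-bit code into event_idx. */
static inline uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= SBI_PMU_EVENT_IDX_TYPE_MASK);
    assert(code <= SBI_PMU_EVENT_IDX_CODE_MASK);
    return (type << SBI_PMU_EVENT_IDX_TYPE_SHIFT) | code;
}

/* Hypothetical helper: extract event_idx.type (bits [19:16]). */
static inline uint32_t sbi_pmu_event_type(uint32_t event_idx)
{
    return (event_idx >> SBI_PMU_EVENT_IDX_TYPE_SHIFT) & SBI_PMU_EVENT_IDX_TYPE_MASK;
}

/* Hypothetical helper: extract event_idx.code (bits [15:0]). */
static inline uint32_t sbi_pmu_event_code(uint32_t event_idx)
{
    return event_idx & SBI_PMU_EVENT_IDX_CODE_MASK;
}
```

For example, type 0xf with code 2 packs to 0xf0002, and unpacking 0xf0002 gives back type 0xf and code 2.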
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
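As a rough sketch of the HARDWARE CACHE encoding described above, the three sub-fields can be combined as follows. The helper names are hypothetical; only the bit positions and the type value 0x1 come from the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Pack cache_id [15:3], op_id [2:1] and result_id [0:0] into event_idx.code. */
static inline uint32_t sbi_pmu_hw_cache_code(uint32_t cache_id, uint32_t op_id,
                                             uint32_t result_id)
{
    assert(cache_id <= 0x1fffu); /* 13 bits available */
    assert(op_id <= 0x3u);       /* 2 bits available */
    assert(result_id <= 0x1u);   /* 1 bit available */
    return (cache_id << 3) | (op_id << 1) | result_id;
}

/* Full event_idx for a HARDWARE CACHE event (event_idx.type == 0x1). */
static inline uint32_t sbi_pmu_hw_cache_event_idx(uint32_t cache_id, uint32_t op_id,
                                                  uint32_t result_id)
{
    return (0x1u << 16) | sbi_pmu_hw_cache_code(cache_id, op_id, result_id);
}
```

With the enum values listed above, a DTLB (3) read (0) miss (1) gives code 0x19 and event_idx 0x10019.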
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};
In future, more events can be defined without breaking SBI call
backward-compatibility.
Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:
1. SBI_PMU_NUM_COUNTERS
Return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
Provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
return_value[XLEN-1:20] = Reserved
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
HARDWARE counter. This SBI call will fail for counters which are not
present.
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
Find and configure a counter from a set of counters which can monitor
specified event. The counter_idx_base and counter_idx_mask parameters
represent the set of counters whereas the event_idx and event_info
represent the event to monitor. Upon success the SBI call will return
the counter_idx of the counter which has been configured to monitor
specified event. This SBI call will fail if it is unable to find a
counter which can monitor specified event or the set of counters
specified via counter_idx_base and counter_idx_mask has an invalid
counter.
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8byte aligned physical address
It will set the physical address of memory location where the SBI
implementation will write the 64bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform SBI implementation to start/enable specified counter
with specified initial value. This SBI call will fail for counters
which are not present.
6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform SBI implementation to stop/disable specified counter
on the calling HART. This SBI call will fail for counters which are
not present.
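A caller of these SBI calls might decode the SBI_PMU_COUNTER_GET_CSR return value and build the counter set for SBI_PMU_COUNTER_CONFIG_MATCHING roughly as sketched below. The field positions and the 0xfff marker come from the proposal. The helper names are hypothetical, and the rule that bit i of counter_idx_mask selects counter (counter_idx_base + i) is an assumption, since the proposal only says that the two parameters together represent a set of counters.

```c
#include <stdbool.h>
#include <stdint.h>

#define SBI_PMU_CSR_NUMBER_SW 0xfffu /* CSR_Number marking a SOFTWARE counter */

struct sbi_pmu_counter_csr {
    uint32_t csr_number; /* return_value[11:0] */
    uint32_t csr_width;  /* return_value[19:12] */
};

/* Split the SBI_PMU_COUNTER_GET_CSR return value into its two fields. */
static inline struct sbi_pmu_counter_csr sbi_pmu_decode_get_csr(unsigned long return_value)
{
    struct sbi_pmu_counter_csr info;

    info.csr_number = (uint32_t)(return_value & 0xfffu);
    info.csr_width = (uint32_t)((return_value >> 12) & 0xffu);
    return info;
}

static inline bool sbi_pmu_counter_is_software(struct sbi_pmu_counter_csr info)
{
    return info.csr_number == SBI_PMU_CSR_NUMBER_SW;
}

/*
 * ASSUMPTION (not stated in the proposal): bit i of counter_idx_mask
 * selects the counter with index counter_idx_base + i.
 */
static inline bool sbi_pmu_counter_in_set(unsigned long counter_idx,
                                          unsigned long counter_idx_base,
                                          unsigned long counter_idx_mask)
{
    unsigned long bits = sizeof(unsigned long) * 8;

    if (counter_idx < counter_idx_base || counter_idx - counter_idx_base >= bits)
        return false;
    return (counter_idx_mask >> (counter_idx - counter_idx_base)) & 1ul;
}
```

Under that assumption, counter_idx_base = 3 with counter_idx_mask = 0x5 describes the set {3, 5}.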
The OpenSBI (M-mode runtime firmware) Development Notes:
1. The OpenSBI firmware will translate event_idx and event_info into
platform dependent MHPMEVENT CSR value before starting/enabling a
HARDWARE counter.
2. The OpenSBI firmware will need to know following platform dependent
information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. MHPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional and by default OpenSBI will write a value
<xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx
and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
C) Any additional platform-specific programming required to select
an event_idx + event_info combination. This is also optional for
a platform.
3. All platform dependent information mentioned above can be obtained
by OpenSBI firmware from platform specific code. The DT/ACPI can
also be used to describe 2.A and 2.B mentioned above but 2.C will
always require platform specific code.
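The default MHPMEVENT value described in 2.B can be sketched as below. This is a minimal illustration, not OpenSBI code: the function name is hypothetical, and XLEN is passed as a parameter only so that both the 32-bit and 64-bit cases can be shown in one place.

```c
#include <stdint.h>

/*
 * Default value <xyz> from 2.B: the lower 20 bits are event_idx and the
 * upper XLEN-20 bits are the lower XLEN-20 bits of event_info.
 * xlen is expected to be 32 or 64.
 */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info,
                                                 unsigned int xlen)
{
    /* The shift drops the top 20 bits of event_info, leaving its low 44 bits. */
    uint64_t value = (uint64_t)(event_idx & 0xfffffu) | (event_info << 20);

    if (xlen == 32)
        value &= 0xffffffffull; /* keeps only the low 12 bits of event_info */
    return value;
}
```

For instance, event_idx 0x00003 with event_info 0x1 gives 0x100003 on a 64-bit platform.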
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
The Linux RISC-V driver can be platform driver with "riscv,pmu"
as DT compatible string and optional "interrupts" DT property. The
"interrupts" DT property if available should specify overflow
interrupt for each HART. When "interrupts" DT property is present,
we might also need another DT property for mapping HARTID to entries
in "interrupts" DT property. The platform driver probe will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. If the counter is a SOFTWARE counter then use the
SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
of counter. The driver can skip this in driver probe and
instead do this lazily in add() callback mentioned below.
2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id
based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by add() callback)
3. add() callback
The add() callback of Linux RISC-V PMU driver will find a free
counter on current CPU/HART such that the event_idx and event_info
combination is supported by the counter. To find-and-configure
a counter to monitor event_idx and event_info combination from
a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
call.
4. del() callback
The del() callback of Linux RISC-V PMU driver will release or
free the counter.
5. start() callback
The start() callback of Linux RISC-V PMU driver will start the
counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop the
counter using the SBI_PMU_COUNTER_STOP call.
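The event_init() translation in item 2 above could look roughly like the following sketch. It is not the Linux driver. The DEMO_PERF_TYPE_* values mirror enum perf_type_id in the Linux uapi perf_event.h, and the HW_CACHE config split (cache id in bits [7:0], op in [15:8], result in [23:16]) is the perf convention. The struct and function names are hypothetical, and taking the SOFTWARE code from config[15:0] is an assumption because the proposal does not say which config bits carry it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirror enum perf_type_id values from Linux's uapi perf_event.h. */
#define DEMO_PERF_TYPE_HARDWARE 0
#define DEMO_PERF_TYPE_HW_CACHE 3
#define DEMO_PERF_TYPE_RAW      4

struct sbi_pmu_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns false when no event_idx/event_info can be derived. */
static inline bool riscv_pmu_map_event(uint32_t attr_type, uint64_t attr_config,
                                       struct sbi_pmu_event *out)
{
    uint64_t cache_id, op_id, result_id;

    switch (attr_type) {
    case DEMO_PERF_TYPE_HARDWARE:
        if (attr_config >= 10) /* beyond SBI_PMU_HW_REF_CPU_CYCLES */
            return false;
        out->event_idx = (0x0u << 16) | (uint32_t)attr_config;
        out->event_info = 0;
        return true;
    case DEMO_PERF_TYPE_HW_CACHE:
        cache_id = attr_config & 0xff;
        op_id = (attr_config >> 8) & 0xff;
        result_id = (attr_config >> 16) & 0xff;
        if (cache_id >= 7 || op_id >= 3 || result_id >= 2 || (attr_config >> 24))
            return false;
        out->event_idx = (0x1u << 16) |
                         (uint32_t)((cache_id << 3) | (op_id << 1) | result_id);
        out->event_info = 0;
        return true;
    case DEMO_PERF_TYPE_RAW:
        if (attr_config >> 63) {
            /* ASSUMPTION: SOFTWARE event code is carried in config[15:0]. */
            uint64_t code = attr_config & 0xffff;

            if (code >= 12) /* beyond SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID */
                return false;
            out->event_idx = (0xfu << 16) | (uint32_t)code;
            out->event_info = 0;
        } else {
            out->event_idx = 0x2u << 16;
            out->event_info = attr_config; /* bit 63 is zero, so this is config[62:0] */
        }
        return true;
    default:
        return false;
    }
}
```

The counter itself is still chosen later, in add(), through SBI_PMU_COUNTER_CONFIG_MATCHING; this helper only produces the two values passed to that call.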
Regards,
Anup
|
|
Re: Proposal v4: SBI PMU Extension
On Thu, Aug 6, 2020 at 11:36 AM Jonathan Behrens <behrensj@...> wrote:
> It would also be nice if possible to pin down how S-mode software can learn the meanings of raw counters. Is it sufficient to look at the tuple of (mvendorid, marchid, mimpid)? Even just some commentary text with guidance could be helpful here.
But by definition RAW events are all implementation-specific (unless or until an arch extension standardizes a set of RAW events and their encodings). At best, the software discovery method that the tech-config TG has started developing, could maybe be used to provide this implementation-specific information. (Or unstandardized code could do its own discovery based on looking at things like mvendorid/marchid/mimpid.)
> Finally, I wanted to ask about the SBI_PMU_COUNTER_SET_PHYS_ADDR function. Apologies if this has been answered already, but I think this might not work well with the enhanced PMP proposal that is designed to allow most of DRAM to be marked as S/U-mode only.
Interesting point. Requiring M mode to have access to most S/U mode memory would defeat a lot of the purpose and security benefits of Enhanced PMP (aka PMPv2).
> The proposal allows regions to be shared between M-mode and S/U-mode but presumably an implementation would prefer to require only a single shared region with all counters instead of needing to use NUM_COUNTERS number of PMP entries. This could be enabled by making the interface be SBI_PMU_COUNTER_GET_PHYS_ADDR so the firmware gets to pick the locations.
A key question is, in any case, who allocates the memory where software counters are placed? How is that memory allocated in coordination with the OS or hypervisor?
Shouldn't the OS or hypervisor do the allocation and then tell M-mode the address of that block of memory? Then M-mode can allocate space for individual counters from that. And, in the context of PMPv2, that block of memory would be allocated from an existing "shared" PMP region.
> On this front, another thing to watch is the memory attributes proposals coming out of the virtual memory task group: shared mappings might have performance costs (to avoid issues with mismatches between M-mode and S-mode attributes).
All the more reason that the OS/hypervisor should be allocating the block of memory for software counters. The OS/hypervisor will be aware of the memory attributes set up in the page tables and can make sure to use appropriate attribute settings in the PTEs that map this memory.
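One way to picture the allocation scheme suggested above (the OS or hypervisor hands M-mode a single block, and M-mode carves per-counter slots out of it) is the sketch below. Everything here is hypothetical: the struct, the function name, and the fixed 8-byte slot size, which is taken from the 64bit SOFTWARE counter size mentioned in the proposal. It is not an interface anyone in the thread has defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the block supplied by the OS/hypervisor. */
struct sw_counter_block {
    uint64_t base_phys;  /* physical address of the shared block */
    uint64_t size_bytes; /* size of the shared block in bytes */
};

/*
 * Compute the physical address of the slot for the sw_index-th SOFTWARE
 * counter, using one 8-byte slot per counter. Fails if the block is not
 * 8-byte aligned or is too small to hold that slot.
 */
static inline bool sw_counter_slot(const struct sw_counter_block *blk,
                                   uint64_t sw_index, uint64_t *slot_phys)
{
    if (blk->base_phys & 0x7u)
        return false;
    if (sw_index >= blk->size_bytes / 8)
        return false;
    *slot_phys = blk->base_phys + sw_index * 8;
    return true;
}
```

With a layout like this, a single shared PMP region covering the block would be enough for all SOFTWARE counters, instead of one PMP entry per counter.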
Greg
|
|
Re: Proposal v4: SBI PMU Extension

Anup Patel
Hi Greg,
The SBI_PMU_NUM_COUNTERS call will return total number of counters (HARDWARE as well as SOFTWARE).
We can distinguish between HARDWARE and SOFTWARE counters using the CSR_Number returned by SBI_PMU_COUNTER_GET_CSR call.
(Note: CSR_Number = 0xfff means it is SOFTWARE counter)
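The classification described above can be sketched as a loop over all counters. This is a simplified illustration: get_csr_fn is a hypothetical stand-in for the SBI_PMU_COUNTER_GET_CSR call that folds the error and the value into one long (negative meaning failure), mock_get_csr is invented test data, and the sketch assumes that valid counter_idx values run from 0 to N-1, which the proposal does not state.

```c
#include <stdint.h>

/* Hypothetical wrapper type for SBI_PMU_COUNTER_GET_CSR; negative means failure. */
typedef long (*get_csr_fn)(unsigned long counter_idx);

struct pmu_counter_summary {
    unsigned long hw; /* counters backed by a CSR */
    unsigned long sw; /* counters with CSR_Number == 0xfff */
};

/* ASSUMPTION: counter_idx values 0..num_counters-1 are the ones to query. */
static inline struct pmu_counter_summary pmu_classify_counters(unsigned long num_counters,
                                                               get_csr_fn get_csr)
{
    struct pmu_counter_summary s = { 0, 0 };

    for (unsigned long i = 0; i < num_counters; i++) {
        long ret = get_csr(i);

        if (ret < 0)
            continue; /* counter not present */
        if (((unsigned long)ret & 0xfffu) == 0xfffu)
            s.sw++;
        else
            s.hw++;
    }
    return s;
}

/* Mock data: three HARDWARE counters and two SOFTWARE counters, all 64 bits wide. */
static inline long mock_get_csr(unsigned long counter_idx)
{
    static const long csr_numbers[] = { 0xc00, 0xc02, 0xc03, 0xfff, 0xfff };

    if (counter_idx >= sizeof(csr_numbers) / sizeof(csr_numbers[0]))
        return -1;
    return (64L << 12) | csr_numbers[counter_idx];
}
```

A driver would run this once at probe time, after SBI_PMU_NUM_COUNTERS, so the N extra calls are a one-time cost.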
Regards,
Anup
From: Greg Favor <gfavor@...>
Sent: 07 August 2020 00:01
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; Andrew Waterman <andrew@...>
Subject: Re: Proposal v4: SBI PMU Extension
Anup,
What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?
On Thu, Aug 6, 2020 at 6:52 AM Anup Patel < Anup.Patel@...> wrote:
[...]
|
|
Re: Proposal v4: SBI PMU Extension
Anup,
Wouldn't software want to more easily and directly know from SBI_PMU_NUM_COUNTERS how many hardware counters and software counters there are, instead of having to then call SBI_PMU_COUNTER_GET_CSR N times to figure that out?
Or are you expecting that software is going to have to call SBI_PMU_COUNTER_GET_CSR N times in any case (after first calling SBI_PMU_NUM_COUNTERS to get 'N')?
Greg
P.S. I'm guessing that the valid/supported counters are the first N values of counter_idx, i.e. 0 to N-1. Yes?
Hi Greg,
The SBI_PMU_NUM_COUNTERS call will return total number of counters (HARDWARE as well as SOFTWARE).
We can distinguish between HARDWARE and SOFTWARE counters using the CSR_Number returned by SBI_PMU_COUNTER_GET_CSR call.
(Note: CSR_Number = 0xfff means it is SOFTWARE counter)
Regards,
Anup
Anup,
What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?
Hi All,
We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various
HARDWARE events using limited number of HARDWARE performance counters.
In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.
Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.
The event_idx is a 20bits wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};
In future, more events can be defined without breaking SBI call
backward-compatibility.
Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:
1. SBI_PMU_NUM_COUNTERS
Return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
Provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
return_value[XLEN-1:20] = Reserved
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
HARDWARE counter. This SBI call will fail for counters which are not
present.
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
Find and configure a counter from a set of counters which can monitor
specified event. The counter_idx_base and counter_idx_mask parameters
represent the set of counters whereas the event_idx and event_info
represent the event to monitor. Upon success the SBI call will return
the counter_idx of the counter which has been configured to monitor
specified event. This SBI call will fail if it is unable to find a
counter which can monitor specified event or the set of counters
specified via counter_idx_base and counter_idx_mask has an invalid
counter.
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8-byte aligned physical address
It will set the physical address of memory location where the SBI
implementation will write the 64-bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform SBI implementation to start/enable specified counter
with specified initial value. This SBI call will fail for counters
which are not present.
6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform SBI implementation to stop/disable specified counter
on the calling HART. This SBI call will fail for counters which are
not present.
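To make the SBI_PMU_COUNTER_GET_CSR return encoding and the counter set used by SBI_PMU_COUNTER_CONFIG_MATCHING concrete, here is a small C sketch. All names are hypothetical. The proposal does not spell out how counter_idx_base and counter_idx_mask select counters; the sketch ASSUMES that bit i of the mask selects counter counter_idx_base + i.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value that marks a SOFTWARE counter (from the proposal). */
#define SBI_PMU_CSR_SOFTWARE 0xfffu

struct sbi_pmu_counter_csr {
    uint32_t csr_number; /* return_value[11:0]  */
    uint32_t csr_width;  /* return_value[19:12] */
};

/* Decode the value returned by SBI_PMU_COUNTER_GET_CSR. */
static inline struct sbi_pmu_counter_csr sbi_pmu_decode_csr(unsigned long ret)
{
    struct sbi_pmu_counter_csr c;

    assert(sizeof(ret) * 8 >= 20); /* both fields must fit in the value */
    c.csr_number = (uint32_t)(ret & 0xfffu);
    c.csr_width = (uint32_t)((ret >> 12) & 0xffu);
    return c;
}

static inline bool sbi_pmu_counter_is_software(struct sbi_pmu_counter_csr c)
{
    return c.csr_number == SBI_PMU_CSR_SOFTWARE;
}

/*
 * ASSUMPTION: bit i of counter_idx_mask selects counter
 * (counter_idx_base + i). Returns true if counter_idx is in the set.
 */
static inline bool sbi_pmu_counter_in_set(unsigned long counter_idx_base,
                                          unsigned long counter_idx_mask,
                                          unsigned long counter_idx)
{
    unsigned long bit;

    if (counter_idx < counter_idx_base)
        return false;
    bit = counter_idx - counter_idx_base;
    if (bit >= sizeof(counter_idx_mask) * 8)
        return false;
    return ((counter_idx_mask >> bit) & 1ul) != 0;
}
```

For example, a return value of 0x40c03 decodes to CSR_Number 0xc03 with CSR_Width 64, and under the stated assumption base 3 with mask 0x5 selects counters 3 and 5.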
The OpenSBI (M-mode runtime firmware) Development Notes:
1. The OpenSBI firmware will translate event_idx and event_info into
platform dependent MHPMEVENT CSR value before starting/enabling a
HARDWARE counter.
2. The OpenSBI firmware will need to know following platform dependent
information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. MHPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional and by default OpenSBI will write a value
<xyz> to MHPMEVENT CSR where lower 20 bits of <xyz> is event_idx
and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
C) Any additional platform-specific programming required for selecting
an event_idx + event_info combination. This is also optional for a platform.
3. All platform dependent information mentioned above, can be obtained
by OpenSBI firmware from platform specific code. The DT/ACPI can
also be used to describe 2.A and 2.B mentioned above but 2.C will
always require platform specific code.
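The default MHPMEVENT value described in 2.B can be sketched in C as below. This is only an illustration: it ASSUMES XLEN = 64, the function names are hypothetical, and the RAW case simply follows the earlier statement that event_info is programmed as-is for HARDWARE RAW events.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Default value <xyz> from note 2.B, assuming XLEN = 64:
 * lower 20 bits are event_idx, upper 44 bits are the lower
 * 44 bits of event_info (the shift drops the remaining bits).
 */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info)
{
    assert(event_idx <= 0xfffffu);
    return (event_info << 20) | (uint64_t)event_idx;
}

/*
 * Value to program into MHPMEVENT: RAW events (type 0x2) pass
 * event_info through unchanged, everything else uses the default.
 */
static inline uint64_t sbi_pmu_mhpmevent_value(uint32_t event_idx,
                                               uint64_t event_info)
{
    if (((event_idx >> 16) & 0xfu) == 0x2u)
        return event_info;
    return sbi_pmu_default_mhpmevent(event_idx, event_info);
}
```

A platform that provides its own mapping for 2.B would replace sbi_pmu_default_mhpmevent() with a table lookup or similar platform specific code.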
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
The Linux RISC-V driver can be platform driver with "riscv,pmu"
as DT compatible string and optional "interrupts" DT property. The
"interrupts" DT property if available should specify overflow
interrupt for each HART. When "interrupts" DT property is present,
we might also need another DT property for mapping HARTID to entries
in "interrupts" DT property. The platform driver probe will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. If the counter is a SOFTWARE counter then use the
SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
of counter. The driver can skip this in driver probe and
instead do this lazily in add() callback mentioned below.
2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id
based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by add() callback)
3. add() callback
The add() callback of Linux RISC-V PMU driver will find a free
counter on current CPU/HART such that the event_idx and event_info
combination is supported by the counter. To find-and-configure
a counter to monitor event_idx and event_info combination from
a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
call.
4. del() callback
The del() callback of Linux RISC-V PMU driver will release or
free the counter.
5. start() callback
The start() callback of Linux RISC-V PMU driver will start the
counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop the
counter using the SBI_PMU_COUNTER_STOP call.
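The event_init() translation in note 2 above could look roughly like the following C sketch. The struct and function names are hypothetical and this is not actual driver code. The PERF_TYPE_* values and the HW cache config layout (cache_id | op_id << 8 | result_id << 16) are those of <linux_source>/include/uapi/linux/perf_event.h. For case D, the sketch ASSUMES that config[62:0] directly holds the sbi_pmu_sw_id value, which the proposal does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values of enum perf_type_id in include/uapi/linux/perf_event.h. */
#define PERF_TYPE_HARDWARE 0u
#define PERF_TYPE_HW_CACHE 3u
#define PERF_TYPE_RAW      4u

struct sbi_pmu_event {
    uint32_t event_idx;  /* type in bits 19:16, code in bits 15:0 */
    uint64_t event_info;
};

/* Returns 0 on success, -1 if no event_idx/event_info can be derived. */
static inline int sbi_pmu_map_perf_event(uint32_t type, uint64_t config,
                                         struct sbi_pmu_event *out)
{
    assert(out != NULL);
    out->event_info = 0;

    switch (type) {
    case PERF_TYPE_HARDWARE: /* case A */
        if (config > 9) /* last defined: SBI_PMU_HW_REF_CPU_CYCLES */
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        return 0;
    case PERF_TYPE_HW_CACHE: { /* case B */
        uint32_t cache_id = (uint32_t)(config & 0xffu);
        uint32_t op_id = (uint32_t)((config >> 8) & 0xffu);
        uint32_t result_id = (uint32_t)((config >> 16) & 0xffu);

        if ((config >> 24) || cache_id > 6 || op_id > 2 || result_id > 1)
            return -1;
        out->event_idx = (0x1u << 16) |
                         (cache_id << 3) | (op_id << 1) | result_id;
        return 0;
    }
    case PERF_TYPE_RAW:
        if (config >> 63) { /* case D: SOFTWARE event */
            uint64_t code = config & ~(1ull << 63);

            if (code > 11) /* last: SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID */
                return -1;
            out->event_idx = (0xfu << 16) | (uint32_t)code;
        } else { /* case C: HARDWARE RAW event */
            out->event_idx = 0x2u << 16;
            out->event_info = config; /* bit 63 is 0 here */
        }
        return 0;
    default:
        return -1;
    }
}
```

The resulting event_idx and event_info would then be passed to SBI_PMU_COUNTER_CONFIG_MATCHING from the add() callback.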
Regards,
Anup
Re: Proposal v4: SBI PMU Extension

Anup Patel
Hi Greg,
We want to allow RISC-V implementation specific CSRs (apart from various HPMCOUNTER CSRs) as HARDWARE counters. Due to this reason, we have decoupled counter_idx from CSR number and we treat counter_idx as a logical number assigned to a HARDWARE/SOFTWARE counter.
The S-mode software needs to call SBI_PMU_COUNTER_GET_CSR only once for each counter which can be done at boot-time OR lazily once before using the counter.
Regards,
Anup
From: Greg Favor <gfavor@...>
Sent: 07 August 2020 11:13
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; Andrew Waterman <andrew@...>
Subject: Re: Proposal v4: SBI PMU Extension
Wouldn't software want to more easily and directly know from SBI_PMU_NUM_COUNTERS how many hardware counters and software counters there are, instead of having to then call SBI_PMU_COUNTER_GET_CSR N times to figure that out?
Or are you expecting that software is going to have to call SBI_PMU_COUNTER_GET_CSR N times in any case (after first calling SBI_PMU_NUM_COUNTERS to get 'N')?
P.S. I'm guessing that the valid/supported counters are the first N values of counter_idx from 0 to N-1. Yes?
On Thu, Aug 6, 2020 at 10:33 PM Anup Patel < Anup.Patel@...> wrote:
Hi Greg,
The SBI_PMU_NUM_COUNTERS call will return total number of counters (HARDWARE as well as SOFTWARE).
We can distinguish between HARDWARE and SOFTWARE counters using the CSR_Number returned by SBI_PMU_COUNTER_GET_CSR call.
(Note: CSR_Number = 0xfff means it is SOFTWARE counter)
Regards,
Anup
Anup,
What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?
Hi All,
We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various
HARDWARE events using limited number of HARDWARE performance counters.
In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.
Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.
The event_idx is a 20-bit wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};
In future, more events can be defined without breaking SBI call
backward-compatibility.
Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:
1. SBI_PMU_NUM_COUNTERS
Return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
Provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
return_value[XLEN-1:20] = Reserved
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
HARDWARE counter. This SBI call will fail for counters which are not
present.
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
Find and configure a counter from a set of counters which can monitor
specified event. The counter_idx_base and counter_idx_mask parameters
represent the set of counters whereas the event_idx and event_info
represent the event to monitor. Upon success the SBI call will return
the counter_idx of the counter which has been configured to monitor
specified event. This SBI call will fail if it is unable to find a
counter which can monitor specified event or the set of counters
specified via counter_idx_base and counter_idx_mask has an invalid
counter.
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8-byte aligned physical address
It will set the physical address of memory location where the SBI
implementation will write the 64-bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform SBI implementation to start/enable specified counter
with specified initial value. This SBI call will fail for counters
which are not present.
6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform SBI implementation to stop/disable specified counter
on the calling HART. This SBI call will fail for counters which are
not present.
The OpenSBI (M-mode runtime firmware) Development Notes:
1. The OpenSBI firmware will translate event_idx and event_info into
platform dependent MHPMEVENT CSR value before starting/enabling a
HARDWARE counter.
2. The OpenSBI firmware will need to know following platform dependent
information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. MHPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional and by default OpenSBI will write a value
<xyz> to MHPMEVENT CSR where lower 20 bits of <xyz> is event_idx
and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
C) Any additional platform-specific programming required for selecting
an event_idx + event_info combination. This is also optional for a platform.
3. All platform dependent information mentioned above, can be obtained
by OpenSBI firmware from platform specific code. The DT/ACPI can
also be used to describe 2.A and 2.B mentioned above but 2.C will
always require platform specific code.
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
The Linux RISC-V driver can be platform driver with "riscv,pmu"
as DT compatible string and optional "interrupts" DT property. The
"interrupts" DT property if available should specify overflow
interrupt for each HART. When "interrupts" DT property is present,
we might also need another DT property for mapping HARTID to entries
in "interrupts" DT property. The platform driver probe will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. If the counter is a SOFTWARE counter then use the
SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
of counter. The driver can skip this in driver probe and
instead do this lazily in add() callback mentioned below.
2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id
based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by add() callback)
3. add() callback
The add() callback of Linux RISC-V PMU driver will find a free
counter on current CPU/HART such that the event_idx and event_info
combination is supported by the counter. To find-and-configure
a counter to monitor event_idx and event_info combination from
a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
call.
4. del() callback
The del() callback of Linux RISC-V PMU driver will release or
free the counter.
5. start() callback
The start() callback of Linux RISC-V PMU driver will start the
counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop the
counter using the SBI_PMU_COUNTER_STOP call.
Regards,
Anup