
Anup Patel
Hi All,
We don't have a dedicated RISC-V PMU extension, but we do have HARDWARE performance counters such as the CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V implementation can support monitoring various HARDWARE events using a limited number of HPMCOUNTER CSRs.
In addition to HARDWARE performance counters, an SBI implementation (e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for events such as the number of RFENCEs, number of IPIs, number of misaligned load/store instructions, number of illegal instructions, etc.
We propose an SBI PMU extension, which will help S-mode (or VS-mode) software to discover and configure HARDWARE/SOFTWARE counters. The SBI PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE counters, which include the CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and SOFTWARE counters provided by the SBI implementation.
Using the SBI PMU extension, an SBI implementation (OpenSBI, KVM, or Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters and events to S-mode (or VS-mode) software.
To define the SBI PMU extension, we first define counter_idx, which is a logical number assigned to a counter, and event_idx, which is an encoded number representing the HARDWARE/SOFTWARE event to be monitored. A HARDWARE/SOFTWARE event can also have additional configuration/details referred to as event_info.
The SBI PMU event_idx is a 20-bit wide number encoded as follows:
    event_idx[19:16] = type
    event_idx[15:0]  = code
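The encoding above can be sketched as a pair of pack/unpack helpers. This is only an illustration of the bit layout; the function names are hypothetical and are not part of the proposal.

```c
#include <stdint.h>

/* Pack a 4-bit type and a 16-bit code into a 20-bit event_idx.
 * (Helper names are illustrative, not defined by the proposal.) */
static inline uint32_t sbi_pmu_make_event_idx(uint32_t type, uint32_t code)
{
    return ((type & 0xfu) << 16) | (code & 0xffffu);
}

/* Extract event_idx[19:16]. */
static inline uint32_t sbi_pmu_event_idx_type(uint32_t event_idx)
{
    return (event_idx >> 16) & 0xfu;
}

/* Extract event_idx[15:0]. */
static inline uint32_t sbi_pmu_event_idx_code(uint32_t event_idx)
{
    return event_idx & 0xffffu;
}
```

For example, type 0xf with code 4 packs to 0xf0004, and unpacking that value gives back 0xf and 4.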
If event_idx.type == 0x0 then it is a HARDWARE event. For a HARDWARE event, the event_info is not required, whereas the event_idx.code can be one of the following values:

    enum sbi_pmu_hw_id {
        SBI_PMU_HW_CPU_CYCLES = 0,
        SBI_PMU_HW_INSTRUCTIONS = 1,
        SBI_PMU_HW_CACHE_REFERENCES = 2,
        SBI_PMU_HW_CACHE_MISSES = 3,
        SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
        SBI_PMU_HW_BRANCH_MISSES = 5,
        SBI_PMU_HW_BUS_CYCLES = 6,
        SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
        SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
        SBI_PMU_HW_REF_CPU_CYCLES = 9,
        SBI_PMU_HW_MAX, /* non-ABI */
    };

(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is a HARDWARE CACHE event. For a HARDWARE CACHE event, the event_info is not required, whereas the event_idx.code is encoded as follows:
    event_idx.code[15:3] = cache_id
    event_idx.code[2:1]  = op_id
    event_idx.code[0:0]  = result_id

    enum sbi_pmu_hw_cache_id {
        SBI_PMU_HW_CACHE_L1D = 0,
        SBI_PMU_HW_CACHE_L1I = 1,
        SBI_PMU_HW_CACHE_LL = 2,
        SBI_PMU_HW_CACHE_DTLB = 3,
        SBI_PMU_HW_CACHE_ITLB = 4,
        SBI_PMU_HW_CACHE_BPU = 5,
        SBI_PMU_HW_CACHE_NODE = 6,
        SBI_PMU_HW_CACHE_MAX, /* non-ABI */
    };

    enum sbi_pmu_hw_cache_op_id {
        SBI_PMU_HW_CACHE_OP_READ = 0,
        SBI_PMU_HW_CACHE_OP_WRITE = 1,
        SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
        SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
    };

    enum sbi_pmu_hw_cache_op_result_id {
        SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
        SBI_PMU_HW_CACHE_RESULT_MISS = 1,
        SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
    };

(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
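As a rough sketch of the HARDWARE CACHE encoding just described, the helpers below pack cache_id, op_id and result_id into event_idx.code and then into a full event_idx with type 0x1. The helper names are made up for illustration; only the bit positions come from the proposal.

```c
#include <stdint.h>

/* Build event_idx.code for a HARDWARE CACHE event:
 * code[15:3] = cache_id, code[2:1] = op_id, code[0] = result_id.
 * (Hypothetical helper name.) */
static inline uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    return ((cache_id & 0x1fffu) << 3) | ((op_id & 0x3u) << 1) |
           (result_id & 0x1u);
}

/* Build the full 20-bit event_idx with type 0x1 (HARDWARE CACHE). */
static inline uint32_t sbi_pmu_cache_event_idx(uint32_t cache_id,
                                               uint32_t op_id,
                                               uint32_t result_id)
{
    return (0x1u << 16) | sbi_pmu_cache_code(cache_id, op_id, result_id);
}
```

With these values, an L1D read miss (cache_id 0, op_id 0, result_id 1) gives event_idx 0x10001, and a DTLB write miss (cache_id 3, op_id 1, result_id 1) gives 0x1001b.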
If event_idx.type == 0x2 then it is a HARDWARE RAW event. For a HARDWARE RAW event, the event_idx.code should be zero, and the event_info parameter passed to the SBI_PMU_COUNTER_SET_EVENT call (described below) will hold the RAW event value to be programmed in the MHPMEVENT CSR (i.e. the SBI implementation will not derive the MHPMEVENT CSR value from event_idx + event_info).
If event_idx.type == 0xf then it is a SOFTWARE event. For a SOFTWARE event, the event_info is not required, whereas the event_idx.code can be one of the following:

    enum sbi_pmu_sw_id {
        SBI_PMU_SW_MISALIGNED_LOAD = 0,
        SBI_PMU_SW_MISALIGNED_STORE = 1,
        SBI_PMU_SW_ILLEGAL_INSN = 2,
        SBI_PMU_SW_LOCAL_SET_TIMER = 3,
        SBI_PMU_SW_LOCAL_IPI = 4,
        SBI_PMU_SW_LOCAL_FENCE_I = 5,
        SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
        SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
        SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
        SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
        SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
        SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
        SBI_PMU_SW_MAX, /* non-ABI */
    };
In future, more events can be defined without breaking compatibility of the existing SBI calls.
Using the definitions of counter_idx and event_idx, we can potentially have the following SBI calls:
1. SBI_PMU_NUM_COUNTERS
   This call will return the number of COUNTERs.

2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
   1) counter_idx
   It will provide the CSR_Number and CSR_Width of the underlying counter. The value returned by the SBI call is encoded as follows:
       return_value[11:0]  = CSR_Number
       return_value[19:12] = CSR_Width (number of bits implemented in HW)
   If CSR_Number == 0xfff then it is a SOFTWARE counter, otherwise it is a HARDWARE counter.
   This SBI call will fail for counters which are not present.

3. SBI_PMU_COUNTER_SET_EVENT
   This call takes three parameters:
   1) counter_idx
   2) event_idx
   3) event_info
   It will select an event to be monitored by the given counter. If this SBI call is not used to select an event for a counter, then the counter will monitor the default event selected for it at boot-time.
   This SBI call will fail for counters which are not present. It will also fail if the specified event_idx + event_info combination is not supported by the given counter.

4. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters:
   1) counter_idx
   2) 8-byte aligned physical address
   It will set the physical address of the memory location where the SBI implementation will write the 64-bit SOFTWARE counter.
   This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
   This call takes two parameters:
   1) counter_idx
   2) initial_value
   It will inform the SBI implementation to start/enable the specified counter with the specified initial value.
   This SBI call will fail for counters which are not present.

6. SBI_PMU_COUNTER_STOP
   This call takes one parameter:
   1) counter_idx
   It will inform the SBI implementation to stop/disable the specified counter on the calling HART.
   This SBI call will fail for counters which are not present.
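To illustrate how S-mode software might interpret the value returned by SBI_PMU_COUNTER_GET_CSR, here is a minimal decoding sketch. The struct, macro and function names are assumptions made for this example; only the field positions and the 0xfff marker come from the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value that marks a SOFTWARE counter (macro name is hypothetical). */
#define SBI_PMU_CSR_SOFTWARE 0xfffu

/* Decoded view of the GET_CSR return value (hypothetical struct). */
struct sbi_pmu_counter_csr {
    uint16_t csr_number; /* return_value[11:0]  */
    uint8_t  csr_width;  /* return_value[19:12] */
};

static inline struct sbi_pmu_counter_csr sbi_pmu_decode_csr(unsigned long ret)
{
    struct sbi_pmu_counter_csr c;

    c.csr_number = (uint16_t)(ret & 0xfffu);
    c.csr_width  = (uint8_t)((ret >> 12) & 0xffu);
    return c;
}

static inline bool sbi_pmu_counter_is_software(struct sbi_pmu_counter_csr c)
{
    return c.csr_number == SBI_PMU_CSR_SOFTWARE;
}
```

For instance, a return value of 0x40c00 decodes to CSR number 0xc00 (the CYCLE CSR) with a width of 64 bits, which is a HARDWARE counter.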
The M-mode runtime firmware (OpenSBI) Development Notes:
1. The M-mode runtime firmware will have to translate the SBI PMU event_idx and event_info into a platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The M-mode runtime firmware (OpenSBI) will need to know the following platform dependent information:
   A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. HPMCOUNTER)
   B) Mapping of event_idx for a HARDWARE/CACHE event to the MHPMEVENT CSR value. This is optional for a platform. By default, OpenSBI will write a value <xyz> to the MHPMEVENT CSR, where the lower 20 bits of <xyz> are event_idx and the upper XLEN-20 bits of <xyz> are the lower XLEN-20 bits of event_info
   C) Additional platform-specific programming required for selecting an event_idx + event_info combination. This is also optional for a platform.
3. All platform dependent information mentioned above can be obtained by the M-mode runtime firmware (OpenSBI) from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above, but 2.C will always require platform specific code.
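The default mapping in 2.B, together with the HARDWARE RAW rule (event_info is the value to program), could look roughly like the sketch below. The function name and the xlen parameter are assumptions for illustration, and truncating the RAW value to XLEN bits is also an assumption; this is not OpenSBI code.

```c
#include <stdint.h>

/* Sketch of a default MHPMEVENT value (hypothetical helper):
 *  - HARDWARE RAW (type 0x2): event_info is used as-is.
 *  - otherwise: low 20 bits = event_idx, upper XLEN-20 bits = low
 *    XLEN-20 bits of event_info.
 * xlen is 32 or 64; results are truncated to xlen bits (assumption). */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info,
                                                 unsigned int xlen)
{
    uint64_t xlen_mask = (xlen >= 64) ? UINT64_MAX
                                      : ((UINT64_C(1) << xlen) - 1);

    if (((event_idx >> 16) & 0xfu) == 0x2u)
        return event_info & xlen_mask;

    return ((event_info << 20) | (event_idx & 0xfffffu)) & xlen_mask;
}
```

A platform that needs a different mapping would replace this with its own translation, as items 2.B and 2.C describe.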
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
   The Linux RISC-V driver can be a platform driver with "riscv,sbi-pmu" as the DT compatible string and an optional "interrupts" DT property. The "interrupts" DT property, if available, should specify an edge-triggered overflow interrupt for each HART. When the "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in the "interrupts" DT property.
   The platform driver probe will:
   A) Need to ensure that the underlying SBI implementation provides the SBI PMU extension using the sbi_probe_extension() API of arch/riscv.
   B) Detect the number of counters using the SBI_PMU_NUM_COUNTERS call
   C) Get CSR details of each counter using the SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set the memory location of the counter. The driver can skip this in driver probe and instead do it lazily in the add() callback mentioned below.
2. event_init() callback
   The event_init() callback will primarily translate the user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in the following way:
   A) perf_event_attr.type == PERF_TYPE_HARDWARE
          event_idx.type = 0x0
          event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config
          event_info = 0
   B) perf_event_attr.type == PERF_TYPE_HW_CACHE
          event_idx.type = 0x1
          event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config
          event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id based on perf_event_attr.config
          event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config
          event_info = 0
   C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0
          event_idx.type = 0x2
          event_idx.code = 0x0
          event_info = perf_event_attr.config[62:0]
   D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1
          event_idx.type = 0xf
          event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config
          event_info = 0
   (Note: event_init() will fail if it is not able to figure out the event_idx and event_info values corresponding to perf_event_attr)
   (Note: event_init() will not assign a counter to the perf_event because that will be done by add())
3. add() callback
   The add() callback of the Linux RISC-V PMU driver will find a free counter on the current CPU/HART such that the perf_event event_idx + event_info combination is supported by the counter. To check-and-set the event_idx + event_info combination for a counter, we will use the SBI_PMU_COUNTER_SET_EVENT call.
   The counter allocation and SBI_PMU_COUNTER_SET_EVENT call can be further optimized by looking at CSR details. For example:
   A) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should prefer a counter mapping to the CYCLE CSR and skip doing the SBI_PMU_COUNTER_SET_EVENT call.
   B) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should prefer a counter mapping to the INSTRET CSR and skip doing the SBI_PMU_COUNTER_SET_EVENT call.
   C) For event_idx.type == 0xf, only prefer counters mapping to the 0xfff CSR (i.e. SOFTWARE counters).
4. del() callback
   The del() callback of the Linux RISC-V PMU driver will release or free the counter.
5. start() callback
   The start() callback of the Linux RISC-V PMU driver will start the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
   The stop() callback of the Linux RISC-V PMU driver will stop the counter using the SBI_PMU_COUNTER_STOP call.
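The event_init() translation and the add() counter search described in these notes can be sketched in plain C as follows. This is not kernel code. Every sk_* name is hypothetical, the perf constants are local stand-ins for the values in <linux/perf_event.h>, the SET_EVENT call is represented by a function pointer, and taking the SOFTWARE event code from the low bits of config is an assumption, since the proposal does not say how it is derived. Keeping HARDWARE events off SOFTWARE counters is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for constants from <linux/perf_event.h>. */
enum { SK_PERF_TYPE_HARDWARE = 0, SK_PERF_TYPE_HW_CACHE = 3, SK_PERF_TYPE_RAW = 4 };

#define SK_CSR_CYCLE    0xc00u /* cycle CSR number */
#define SK_CSR_INSTRET  0xc02u /* instret CSR number */
#define SK_CSR_SOFTWARE 0xfffu /* marks a SOFTWARE counter in the proposal */

struct sk_event   { uint32_t event_idx; uint64_t event_info; };
struct sk_counter { uint16_t csr_number; bool in_use; };

/* Hypothetical wrapper around SBI_PMU_COUNTER_SET_EVENT; returns 0 on success. */
typedef int (*sk_set_event_fn)(unsigned int counter_idx, uint32_t event_idx,
                               uint64_t event_info);

/* event_init(): translate perf type/config; false means "not supported". */
static bool sk_event_init(uint32_t type, uint64_t config, struct sk_event *ev)
{
    /* perf packs cache_id, op_id and result_id into config bytes 0, 1, 2. */
    uint32_t cache_id = (uint32_t)(config & 0xff);
    uint32_t op_id = (uint32_t)((config >> 8) & 0xff);
    uint32_t result_id = (uint32_t)((config >> 16) & 0xff);
    uint64_t sw_code = config & ~(UINT64_C(1) << 63);

    ev->event_info = 0;
    switch (type) {
    case SK_PERF_TYPE_HARDWARE:
        if (config > 9)
            return false;
        ev->event_idx = (uint32_t)config;            /* type 0x0 */
        return true;
    case SK_PERF_TYPE_HW_CACHE:
        if (cache_id > 6 || op_id > 2 || result_id > 1)
            return false;
        ev->event_idx = (0x1u << 16) | (cache_id << 3) | (op_id << 1) | result_id;
        return true;
    case SK_PERF_TYPE_RAW:
        if (!(config >> 63)) {                       /* HARDWARE RAW */
            ev->event_idx = 0x2u << 16;
            ev->event_info = config;                 /* equals config[62:0] */
            return true;
        }
        if (sw_code > 11)                            /* assumed: code in low bits */
            return false;
        ev->event_idx = (0xfu << 16) | (uint32_t)sw_code;
        return true;
    default:
        return false;
    }
}

/* add(): returns the chosen counter_idx, or -1 if no counter accepts the event. */
static int sk_add(struct sk_counter *ctrs, unsigned int n,
                  const struct sk_event *ev, sk_set_event_fn set_event)
{
    uint32_t type = ev->event_idx >> 16;
    uint32_t code = ev->event_idx & 0xffffu;
    unsigned int i;

    /* CPU cycles / instructions: take the fixed counter, no SBI call needed. */
    if (type == 0x0 && code <= 1) {
        uint32_t want = (code == 0) ? SK_CSR_CYCLE : SK_CSR_INSTRET;
        for (i = 0; i < n; i++) {
            if (!ctrs[i].in_use && ctrs[i].csr_number == want) {
                ctrs[i].in_use = true;
                return (int)i;
            }
        }
    }
    /* Otherwise probe free counters of the matching kind with SET_EVENT. */
    for (i = 0; i < n; i++) {
        bool sw_counter = (ctrs[i].csr_number == SK_CSR_SOFTWARE);
        if (ctrs[i].in_use || sw_counter != (type == 0xf))
            continue;
        if (set_event(i, ev->event_idx, ev->event_info) == 0) {
            ctrs[i].in_use = true;
            return (int)i;
        }
    }
    return -1;
}

/* Mock firmware for experiments: only counters 3 and 4 accept SET_EVENT. */
static int sk_mock_set_event(unsigned int idx, uint32_t event_idx, uint64_t event_info)
{
    (void)event_idx;
    (void)event_info;
    return (idx == 3 || idx == 4) ? 0 : -1;
}
```

In this sketch a failed SET_EVENT simply moves the search on to the next free counter, so the firmware stays the single authority on which counter supports which event. The mock is there only so the search can be exercised without real firmware.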
Regards, Anup
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
> 3. SBI_PMU_COUNTER_SET_EVENT
>    This call takes three parameters:
>    1) counter_idx
>    2) event_idx
>    3) event_info
>    It will select an event to be monitored by the given counter. If this SBI call is not used to select an event for a counter, then the counter will monitor the default event selected for it at boot-time.
>    This SBI call will fail for counters which are not present. It will also fail if the specified event_idx + event_info combination is not supported by the given counter.

It also seems to fail if the specified event is not supported by the given counter, right? Then the Linux driver could try to allocate the next free counter when this SBI call returns failure.

Apart from the question above, this version of the proposal looks great to me.

Thanks,
Zong

Anup Patel
-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 14 July 2020 09:02
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
> > 3. SBI_PMU_COUNTER_SET_EVENT
> >    This call takes three parameters:
> >    1) counter_idx
> >    2) event_idx
> >    3) event_info
> >    It will select an event to be monitored by the given counter. If this SBI call is not used to select an event for a counter, then the counter will monitor the default event selected for it at boot-time.
> >    This SBI call will fail for counters which are not present. It will also fail if the specified event_idx + event_info combination is not supported by the given counter.
>
> It also seems to fail if the specified event is not supported by the given counter, right? Then the Linux driver could try to allocate the next free counter when this SBI call returns failure.

Yes, this call will fail if the event_idx + event_info combination is not supported by the given counter_idx. It is expected that the Linux driver will try another free counter if the SBI_PMU_COUNTER_SET_EVENT call fails. I have suggested a few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by looking at the CSR number assigned to a counter.

> Apart from the question above, this version of the proposal looks great to me.

Cool 😊

Regards,
Anup

> Thanks,
> Zong
4. SBI_PMU_COUNTER_SET_PHYS_ADDR This call takes two parameters: 1) counter_idx 2) 8byte aligned physical address It will set the physical address of memory location where the SBI implementation will write the 64bit SOFTWARE counter. This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number > 0xfff).
5. SBI_PMU_COUNTER_START This call takes two parameters: 1) counter_idx 2) initial_value It will inform SBI implementation to start/enable specified counter with specified initial value. This SBI call will fail for counters which are not present.
6. SBI_PMU_COUNTER_STOP This call takes one parameter: 1) counter_idx It will inform SBI implementation to stop/disable specified counters on the calling HART. This SBI call will fail for counters which are not present.
The M-mode runtime firmware (OpenSBI) Development Notes:
1. The M-mode runtime firmware will have to translate SBI PMU event_idx and event_into into platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The M-mode runtime firmware (OpenSBI) will need to know following platform dependent information: A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. HPMCOUNTER) B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional for platform. By default, OpenSBI will write a value <xyz> to MHPMEVENT CSR where lower 20bits of <xyz> are event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info C) Additional platform-specific progamming required for selecting event_idx + event_info combination. This is also optional for platform.
3. All platform dependent information mentioned above, can be obtained by M-mode runtime firmware (OpenSBI) from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above but 2.C will always require platform specific code.
Linux RISC-V PMU Driver Development Notes:
1. Driver probe The Linux RISC-V driver can be platform driver with "riscv,sbi-pmu" as DT compatible string and optional "interrupts" DT property. The "interrupts" DT property if available should specify an edge-triggered overflow interrupt for each HART. When "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in "interrupts" DT property. The platform driver probe will: A) Need to ensure that underlying SBI implementation provides SBI PMU extension using sbi_probe_extension() API of arch/riscv. B) Detect number of counters using SBI_PMU_NUM_COUNTERS call C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location of counter. The driver skip this in driver probe and instead do this lazily in add() callback mentioned below.
2. event_init() callback The event_init() callback will primarily translate user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in following way: A) perf_event_attr.type == PERF_TYPE_HARDWARE event_idx.type = 0x0 event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config event_info = 0 B) perf_event_attr.type == PERF_TYPE_HW_CACHE event_idx.type = 0x1 event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config event_idx.code.op_id = Value from enum sbi_pmu_hw_op_id based on perf_event_attr.config event_idx.code.result_id = Value from enum sbi_pmu_hw_result_id based on perf_event_attr.config event_info = 0 C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0 event_idx.type = 0x2 event_idx.code = 0x0 event_info = perf_event_attr.config[62:0] D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1 event_idx.type = 0xf event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config event_info = 0 (Note: event_init() will fail if it is not able to figure out event_idx and event_info value corresponding to perf_event_attr) (Note: event_init() will not assign counter to perf_event because it will be done by event_add())
3. add() callback The add() callback of Linux RISC-V PMU driver will find a free counter on current CPU/HART such that the perf_event event_idx + event_info combination is supported by the counter. To check-and-set event_idx + event_info combination for a counter, we will use the SBI_PMU_COUNTER_SET_EVENT call. The counter allocation and SBI_PMU_COUNTER_SET_EVENT call can be futher optimized by looking at CSR details. For example: A) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should prefer counter mapping to CYCLE CSR and skip doing SBI_PMU_COUNTER_SET_EVENT call. B) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should prefer counter mapping to INSTRET CSR and skip doing SBI_PMU_COUNTER_SET_EVENT call. C) For event_idx == 0xf, only perfer counters mapping to 0xfff CSR (i.e. SOFTWARE counters).
4. del() callback
   The del() callback of Linux RISC-V PMU driver will release
   or free the counter.
5. start() callback
   The start() callback of Linux RISC-V PMU driver will start
   the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
   The stop() callback of Linux RISC-V PMU driver will stop
   the counter using the SBI_PMU_COUNTER_STOP call.
Regards, Anup
Should there also be a way to atomically specify start/stop for a set of counters, or is the latency of N SBI start/stop calls short enough that starting or stopping N counters will not take that long? For a lot of cores today, N is very small, like 2 for some cores, but as RISC-V cores continue to grow in capability, N could easily become 4 to 8 for the core, another set in the L2, another set in the L3, etc.
Brian
> -----Original Message-----
> From: Zong Li <zong.li@...>
> Sent: 14 July 2020 09:02
> To: Anup Patel <Anup.Patel@...>
> Cc: tech-unixplatformspec@...; Atish Patra
> <Atish.Patra@...>; andrew@...; gfavor@...
> Subject: Re: Proposal v3: SBI PMU Extension
>
> On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
> >
> > Hi All,
> >
> > We don't have a dedicated RISC-V PMU extension but we do have
> HARDWARE
> > performance counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER
> > CSRs. A RISC-V implementation can support monitoring various HARDWARE
> > events using limited number of HPMCOUNTER CSRs.
> >
> > In addition to HARDWARE performance counters, a SBI implementation
> > (e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
> > events such as number of RFENCEs, number of IPIs, number of misaligned
> > load/store instructions, number of illegal instructions, etc.
> >
> > We propose SBI PMU extension, which will help S-mode (or VS-mode)
> > software to discover and configure HARDWARE/SOFTWARE counters. The
> SBI
> > PMU extension will only manage per-HART (or per-CPU)
> HARDWARE/SOFTWARE
> > counters which include CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and
> > SOFTWARE counters provided by SBI implementation.
> >
> > Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or
> > Xvisor) will provide a standardized view of HARDWARE/SOFTWARE
> counters
> > and events to S-mode (or VS-mode) software.
> >
> > To define SBI PMU extension, we first define counter_idx which is a
> > logical number assigned to a counter and event_idx which is an encoded
> > number representing the HARDWARE/SOFTWARE event to be monitored.
> A
> > HARDWARE/SOFTWARE event can also have additional
> configuration/details
> > referred to as event_info.
> >
> > The SBI PMU event_idx is a 20bits wide number encoded as follows:
> > event_idx[19:16] = type
> > event_idx[15:0] = code
> >
> > If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE
> > event, the event_info is not required whereas the event_idx.code can
> > be one of the following values:
> > enum sbi_pmu_hw_id {
> > SBI_PMU_HW_CPU_CYCLES = 0,
> > SBI_PMU_HW_INSTRUCTIONS = 1,
> > SBI_PMU_HW_CACHE_REFERENCES = 2,
> > SBI_PMU_HW_CACHE_MISSES = 3,
> > SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
> > SBI_PMU_HW_BRANCH_MISSES = 5,
> > SBI_PMU_HW_BUS_CYCLES = 6,
> > SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
> > SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
> > SBI_PMU_HW_REF_CPU_CYCLES = 9,
> > SBI_PMU_HW_MAX, /* non-ABI */
> > };
> > (NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
> >
> > If event_idx.type == 0x1 then it is HARDWARE CACHE event. For
> HARDWARE
> > CACHE event, the event_info is not required whereas the event_idx.code
> > is encoded as follows:
> > event_idx.code[15:3] = cache_id
> > event_idx.code[2:1] = op_id
> > event_idx.code[0:0] = result_id
> > enum sbi_pmu_hw_cache_id {
> > SBI_PMU_HW_CACHE_L1D = 0,
> > SBI_PMU_HW_CACHE_L1I = 1,
> > SBI_PMU_HW_CACHE_LL = 2,
> > SBI_PMU_HW_CACHE_DTLB = 3,
> > SBI_PMU_HW_CACHE_ITLB = 4,
> > SBI_PMU_HW_CACHE_BPU = 5,
> > SBI_PMU_HW_CACHE_NODE = 6,
> > SBI_PMU_HW_CACHE_MAX, /* non-ABI */
> > };
> > enum sbi_pmu_hw_cache_op_id {
> > SBI_PMU_HW_CACHE_OP_READ = 0,
> > SBI_PMU_HW_CACHE_OP_WRITE = 1,
> > SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
> > SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
> > };
> > enum sbi_pmu_hw_cache_op_result_id {
> > SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
> > SBI_PMU_HW_CACHE_RESULT_MISS = 1,
> > SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
> > };
> > (NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
> >
> > If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
> > RAW event, the event_idx.code should be zero and the event_info
> > parameter passed to SBI_PMU_COUNTER_SET_EVENT call (described
> below)
> > will have the RAW event value to be programmed in MHPMEVENT CSR (i.e.
> > the SBI implementation will not derive MHPMEVENT CSR value from
> > event_idx + event_info).
> >
> > If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
> > event, the event_info is not required whereas the event_idx.code can
> > be one of the following:
> > enum sbi_pmu_sw_id {
> > SBI_PMU_SW_MISALIGNED_LOAD = 0,
> > SBI_PMU_SW_MISALIGNED_STORE = 1,
> > SBI_PMU_SW_ILLEGAL_INSN = 2,
> > SBI_PMU_SW_LOCAL_SET_TIMER = 3,
> > SBI_PMU_SW_LOCAL_IPI = 4,
> > SBI_PMU_SW_LOCAL_FENCE_I = 5,
> > SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
> > SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
> > SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
> > SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
> > SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
> > SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
> > SBI_PMU_SW_MAX, /* non-ABI */
> > };
> >
> > In future, more events can be defined without breaking
> > compatibility of SBI calls.
> >
> > Using definition of counter_idx and event_idx, we can potentially have
> > the following SBI calls:
> >
> > 1. SBI_PMU_NUM_COUNTERS
> > This call will return the number of COUNTERs
> >
> > 2. SBI_PMU_COUNTER_GET_CSR
> > This call takes one parameter:
> > 1) counter_idx
> > It will provide the CSR_Number and CSR_Width of underlying counter.
> > The value returned by SBI call is encoded as follows:
> > return_value[11:0] = CSR_Number
> > return_value[19:12] = CSR_Width (Number of bits implemented in
> HW)
> > If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
> > HARDWARE counter. This SBI call will fail for counters which are not
> > present.
> >
> > 3. SBI_PMU_COUNTER_SET_EVENT
> > This call takes three parameters:
> > 1) counter_idx
> > 2) event_idx
> > 3) event_info
> > It will select an event to be monitored by given counter. If this
> > SBI call is not used for a counter to select an event then the
> > counter will monitor the default event selected for it at boot-time.
> > This SBI call will fail for counters which are not present. It will
> > also fail if specified event_idx + event_info combination is not
> > supported by given counter.
>
> It also seems to fail if the specified event is not supported by the given
> counter, right? Then Linux driver could try to allocate the next free counter
> when returning failure from this SBI calls.
Yes, this call will fail if event_idx + event_info combination is not supported
by given counter_idx. It is expected that Linux driver will try another
free counter if SBI_PMU_COUNTER_SET_EVENT call fails. I have suggested
few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by
looking at CSR number assigned to counter.
>
> Apart from this question above, this version of the proposal is great to me.
Cool 😊
Regards,
Anup
>
> Thanks,
> Zong
>
> >
> > 4. SBI_PMU_COUNTER_SET_PHYS_ADDR
> > This call takes two parameters:
> > 1) counter_idx
> > 2) 8byte aligned physical address
> > It will set the physical address of memory location where the SBI
> > implementation will write the 64bit SOFTWARE counter. This SBI call
> > is only for counters not mapped to any CSR (i.e. only for counters
> > with CSR_Number == 0xfff).
> >
> > 5. SBI_PMU_COUNTER_START
> > This call takes two parameters:
> > 1) counter_idx
> > 2) initial_value
> > It will inform SBI implementation to start/enable specified counter
> > with specified initial value. This SBI call will fail for counters
> > which are not present.
> >
> > 6. SBI_PMU_COUNTER_STOP
> > This call takes one parameter:
> > 1) counter_idx
> > It will inform SBI implementation to stop/disable specified counters
> > on the calling HART. This SBI call will fail for counters which are
> > not present.
> >
> > The M-mode runtime firmware (OpenSBI) Development Notes:
> >
> > 1. The M-mode runtime firmware will have to translate SBI PMU
> > event_idx and event_info into platform dependent MHPMEVENT CSR
> > value before starting/enabling a HARDWARE counter.
> >
> > 2. The M-mode runtime firmware (OpenSBI) will need to know following
> > platform dependent information:
> > A) Possible event_idx values allowed (or supported) by a HARDWARE
> > counter (i.e. HPMCOUNTER)
> > B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT
> CSR
> > value. This is optional for platform. By default, OpenSBI will
> > write a value <xyz> to MHPMEVENT CSR where lower 20bits of <xyz>
> > are event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20
> > bits of event_info
> > C) Additional platform-specific programming required for selecting
> > event_idx + event_info combination. This is also optional for
> > platform.
> >
> > 3. All platform dependent information mentioned above, can be obtained
> > by M-mode runtime firmware (OpenSBI) from platform specific code.
> > The DT/ACPI can also be used to describe 2.A and 2.B mentioned above
> > but 2.C will always require platform specific code.
> >
> > Linux RISC-V PMU Driver Development Notes:
> >
> > 1. Driver probe
> > The Linux RISC-V driver can be platform driver with "riscv,sbi-pmu"
> > as DT compatible string and optional "interrupts" DT property. The
> > "interrupts" DT property if available should specify an edge-triggered
> > overflow interrupt for each HART. When "interrupts" DT property is
> > present, we might also need another DT property for mapping HARTID
> > to entries in "interrupts" DT property. The platform driver probe
> > will:
> > A) Need to ensure that underlying SBI implementation provides
> > SBI PMU extension using sbi_probe_extension() API of arch/riscv.
> > B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
> > C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
> > call. If the counter is a SOFTWARE counter then use the
> > SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
> > of counter. The driver can skip this in driver probe and instead
> > do this lazily in the add() callback mentioned below.
> >
> > 2. event_init() callback
> > The event_init() callback will primarily translate user-space
> > perf_event_attr to SBI PMU event_idx and event_info. It can do
> > this in following way:
> > A) perf_event_attr.type == PERF_TYPE_HARDWARE
> > event_idx.type = 0x0
> > event_idx.code = Value from enum sbi_pmu_hw_id based on
> > perf_event_attr.config
> > event_info = 0
> > B) perf_event_attr.type == PERF_TYPE_HW_CACHE
> > event_idx.type = 0x1
> > event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
> > based on perf_event_attr.config
> > event_idx.code.op_id = Value from enum sbi_pmu_hw_op_id
> > based on perf_event_attr.config
> > event_idx.code.result_id = Value from enum sbi_pmu_hw_result_id
> > based on perf_event_attr.config
> > event_info = 0
> > C) perf_event_attr.type == PERF_TYPE_RAW and
> > perf_event_attr.config[63:63] == 0
> > event_idx.type = 0x2
> > event_idx.code = 0x0
> > event_info = perf_event_attr.config[62:0]
> > D) perf_event_attr.type == PERF_TYPE_RAW and
> > perf_event_attr.config[63:63] == 1
> > event_idx.type = 0xf
> > event_idx.code = Value from enum sbi_pmu_sw_id based on
> > perf_event_attr.config
> > event_info = 0
> > (Note: event_init() will fail if it is not able to figure out
> > event_idx and event_info value corresponding to perf_event_attr)
> > (Note: event_init() will not assign counter to perf_event because
> > it will be done by the add() callback)
> >
> > 3. add() callback
> > The add() callback of Linux RISC-V PMU driver will find a
> > free counter on current CPU/HART such that the perf_event
> > event_idx + event_info combination is supported by the counter.
> > To check-and-set event_idx + event_info combination for a
> > counter, we will use the SBI_PMU_COUNTER_SET_EVENT call.
> > The counter allocation and SBI_PMU_COUNTER_SET_EVENT call
> > can be further optimized by looking at CSR details.
> > For example:
> > A) For event_idx.type == 0 and
> > event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should
> > prefer counter mapping to CYCLE CSR and skip doing
> > SBI_PMU_COUNTER_SET_EVENT call.
> > B) For event_idx.type == 0 and
> > event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should
> > prefer counter mapping to INSTRET CSR and skip doing
> > SBI_PMU_COUNTER_SET_EVENT call.
> > C) For event_idx.type == 0xf, prefer only counters mapped
> > to CSR number 0xfff (i.e. SOFTWARE counters).
> >
> > 4. del() callback
> > The del() callback of Linux RISC-V PMU driver will release
> > or free the counter.
> >
> > 5. start() callback
> > The start() callback of Linux RISC-V PMU driver will start
> > the counter using the SBI_PMU_COUNTER_START call.
> >
> > 6. stop() callback
> > The stop() callback of Linux RISC-V PMU driver will stop
> > the counter using the SBI_PMU_COUNTER_STOP call.
> >
> > Regards,
> > Anup

Anup Patel
One SBI call to start/stop N counters will certainly be faster than N SBI calls.
We did not include SBI calls to start/stop a set of counters because Linux perf drivers only require a mechanism to start/stop one counter.
Regards,
Anup
From: Brian Grayson <brian.grayson@...>
Sent: 14 July 2020 18:58
To: Anup Patel <Anup.Patel@...>
Cc: Zong Li <zong.li@...>; tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU Extension
Should there also be a way to atomically specify start/stop for a set of counters, or is the latency of N SBI start/stop calls short enough that starting or stopping N counters will not take that long? For a lot of cores today, N is very small, like 2 for some cores, but as RISC-V cores continue to grow in capability, N could easily become 4 to 8 for the core, another set in the L2, another set in the L3, etc.
On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:
-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 14 July 2020 09:02
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
Hi All,
We don't have a dedicated RISC-V PMU extension but we do have HARDWARE performance counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V implementation can support monitoring various HARDWARE events using limited number of HPMCOUNTER CSRs.
In addition to HARDWARE performance counters, a SBI implementation (e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for events such as number of RFENCEs, number of IPIs, number of misaligned load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode) software to discover and configure HARDWARE/SOFTWARE counters. The SBI PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE counters which include CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and SOFTWARE counters provided by SBI implementation.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters and events to S-mode (or VS-mode) software.
To define SBI PMU extension, we first define counter_idx which is a logical number assigned to a counter and event_idx which is an encoded number representing the HARDWARE/SOFTWARE event to be monitored. A HARDWARE/SOFTWARE event can also have additional configuration/details referred to as event_info.
The SBI PMU event_idx is a 20bits wide number encoded as follows:
    event_idx[19:16] = type
    event_idx[15:0] = code
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event, the event_info is not required whereas the event_idx.code can be one of the following values:
    enum sbi_pmu_hw_id {
        SBI_PMU_HW_CPU_CYCLES = 0,
        SBI_PMU_HW_INSTRUCTIONS = 1,
        SBI_PMU_HW_CACHE_REFERENCES = 2,
        SBI_PMU_HW_CACHE_MISSES = 3,
        SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
        SBI_PMU_HW_BRANCH_MISSES = 5,
        SBI_PMU_HW_BUS_CYCLES = 6,
        SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
        SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
        SBI_PMU_HW_REF_CPU_CYCLES = 9,
        SBI_PMU_HW_MAX, /* non-ABI */
    };
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE CACHE event, the event_info is not required whereas the event_idx.code is encoded as follows:
    event_idx.code[15:3] = cache_id
    event_idx.code[2:1] = op_id
    event_idx.code[0:0] = result_id
    enum sbi_pmu_hw_cache_id {
        SBI_PMU_HW_CACHE_L1D = 0,
        SBI_PMU_HW_CACHE_L1I = 1,
        SBI_PMU_HW_CACHE_LL = 2,
        SBI_PMU_HW_CACHE_DTLB = 3,
        SBI_PMU_HW_CACHE_ITLB = 4,
        SBI_PMU_HW_CACHE_BPU = 5,
        SBI_PMU_HW_CACHE_NODE = 6,
        SBI_PMU_HW_CACHE_MAX, /* non-ABI */
    };
    enum sbi_pmu_hw_cache_op_id {
        SBI_PMU_HW_CACHE_OP_READ = 0,
        SBI_PMU_HW_CACHE_OP_WRITE = 1,
        SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
        SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
    };
    enum sbi_pmu_hw_cache_op_result_id {
        SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
        SBI_PMU_HW_CACHE_RESULT_MISS = 1,
        SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
    };
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE RAW event, the event_idx.code should be zero and the event_info parameter passed to SBI_PMU_COUNTER_SET_EVENT call (described below) will have the RAW event value to be programmed in MHPMEVENT CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value from event_idx + event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE event, the event_info is not required whereas the event_idx.code can be one of the following:
    enum sbi_pmu_sw_id {
        SBI_PMU_SW_MISALIGNED_LOAD = 0,
        SBI_PMU_SW_MISALIGNED_STORE = 1,
        SBI_PMU_SW_ILLEGAL_INSN = 2,
        SBI_PMU_SW_LOCAL_SET_TIMER = 3,
        SBI_PMU_SW_LOCAL_IPI = 4,
        SBI_PMU_SW_LOCAL_FENCE_I = 5,
        SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
        SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
        SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
        SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
        SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
        SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
        SBI_PMU_SW_MAX, /* non-ABI */
    };
In future, more events can be defined without breaking compatibility of SBI calls.
Using definition of counter_idx and event_idx, we can potentially have the following SBI calls:
1. SBI_PMU_NUM_COUNTERS
   This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
   1) counter_idx
   It will provide the CSR_Number and CSR_Width of underlying counter.
   The value returned by SBI call is encoded as follows:
     return_value[11:0] = CSR_Number
     return_value[19:12] = CSR_Width (Number of bits implemented in HW)
   If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is HARDWARE counter. This SBI call will fail for counters which are not present.
3. SBI_PMU_COUNTER_SET_EVENT
   This call takes three parameters:
   1) counter_idx
   2) event_idx
   3) event_info
   It will select an event to be monitored by given counter. If this SBI call is not used for a counter to select an event then the counter will monitor the default event selected for it at boot-time. This SBI call will fail for counters which are not present. It will also fail if specified event_idx + event_info combination is not supported by given counter.

It also seems to fail if the specified event is not supported by the given counter, right? Then Linux driver could try to allocate the next free counter when returning failure from this SBI calls.

Yes, this call will fail if event_idx + event_info combination is not supported by given counter_idx. It is expected that Linux driver will try another free counter if SBI_PMU_COUNTER_SET_EVENT call fails. I have suggested few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by looking at CSR number assigned to counter.
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter, then s-mode software can conveniently find which next one is a good counter for the event. But there is a constraint that the maximum number of counters need to assume to be less than XLEN. Do you think it is feasible? Apart from this question above, this version of the proposal is great to me. Cool
Regards, Anup
Thanks, Zong
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8-byte aligned physical address
It will set the physical address of the memory location where the SBI implementation will write the 64-bit SOFTWARE counter. This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform the SBI implementation to start/enable the specified counter with the specified initial value. This SBI call will fail for counters which are not present.
6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform the SBI implementation to stop/disable the specified counter on the calling HART. This SBI call will fail for counters which are not present.
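To make the SBI_PMU_COUNTER_GET_CSR return encoding above concrete, here is a small sketch of how a caller might decode it. The struct and function names are hypothetical; only the bit layout and the 0xfff SOFTWARE marker come from the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value that marks a SOFTWARE counter in the proposal. */
#define SKETCH_CSR_SOFTWARE 0xfffu

/* Hypothetical container for the decoded return value. */
struct sketch_counter_info {
    uint16_t csr_number; /* return_value[11:0]  */
    uint8_t  csr_width;  /* return_value[19:12] */
};

/* Split the raw return value into CSR number and width. */
static inline struct sketch_counter_info sketch_decode_counter_csr(unsigned long ret)
{
    struct sketch_counter_info info = {
        .csr_number = (uint16_t)(ret & 0xfff),
        .csr_width  = (uint8_t)((ret >> 12) & 0xff),
    };
    return info;
}

/* A counter is a SOFTWARE counter when it maps to no real CSR. */
static inline bool sketch_counter_is_software(struct sketch_counter_info info)
{
    return info.csr_number == SKETCH_CSR_SOFTWARE;
}
```

A driver could use the software check to decide whether SBI_PMU_COUNTER_SET_PHYS_ADDR is needed for a given counter, since a SOFTWARE counter cannot be read through a CSR.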
The M-mode runtime firmware (OpenSBI) Development Notes:
1. The M-mode runtime firmware will have to translate SBI PMU event_idx and event_info into a platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The M-mode runtime firmware (OpenSBI) will need to know the following platform dependent information:
A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. HPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR value. This is optional for platform. By default, OpenSBI will write a value <xyz> to MHPMEVENT CSR where the lower 20 bits of <xyz> are event_idx and the upper XLEN-20 bits of <xyz> are the lower XLEN-20 bits of event_info
C) Additional platform-specific programming required for selecting event_idx + event_info combination. This is also optional for platform.
3. All platform dependent information mentioned above can be obtained by M-mode runtime firmware (OpenSBI) from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above, but 2.C will always require platform specific code.
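The default mapping in 2.B above can be sketched as follows. This is only an illustration that assumes XLEN == 64; the function and macro names are hypothetical, and a real platform may override the mapping entirely.

```c
#include <stdint.h>

#define SKETCH_EVENT_IDX_BITS 20
#define SKETCH_EVENT_IDX_MASK ((UINT64_C(1) << SKETCH_EVENT_IDX_BITS) - 1)

/*
 * Default MHPMEVENT value, assuming XLEN == 64:
 *   bits [19:0]  = event_idx
 *   bits [63:20] = lower 44 (XLEN-20) bits of event_info
 * The left shift drops the upper 20 bits of event_info.
 */
static inline uint64_t sketch_default_mhpmevent(uint32_t event_idx, uint64_t event_info)
{
    return (event_info << SKETCH_EVENT_IDX_BITS) |
           ((uint64_t)event_idx & SKETCH_EVENT_IDX_MASK);
}
```

For instance, event_idx 0x10003 with event_info 0x5 would give 0x510003 under this default.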
Linux RISC-V PMU Driver Development Notes:
1. Driver probe
The Linux RISC-V driver can be a platform driver with "riscv,sbi-pmu" as the DT compatible string and an optional "interrupts" DT property. The "interrupts" DT property, if available, should specify an edge-triggered overflow interrupt for each HART. When the "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in the "interrupts" DT property.
The platform driver probe will:
A) Need to ensure that the underlying SBI implementation provides the SBI PMU extension using the sbi_probe_extension() API of arch/riscv.
B) Detect the number of counters using the SBI_PMU_NUM_COUNTERS call.
C) Get CSR details of each counter using the SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set the memory location of the counter. The driver can skip this in driver probe and instead do it lazily in the add() callback mentioned below.
2. event_init() callback
The event_init() callback will primarily translate user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in the following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
   event_idx.type = 0x0
   event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config
   event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
   event_idx.type = 0x1
   event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config
   event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id based on perf_event_attr.config
   event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config
   event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0
   event_idx.type = 0x2
   event_idx.code = 0x0
   event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1
   event_idx.type = 0xf
   event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config
   event_info = 0
(Note: event_init() will fail if it is not able to figure out the event_idx and event_info values corresponding to perf_event_attr)
(Note: event_init() will not assign a counter to the perf_event because that will be done by add())
3. add() callback
The add() callback of the Linux RISC-V PMU driver will find a free counter on the current CPU/HART such that the perf_event event_idx + event_info combination is supported by the counter. To check-and-set the event_idx + event_info combination for a counter, we will use the SBI_PMU_COUNTER_SET_EVENT call. The counter allocation and SBI_PMU_COUNTER_SET_EVENT call can be further optimized by looking at CSR details. For example:
A) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should prefer the counter mapping to the CYCLE CSR and skip doing the SBI_PMU_COUNTER_SET_EVENT call.
B) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should prefer the counter mapping to the INSTRET CSR and skip doing the SBI_PMU_COUNTER_SET_EVENT call.
C) For event_idx.type == 0xf, only prefer counters mapping to the 0xfff CSR number (i.e. SOFTWARE counters).
4. del() callback
The del() callback of the Linux RISC-V PMU driver will release or free the counter.
5. start() callback
The start() callback of the Linux RISC-V PMU driver will start the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback
The stop() callback of the Linux RISC-V PMU driver will stop the counter using the SBI_PMU_COUNTER_STOP call.
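The event_init() translation in item 2 above could look roughly like the sketch below. It is not the driver's actual code: the SKETCH_PERF_TYPE_* constants are stand-ins for PERF_TYPE_* from linux/perf_event.h so the example builds without kernel headers, the struct and function names are hypothetical, and the rule that the SOFTWARE code is simply config with bit 63 cleared is an assumption, since the proposal does not spell that mapping out. The cache case relies on perf's config layout (cache id in bits [7:0], op in [15:8], result in [23:16]).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PERF_TYPE_* values from <linux/perf_event.h>. */
enum {
    SKETCH_PERF_TYPE_HARDWARE = 0,
    SKETCH_PERF_TYPE_HW_CACHE = 3,
    SKETCH_PERF_TYPE_RAW      = 4,
};

/* Hypothetical result of the translation. */
struct sketch_sbi_event {
    bool     ok;         /* false when no SBI event matches */
    uint32_t event_idx;  /* type in [19:16], code in [15:0] */
    uint64_t event_info;
};

static struct sketch_sbi_event sketch_event_init(uint32_t type, uint64_t config)
{
    struct sketch_sbi_event ev = { .ok = false, .event_idx = 0, .event_info = 0 };

    switch (type) {
    case SKETCH_PERF_TYPE_HARDWARE:
        if (config <= 9) { /* up to SBI_PMU_HW_REF_CPU_CYCLES */
            ev.event_idx = (0x0u << 16) | (uint32_t)config;
            ev.ok = true;
        }
        break;
    case SKETCH_PERF_TYPE_HW_CACHE: {
        uint32_t cache_id  = (uint32_t)(config & 0xff);
        uint32_t op_id     = (uint32_t)((config >> 8) & 0xff);
        uint32_t result_id = (uint32_t)((config >> 16) & 0xff);
        if (cache_id <= 6 && op_id <= 2 && result_id <= 1) {
            /* code[15:3] = cache_id, code[2:1] = op_id, code[0] = result_id */
            ev.event_idx = (0x1u << 16) | (cache_id << 3) | (op_id << 1) | result_id;
            ev.ok = true;
        }
        break;
    }
    case SKETCH_PERF_TYPE_RAW:
        if (config >> 63) {
            /* Assumption: the SOFTWARE code is config with bit 63 cleared. */
            uint64_t code = config & ~(UINT64_C(1) << 63);
            if (code <= 11) { /* up to SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID */
                ev.event_idx = (0xfu << 16) | (uint32_t)code;
                ev.ok = true;
            }
        } else {
            ev.event_idx  = 0x2u << 16; /* HARDWARE RAW, code must be 0 */
            ev.event_info = config;     /* bit 63 is 0, so this is config[62:0] */
            ev.ok = true;
        }
        break;
    default:
        break; /* unsupported perf type: event_init() fails */
    }
    return ev;
}
```

Returning a struct with an ok flag keeps the sketch free of errno conventions; a real callback would presumably return a negative error code instead.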
Regards, Anup

Anup Patel
-----Original Message----- From: Zong Li <zong.li@...> Sent: 15 July 2020 07:40 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@... Subject: Re: Proposal v3: SBI PMU Extension
On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 14 July 2020 09:02 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter? Then S-mode software can conveniently find which counter is a good next choice for the event. But there is a constraint: the maximum number of counters needs to be assumed to be less than XLEN. Do you think it is feasible?
Yes, this is feasible and can further reduce SBI calls, but if we go this route then the SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
How about this ??
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter from a set of counters which can monitor the specified event. The counter_idx_base and counter_idx_mask parameters represent the set of counters whereas the event_idx and event_info represent the event to monitor. Upon success the SBI call will return the counter_idx of the counter which has been configured to monitor the specified event. This SBI call will fail if it is unable to find a counter which can monitor the specified event. It will also fail if the set of counters specified via counter_idx_base and counter_idx_mask contains an invalid counter.
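One possible reading of SBI_PMU_COUNTER_CONFIG_MATCHING is sketched below from the firmware side. It assumes that bit i of counter_idx_mask selects counter counter_idx_base + i, which the message above does not state explicitly. The counter count, the support predicate, and all sketch_* names are made up for illustration, and the step that would actually program the counter is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_COUNTERS 8u

/*
 * Hypothetical platform knowledge: counter 0 maps to CYCLE, counter 1 to
 * INSTRET, counters 2..5 are programmable HPMCOUNTERs, and counters 6..7
 * are SOFTWARE counters.
 */
static bool sketch_counter_supports(unsigned long counter_idx, uint32_t event_idx,
                                    uint64_t event_info)
{
    uint32_t type = (event_idx >> 16) & 0xf;

    (void)event_info; /* unused by this toy predicate */
    if (counter_idx == 0)
        return event_idx == 0x00000; /* SBI_PMU_HW_CPU_CYCLES */
    if (counter_idx == 1)
        return event_idx == 0x00001; /* SBI_PMU_HW_INSTRUCTIONS */
    if (counter_idx < 6)
        return type != 0xf;
    return type == 0xf;
}

/* Returns the chosen counter_idx, or -1 when the call would fail. */
static long sketch_counter_config_matching(unsigned long base, unsigned long mask,
                                           uint32_t event_idx, uint64_t event_info)
{
    const unsigned int bits = (unsigned int)(sizeof(unsigned long) * 8);
    unsigned int i;

    /* Fail if the set names a counter that does not exist. */
    for (i = 0; i < bits; i++) {
        if (!(mask & (1UL << i)))
            continue;
        if (base >= SKETCH_NUM_COUNTERS || i >= SKETCH_NUM_COUNTERS - base)
            return -1;
    }

    /* Pick the lowest-numbered counter in the set that supports the event. */
    for (i = 0; i < bits; i++) {
        if ((mask & (1UL << i)) &&
            sketch_counter_supports(base + i, event_idx, event_info))
            return (long)(base + i);
    }
    return -1;
}
```

With this reading, a caller that already holds counter 1 can pass a mask with that bit cleared and let the firmware choose among the remaining counters in a single call.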
Regards, Anup
Yes, this is feasible and can further reduce SBI calls but if we go this
route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
Why not have SBI_PMU_COUNTER_SET_EVENT return a 32-bit hpmcounter bit mask when it fails (that - as Zong suggested - identifies hardware counters that do support the requested event)? Then SBI_PMU_COUNTER_SET_EVENT doesn't need to change. The caller can simply call it again with a different counter_idx that it knows should succeed.
Or have SBI_PMU_COUNTER_SET_EVENT take a boolean argument indicating whether to try and set up just the specified counter, or to examine all counters and try to pick one that supports the requested event? Then no bit mask ever needs to be returned.
Also, in any case, I assume the SBI routine needs to take the 'mcounteren' CSR into account and only set up a counter that mcounteren makes available to lower privilege modes? (And any returned bit mask would also reflect only counters that have their mcounteren bits set to '1'.)
Greg
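The mcounteren point above can be illustrated with a small sketch: any bit mask of candidate hpmcounters would be filtered by the mcounteren bits before being returned. In the privileged architecture, mcounteren bits 0, 1 and 2 correspond to CY, TM and IR, and bits 3..31 to HPM3..HPM31. The function and macro names here are hypothetical, and whether the mask should also cover CYCLE and INSTRET is left open in the thread.

```c
#include <stdint.h>

/* mcounteren bits 3..31 control hpmcounter3..hpmcounter31. */
#define SKETCH_MCOUNTEREN_HPM_MASK UINT32_C(0xfffffff8)

/*
 * Given a mask of hpmcounters that can monitor an event, keep only those
 * that mcounteren exposes to lower privilege modes.
 */
static inline uint32_t sketch_visible_hpm_counters(uint32_t supporting,
                                                   uint32_t mcounteren)
{
    return supporting & mcounteren & SKETCH_MCOUNTEREN_HPM_MASK;
}
```

For example, if hpmcounter3..6 support an event (mask 0x78) and mcounteren is 0x2f, only hpmcounter3 and hpmcounter5 (mask 0x28) would be reported.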
Anup,
What is the plan with regards to scounteren and hcounteren? Is the caller (whether an OS or a hypervisor) supposed to take into account the relevant *counteren CSR's when specifying counter_idx in the call to SBI_PMU_COUNTER_SET_EVENT? And the M-mode SBI_PMU_COUNTER_SET_EVENT routine only worries about mcounteren?
It seems like this is necessary since the M-mode SBI_PMU_COUNTER_SET_EVENT routine won't know which privilege mode was the original requester (and hence which other *counteren CSR's are relevant).
Greg
On Wed, Jul 15, 2020 at 12:32 PM Anup Patel <Anup.Patel@...> wrote:
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter? Then S-mode software can conveniently find which counter is a good next choice for the event. But there is a constraint: the maximum number of counters needs to be assumed to be less than XLEN. Do you think it is feasible?
Yes, this is feasible and can further reduce SBI calls, but if we go this route then the SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
How about this ??
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter from a set of counters which can monitor the specified event. The counter_idx_base and counter_idx_mask parameters represent the set of counters whereas the event_idx and event_info represent the event to monitor. Upon success the SBI call will return the counter_idx of the counter which has been configured to monitor the specified event. This SBI call will fail if it is unable to find a counter which can monitor the specified event. It will also fail if the set of counters specified via counter_idx_base and counter_idx_mask contains an invalid counter.
It looks like the M-mode firmware would be responsible for selecting a suitable counter, taking over the allocation work from S-mode software? Could you elaborate or give an example to show how counter_idx_base and counter_idx_mask represent the set of counters? It seems like the set of counters could be represented by one parameter, with each bit corresponding to one counter_idx.
Apart from this question above, this version of the proposal is great to me.
Cool
Regards, Anup
Thanks, Zong
4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8-byte aligned physical address
It will set the physical address of the memory location where the SBI implementation will write the 64-bit SOFTWARE counter. This SBI call is only for counters not mapped to any CSR (i.e. only for counters with CSR_Number == 0xfff).
5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform the SBI implementation to start/enable the specified counter with the specified initial value. This SBI call will fail for counters which are not present.
6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform the SBI implementation to stop/disable the specified counter on the calling HART. This SBI call will fail for counters which are not present.
The M-mode runtime firmware (OpenSBI) Development Notes:
1. The M-mode runtime firmware will have to translate SBI PMU event_idx and event_into into platform dependent MHPMEVENT CSR value before starting/enabling a HARDWARE counter.
2. The M-mode runtime firmware (OpenSBI) will need to know the following platform dependent information: A) Possible event_idx values allowed (or supported) by a HARDWARE counter (i.e. HPMCOUNTER) B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR value. This is optional for platform. By default, OpenSBI will write a value <xyz> to MHPMEVENT CSR where lower 20 bits of <xyz> are event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info C) Additional platform-specific programming required for selecting event_idx + event_info combination. This is also optional for platform.
3. All platform dependent information mentioned above can be obtained by M-mode runtime firmware (OpenSBI) from platform specific code. The DT/ACPI can also be used to describe 2.A and 2.B mentioned above but 2.C will always require platform specific code.
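The default described in 2.B can be sketched as a small helper. This is only an illustration in C: the function name sbi_pmu_default_mhpmevent is hypothetical (it is not an OpenSBI API), and XLEN is assumed to be passed in as a runtime value of 32 or 64.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an OpenSBI API): compose the default
 * MHPMEVENT value described in 2.B.
 *   bits [19:0]      = event_idx
 *   bits [XLEN-1:20] = lower XLEN-20 bits of event_info
 * Assumption: xlen is either 32 or 64.
 */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info,
                                                 unsigned int xlen)
{
    assert(xlen == 32 || xlen == 64);

    /* Shifting left by 20 drops the upper 20 bits of event_info on RV64. */
    uint64_t value = (event_info << 20) | (event_idx & 0xfffffu);

    /* On RV32 only the low 32 bits exist, so 12 bits of event_info survive. */
    if (xlen == 32)
        value &= 0xffffffffu;

    return value;
}
```

A platform that needs a different mapping would replace this default with its own platform specific code, as noted in item 3.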
Linux RISC-V PMU Driver Development Notes:
1. Driver probe The Linux RISC-V driver can be a platform driver with "riscv,sbi-pmu" as DT compatible string and optional "interrupts" DT property. The "interrupts" DT property if available should specify an edge-triggered overflow interrupt for each HART. When "interrupts" DT property is present, we might also need another DT property for mapping HARTID to entries in "interrupts" DT property. The platform driver probe will: A) Need to ensure that underlying SBI implementation provides SBI PMU extension using sbi_probe_extension() API of arch/riscv. B) Detect number of counters using SBI_PMU_NUM_COUNTERS call C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR call. If the counter is a SOFTWARE counter then use the SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location of counter. The driver can skip this in driver probe and instead do this lazily in the add() callback mentioned below.
2. event_init() callback The event_init() callback will primarily translate user-space perf_event_attr to SBI PMU event_idx and event_info. It can do this in following way: A) perf_event_attr.type == PERF_TYPE_HARDWARE event_idx.type = 0x0 event_idx.code = Value from enum sbi_pmu_hw_id based on perf_event_attr.config event_info = 0 B) perf_event_attr.type == PERF_TYPE_HW_CACHE event_idx.type = 0x1 event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id based on perf_event_attr.config event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id based on perf_event_attr.config event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config event_info = 0 C) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 0 event_idx.type = 0x2 event_idx.code = 0x0 event_info = perf_event_attr.config[62:0] D) perf_event_attr.type == PERF_TYPE_RAW and perf_event_attr.config[63:63] == 1 event_idx.type = 0xf event_idx.code = Value from enum sbi_pmu_sw_id based on perf_event_attr.config event_info = 0 (Note: event_init() will fail if it is not able to figure out event_idx and event_info value corresponding to perf_event_attr) (Note: event_init() will not assign counter to perf_event because it will be done by add())
3. add() callback The add() callback of Linux RISC-V PMU driver will find a free counter on current CPU/HART such that the perf_event event_idx + event_info combination is supported by the counter. To check-and-set event_idx + event_info combination for a counter, we will use the SBI_PMU_COUNTER_SET_EVENT call. The counter allocation and SBI_PMU_COUNTER_SET_EVENT call can be further optimized by looking at CSR details. For example: A) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should prefer counter mapping to CYCLE CSR and skip doing SBI_PMU_COUNTER_SET_EVENT call. B) For event_idx.type == 0 and event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should prefer counter mapping to INSTRET CSR and skip doing SBI_PMU_COUNTER_SET_EVENT call. C) For event_idx.type == 0xf, only prefer counters mapping to 0xfff CSR (i.e. SOFTWARE counters).
4. del() callback The del() callback of Linux RISC-V PMU driver will release or free the counter.
5. start() callback The start() callback of Linux RISC-V PMU driver will start the counter using the SBI_PMU_COUNTER_START call.
6. stop() callback The stop() callback of Linux RISC-V PMU driver will stop the counter using the SBI_PMU_COUNTER_STOP call.
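The event_init() mapping in item 2 above can be sketched in C as follows. This is a rough illustration, not the actual driver: the helper name sbi_pmu_translate and struct sbi_pmu_event are hypothetical, the PERF_TYPE_* constants are local mirrors of the values in <linux/perf_event.h>, and placing the SOFTWARE event code in config[15:0] is an assumption, since the proposal only says the code is "based on perf_event_attr.config".

```c
#include <stdint.h>

/* Local mirrors of values from <linux/perf_event.h> (enum perf_type_id). */
enum { PERF_TYPE_HARDWARE = 0, PERF_TYPE_HW_CACHE = 3, PERF_TYPE_RAW = 4 };

/* Limits taken from the *_MAX entries of the enums in the proposal. */
enum { HW_MAX = 10, CACHE_MAX = 7, CACHE_OP_MAX = 3,
       CACHE_RESULT_MAX = 2, SW_MAX = 12 };

struct sbi_pmu_event {
    uint32_t event_idx;   /* [19:16] = type, [15:0] = code */
    uint64_t event_info;
};

/* Hypothetical helper: returns 0 on success, -1 if no mapping exists. */
static int sbi_pmu_translate(uint32_t type, uint64_t config,
                             struct sbi_pmu_event *ev)
{
    uint64_t cache_id, op_id, result_id, sw_code;

    ev->event_info = 0;
    switch (type) {
    case PERF_TYPE_HARDWARE:
        if (config >= HW_MAX)
            return -1;
        ev->event_idx = (0x0u << 16) | (uint32_t)config;
        return 0;
    case PERF_TYPE_HW_CACHE:
        /* perf packs cache events as id | (op << 8) | (result << 16). */
        cache_id = config & 0xff;
        op_id = (config >> 8) & 0xff;
        result_id = (config >> 16) & 0xff;
        if (cache_id >= CACHE_MAX || op_id >= CACHE_OP_MAX ||
            result_id >= CACHE_RESULT_MAX)
            return -1;
        ev->event_idx = (0x1u << 16) |
                        (uint32_t)((cache_id << 3) | (op_id << 1) | result_id);
        return 0;
    case PERF_TYPE_RAW:
        if (config >> 63) {
            /* Assumption: the SOFTWARE event code sits in config[15:0]. */
            sw_code = config & 0xffff;
            if (sw_code >= SW_MAX)
                return -1;
            ev->event_idx = (0xfu << 16) | (uint32_t)sw_code;
        } else {
            ev->event_idx = 0x2u << 16;
            ev->event_info = config;  /* bit 63 is zero, so this is config[62:0] */
        }
        return 0;
    default:
        return -1;
    }
}
```

Returning an error for unknown combinations matches the note that event_init() fails when it cannot work out event_idx and event_info.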
Regards, Anup

Anup Patel
Hi Greg,
The SBI PMU extension provider for HS-mode is M-mode runtime firmware (OpenSBI) and for VS-mode the provider is HS-mode (Hypervisor).
We will enable HARDWARE counters in HCOUNTEREN CSR when hypervisor receives SBI_PMU_COUNTER_START call from VS-mode.
The standard interface between Linux user-space and Linux kernel-space is perf SYSCALLs/IOCTLs. Despite this, we can support apps that want direct HARDWARE CSR access by enabling the HARDWARE counter in SCOUNTEREN CSR. The Linux RISC-V PMU driver can do this in the start() callback.
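As a rough sketch of which bit would be enabled, the helper below maps a counter's CSR number to its bit in SCOUNTEREN or HCOUNTEREN. It assumes the standard counter-enable layout of the RISC-V privileged specification (bit 0 for CYCLE at 0xc00, bit 2 for INSTRET, bit N for HPMCOUNTERN); the helper name pmu_counteren_bit is hypothetical.

```c
#include <stdint.h>

#define CSR_CYCLE        0xc00u  /* first user-level counter CSR */
#define CSR_HPMCOUNTER31 0xc1fu  /* last user-level counter CSR */

/*
 * Hypothetical helper: return the SCOUNTEREN/HCOUNTEREN bit for a
 * HARDWARE counter CSR, or 0 if the CSR number is not a counter CSR
 * (for example 0xfff, which marks a SOFTWARE counter).
 */
static inline uint32_t pmu_counteren_bit(uint32_t csr_number)
{
    if (csr_number < CSR_CYCLE || csr_number > CSR_HPMCOUNTER31)
        return 0;
    return UINT32_C(1) << (csr_number - CSR_CYCLE);
}
```

The start() callback could OR this bit into SCOUNTEREN before issuing SBI_PMU_COUNTER_START, and a hypervisor could do the same with HCOUNTEREN.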
Regards,
Anup
From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Greg Favor
Sent: 15 July 2020 11:08
To: Greg Favor <gfavor@...>
Cc: Anup Patel <Anup.Patel@...>; Zong Li <zong.li@...>; tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU Extension
Anup,
What is the plan with regards to scounteren and hcounteren? Is the caller (whether an OS or a hypervisor) supposed to take into account the relevant *counteren CSR's when specifying counter_idx in the call to SBI_PMU_COUNTER_SET_EVENT?
And the M-mode SBI_PMU_COUNTER_SET_EVENT routine only worries about mcounteren?
It seems like this is necessary since the M-mode SBI_PMU_COUNTER_SET_EVENT routine won't know which privilege mode was the original requester (and hence which other *counteren CSR's are relevant).
Yes, this is feasible and can further reduce SBI calls but if we go this route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
Why not have SBI_PMU_COUNTER_SET_EVENT return a 32b hpmcounter bit mask when it fails (that - as Zong suggested - identifies hardware counters that do support the requested event)? Then SBI_PMU_COUNTER_SET_EVENT doesn't need to change.
The caller can simply call it again with a different counter_idx that it knows should succeed.
Or have SBI_PMU_COUNTER_SET_EVENT take a boolean argument indicating whether to try and set up just the specified counter, or to examine all counters and try to pick one that supports the requested event? Then no bit mask ever needs to be returned.
Also, in any case, I assume the SBI routine needs to take the 'mcounteren' CSR into account and only set up a counter that mcounteren makes available to lower privilege modes? (And any returned bit mask would also reflect only counters that have their mcounteren bits set to '1'.)
How about this ??
3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter from a set of counters which can
monitor specified event. The counter_idx_base and counter_idx_mask
parameters represent the set of counters whereas the event_idx and
event_info represent the event to monitor. Upon success the SBI call
will return the counter_idx of the counter which has been configured
to monitor specified event. This SBI call will fail if it is unable to find
a counter which can monitor specified event. It will also fail if the set of
counters specified via counter_idx_base and counter_idx_mask
contains an invalid counter.
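To make the counter_idx_base/counter_idx_mask pair concrete, here is a minimal C sketch of how S-mode software might build the mask from its own list of free counters. The helper name pmu_counter_window_mask is hypothetical, and the interpretation that bit N of counter_idx_mask selects counter (counter_idx_base + N) is an assumption; the proposal does not spell out the bit layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bits in one mask word; unsigned long is XLEN wide on RISC-V Linux. */
#define PMU_MASK_BITS (sizeof(unsigned long) * 8)

/*
 * Hypothetical helper: build counter_idx_mask for the window of counters
 * starting at counter_idx_base. Assumed layout: bit N of the mask selects
 * counter (counter_idx_base + N). Counters outside the window are ignored,
 * so more than XLEN counters can be covered by several calls with
 * different bases.
 */
static unsigned long pmu_counter_window_mask(const unsigned int *free_idx,
                                             size_t n,
                                             unsigned long counter_idx_base)
{
    unsigned long mask = 0;

    for (size_t i = 0; i < n; i++) {
        if (free_idx[i] >= counter_idx_base &&
            free_idx[i] - counter_idx_base < PMU_MASK_BITS)
            mask |= 1UL << (free_idx[i] - counter_idx_base);
    }
    return mask;
}
```

With free counters 3, 6, 8, 9, 13 and 20 and a base of 0, this yields a mask with exactly those six bits set. Passing a single counter as the base with a mask of 0x1 restricts the search to that one counter.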

Anup Patel
-----Original Message----- From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Zong Li Sent: 15 July 2020 13:36 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@... Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU Extension
On Wed, Jul 15, 2020 at 12:32 PM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 15 July 2020 07:40 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:
-----Original Message----- From: Zong Li <zong.li@...> Sent: 14 July 2020 09:02 To: Anup Patel <Anup.Patel@...> Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension
On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:
Hi All,
We don't have a dedicated RISC-V PMU extension but we do have HARDWARE performance counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER CSRs. A RISC-V implementation can support monitoring various HARDWARE events using limited number of HPMCOUNTER CSRs.
In addition to HARDWARE performance counters, a SBI implementation (e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for events such as number of RFENCEs, number of IPIs, number of misaligned load/store instructions, number of illegal instructions, etc.
We propose SBI PMU extension, which will help S-mode (or VS-mode) software to discover and configure HARDWARE/SOFTWARE counters. The SBI PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE counters which include CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and SOFTWARE counters provided by SBI implementation.
Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters and events to S-mode (or VS-mode) software.
To define SBI PMU extension, we first define counter_idx which is a logical number assigned to a counter and event_idx which is an encoded number representing the HARDWARE/SOFTWARE event to be monitored. A HARDWARE/SOFTWARE event can also have additional configuration/details referred to as event_info.
The SBI PMU event_idx is a 20bits wide number encoded as follows: event_idx[19:16] = type event_idx[15:0] = code
If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event, the event_info is not required whereas the event_idx.code can be one of the following values: enum sbi_pmu_hw_id { SBI_PMU_HW_CPU_CYCLES = 0, SBI_PMU_HW_INSTRUCTIONS = 1, SBI_PMU_HW_CACHE_REFERENCES = 2, SBI_PMU_HW_CACHE_MISSES = 3, SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4, SBI_PMU_HW_BRANCH_MISSES = 5, SBI_PMU_HW_BUS_CYCLES = 6, SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7, SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8, SBI_PMU_HW_REF_CPU_CYCLES = 9, SBI_PMU_HW_MAX, /* non-ABI */ }; (NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE CACHE event, the event_info is not required whereas the event_idx.code is encoded as follows: event_idx.code[15:3] = cache_id event_idx.code[2:1] = op_id event_idx.code[0:0] = result_id enum sbi_pmu_hw_cache_id { SBI_PMU_HW_CACHE_L1D = 0, SBI_PMU_HW_CACHE_L1I = 1, SBI_PMU_HW_CACHE_LL = 2, SBI_PMU_HW_CACHE_DTLB = 3, SBI_PMU_HW_CACHE_ITLB = 4, SBI_PMU_HW_CACHE_BPU = 5, SBI_PMU_HW_CACHE_NODE = 6, SBI_PMU_HW_CACHE_MAX, /* non-ABI */ }; enum sbi_pmu_hw_cache_op_id { SBI_PMU_HW_CACHE_OP_READ = 0, SBI_PMU_HW_CACHE_OP_WRITE = 1, SBI_PMU_HW_CACHE_OP_PREFETCH = 2, SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */ }; enum sbi_pmu_hw_cache_op_result_id { SBI_PMU_HW_CACHE_RESULT_ACCESS = 0, SBI_PMU_HW_CACHE_RESULT_MISS = 1, SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */ }; (NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE RAW event, the event_idx.code should be zero and the event_info parameter passed to SBI_PMU_COUNTER_SET_EVENT call (described below) will have the RAW event value to be programmed in MHPMEVENT CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value from event_idx + event_info).
If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE event, the event_info is not required whereas the event_idx.code can be one of the following: enum sbi_pmu_sw_id { SBI_PMU_SW_MISALIGNED_LOAD = 0, SBI_PMU_SW_MISALIGNED_STORE = 1, SBI_PMU_SW_ILLEGAL_INSN = 2, SBI_PMU_SW_LOCAL_SET_TIMER = 3, SBI_PMU_SW_LOCAL_IPI = 4, SBI_PMU_SW_LOCAL_FENCE_I = 5, SBI_PMU_SW_LOCAL_SFENCE_VMA = 6, SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7, SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8, SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9, SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10, SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11, SBI_PMU_SW_MAX, /* non-ABI */ };
In future, more events can be defined without breaking compatibility of existing SBI calls.
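The event_idx layout described above can be sketched as two packing helpers in C. The enum sbi_pmu_event_type and the helper names are hypothetical names introduced for illustration; only the bit positions and type values come from the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Event types carried in event_idx[19:16]; the enum name is hypothetical. */
enum sbi_pmu_event_type {
    SBI_PMU_EVENT_TYPE_HW       = 0x0,
    SBI_PMU_EVENT_TYPE_HW_CACHE = 0x1,
    SBI_PMU_EVENT_TYPE_HW_RAW   = 0x2,
    SBI_PMU_EVENT_TYPE_SW       = 0xf,
};

/* Hypothetical helper: pack type[19:16] and code[15:0] into event_idx. */
static inline uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= 0xf && code <= 0xffff);
    return (type << 16) | code;
}

/* Hypothetical helper: pack cache_id[15:3], op_id[2:1], result_id[0:0]. */
static inline uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    assert(cache_id <= 0x1fff && op_id <= 0x3 && result_id <= 0x1);
    return (cache_id << 3) | (op_id << 1) | result_id;
}
```

For example, a DTLB read miss (cache_id 3, op_id 0, result_id 1) packs to code 0x19 and, with type 0x1, to event_idx 0x10019.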
Using definition of counter_idx and event_idx, we can potentially have the following SBI calls:
1. SBI_PMU_NUM_COUNTERS This call will return the number of COUNTERs
2. SBI_PMU_COUNTER_GET_CSR This call takes one parameter: 1) counter_idx It will provide the CSR_Number and CSR_Width of underlying counter. The value returned by SBI call is encoded as follows: return_value[11:0] = CSR_Number return_value[19:12] = CSR_Width (Number of bits implemented in HW) If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is HARDWARE counter. This SBI call will fail for counters which are not present.
3. SBI_PMU_COUNTER_SET_EVENT This call takes three parameters: 1) counter_idx 2) event_idx 3) event_info It will select an event to be monitored by given counter. If this SBI call is not used for a counter to select an event then the counter will monitor default event selected for it at boot-time. This SBI call will fail for counters which are not present. It will also fail if specified event_idx + event_info combination is not supported by given counter.
It also seems to fail if the specified event is not supported by the given counter, right? Then the Linux driver could try to allocate the next free counter when this SBI call returns failure.
Yes, this call will fail if event_idx + event_info combination is not supported by given counter_idx. It is expected that Linux driver will try another free counter if SBI_PMU_COUNTER_SET_EVENT call fails. I have suggested a few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by looking at the CSR number assigned to a counter.
Could it put the bitmap of counters which support the given event into ret.value ($a1) if it fails for the given counter, so that S-mode software can conveniently find which counter is a good next choice for the event? But there is a constraint that the maximum number of counters needs to be assumed to be less than XLEN. Do you think it is feasible?
Yes, this is feasible and can further reduce SBI calls but if we go this route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.
How about this ??
3. SBI_PMU_COUNTER_CONFIG_MATCHING This call takes four parameters: 1) counter_idx_base 2) counter_idx_mask 3) event_idx 4) event_info It will find and configure a counter, from a set of counters, which can monitor the specified event. The counter_idx_base and counter_idx_mask parameters represent the set of counters whereas the event_idx and event_info represent the event to monitor. Upon success, the SBI call will return the counter_idx of the counter which has been configured to monitor the specified event. This SBI call will fail if it is unable to find a counter which can monitor the specified event. It will also fail if the set of counters specified via counter_idx_base and counter_idx_mask contains an invalid counter.
It looks like the M-mode firmware would be responsible for selecting a suitable counter and taking over the allocation work for S-mode software?
The allocation work of counters will still be owned by S-mode software. Only selecting a counter from a set of available counters based on the event to be monitored will be done by SBI_PMU_COUNTER_CONFIG_MATCHING. In fact, SBI_PMU_COUNTER_CONFIG_MATCHING is equivalent to SBI_PMU_COUNTER_SET_EVENT if we pass "counter_idx_mask = 0x1" and "counter_idx_base = counter_idx".
Could you elaborate or give an example to show how counter_idx_base and counter_idx_mask represent the set of counters? It seems like the set of counters can be represented by one parameter, with each bit corresponding to one counter_idx.
Let's say Linux RISC-V PMU driver is tracking free/available counters for each HART using a per-HART bitmap. Now let's assume that on HART X we have free counters 3,6,8,9,13,20, ... and some user-space app creates perf_event on HART X. In this case, with SBI_PMU_COUNTER_SET_EVENT call we will have to try each available counter one-by-one to find a counter that supports the required event_idx + event_info combination in add() callback of Linux RISC-V PMU driver. Instead of this, using SBI_PMU_COUNTER_CONFIG_MATCHING call we let SBI implementation select a matching counter from a set of available counters.
We have to consider both RV32 and RV64 here. On RV32, XLEN = 32 so having only one parameter will limit us to 32 counters. This is too strict for RV32 and we don't have any room for SOFTWARE counters. Let's not limit number of counters by XLEN. The counter_idx_base and counter_idx_mask approach is similar to "hart_mask_base" and "hart_mask" parameters of SBI_SEND_IPI call.
Regards, Anup
Apart from this question above, this version of the proposal is great to me.
Cool
Regards, Anup
Thanks, Zong