[PATCH v6] Add performance monitoring unit extension


Anup Patel
 

This patch adds SBI performance monitoring unit (PMU) extension which
allows S-mode (or VS-mode) software to configure hardware (or firmware)
performance counters with help of M-mode (or HS-mode) software.

Signed-off-by: Anup Patel <anup.patel@...>
Signed-off-by: Atish Patra <atish.patra@...>
---
Changes since v5:
- Improved NOTEs on SBI_PMU_HW_CPU_CYCLES, SBI_PMU_HW_REF_CPU_CYCLES,
and SBI_PMU_HW_BUS_CYCLES events as suggested by Greg Favor
- Re-ordered parameters of sbi_pmu_counter_config_matching() so that
uint64_t parameter is last one
- Added NOTEs for config_flags[3:7] bits which will be used for event
filtering based on privilege mode
- Allow sbi_pmu_counter_start() and sbi_pmu_counter_stop() to work on
a set of counters
- Re-ordered function numbering to make sbi_pmu_counter_fw_read() as
last function
Changes since v4:
- Simplified "NOTE" section for RAW events
- Resized tables
- Added "SBI version" column to function list table
- Added "NOTE" section for SBI_PMU_HW_BUS_CYCLES and SBI_PMU_HW_REF_CPU_CYCLES
- Explicitly mention that event_data is 64 bits wide
- Only lower 56 bits of event_data are used for RAW events to align with
upcoming Sscof extension
Changes since v3:
- The new "sscof" extension requires the event info to be 64 bit.
- Improves:
1. the counter start/stop function ids with appropriate error codes
2. An additional flag parameter to accommodate "sscof extension".
- Renames:
a. software counter to firmware counter to avoid ambiguity with
kernel software events.
b. event_info to event_data to avoid ambiguity with counter info
Changes since v2:
- Rebased on makefile and cleanup changes
---
riscv-sbi.adoc | 446 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 446 insertions(+)

diff --git a/riscv-sbi.adoc b/riscv-sbi.adoc
index 16b7380..95dcc00 100644
--- a/riscv-sbi.adoc
+++ b/riscv-sbi.adoc
@@ -34,6 +34,7 @@ https://creativecommons.org/licenses/by/4.0/.
* Improved SBI introduction secion
* Improved documentation of SBI hart state managment extension
* Added suspend function to SBI hart state managment extension
+* Added performance monitoring unit extension

=== Version 0.2

@@ -114,6 +115,8 @@ error codes.
| SBI_ERR_DENIED | -4
| SBI_ERR_INVALID_ADDRESS | -5
| SBI_ERR_ALREADY_AVAILABLE | -6
+| SBI_ERR_ALREADY_STARTED | -7
+| SBI_ERR_ALREADY_STOPPED | -8
|===

An `ECALL` with an unsupported SBI extension ID (*EID*) or an unsupported SBI
@@ -1092,6 +1095,449 @@ The possible error codes returned in `sbiret.error` are shown in the
| sbi_system_reset | 0.3 | 0 | 0x53525354
|===

+== Performance Monitoring Unit Extension (EID #0x504D55 "PMU")
+
+The RISC-V hardware performance counters such as `mcycle`, `minstret`, and
+`mhpmcounterX` CSRs are accessible as read-only from supervisor-mode using
+`cycle`, `instret`, and `hpmcounterX` CSRs. The SBI performance monitoring
+unit (PMU) extension is an interface for supervisor-mode to configure and
+use the RISC-V hardware performance counters with assistance from the
+machine-mode (or hypervisor-mode). These hardware performance counters
+can only be started, stopped, or configured from machine-mode using
+`mcountinhibit` and `mhpmeventX` CSRs. Due to this, a machine-mode SBI
+implementation may choose to disallow SBI PMU extension if `mcountinhibit`
+CSR is not implemented by the RISC-V platform.
+
+A RISC-V platform generally supports monitoring of various hardware events
+using a limited number of hardware performance counters which are up to
+64 bits wide. In addition, a SBI implementation can also provide firmware
+performance counters which can monitor firmware events such as number of
+misaligned load/store instructions, number of RFENCEs, number of IPIs, etc.
+The firmware counters are always 64 bits wide.
+
+The SBI PMU extension provides:
+
+1. An interface for supervisor-mode software to discover and configure
+ per-HART hardware/firmware counters
+2. A typical https://en.wikipedia.org/wiki/Perf_(Linux)[perf] compatible
+ interface for hardware/firmware performance counters and events
+3. Full access to microarchitecture's raw event encodings
+
+To define SBI PMU extension calls, we first define important entities
+`counter_idx`, `event_idx`, and `event_data`. The `counter_idx` is a
+logical number assigned to each hardware/firmware counter. The `event_idx`
+represents a hardware (or firmware) event whereas the `event_data` is
+64 bits wide and represents additional configuration (or parameters) for
+a hardware (or firmware) event.
+
+The `event_idx` is a 20-bit number encoded as follows:
+[source, C]
+----
+ event_idx[19:16] = type
+ event_idx[15:0] = code
+----
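As a rough illustration of the encoding above, the following C sketch packs and unpacks `event_idx`. The helper and macro names are hypothetical and are not defined by this patch; only the bit positions and the type values (0x0, 0x1, 0x2 and 0xf) are taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Event types from the section headings of this patch. The macro names
 * are hypothetical; the patch only gives the numeric values. */
#define PMU_EVENT_TYPE_HW_GENERAL 0x0u
#define PMU_EVENT_TYPE_HW_CACHE   0x1u
#define PMU_EVENT_TYPE_HW_RAW     0x2u
#define PMU_EVENT_TYPE_FW         0xFu

/* Pack type into event_idx[19:16] and code into event_idx[15:0]. */
static inline uint32_t pmu_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= 0xFu && code <= 0xFFFFu);
    return (type << 16) | code;
}

/* Extract event_idx[19:16]. */
static inline uint32_t pmu_event_type(uint32_t event_idx)
{
    return (event_idx >> 16) & 0xFu;
}

/* Extract event_idx[15:0]. */
static inline uint32_t pmu_event_code(uint32_t event_idx)
{
    return event_idx & 0xFFFFu;
}
```

With these helpers, a firmware event with code 6 would be written as `pmu_event_idx(PMU_EVENT_TYPE_FW, 6)`, which evaluates to 0xF0006.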
+
+=== Event: Hardware general events (Type #0)
+
+The `event_idx.type` (i.e. *event type*) should be `0x0` for all hardware
+general events and each hardware general event is identified by a unique
+`event_idx.code` (i.e. *event code*) described in the
+<<table_pmu_hardware_events>> below.
+
+[#table_pmu_hardware_events]
+.PMU Hardware Events
+[cols="6,1,4", width=95%, align="center", options="header"]
+|===
+| General Event Name | Code | Description
+| SBI_PMU_HW_NO_EVENT | 0 | Unused event because
+ `event_idx` cannot be zero
+| SBI_PMU_HW_CPU_CYCLES | 1 | Event for each CPU cycle
+| SBI_PMU_HW_INSTRUCTIONS | 2 | Event for each completed
+ instruction
+| SBI_PMU_HW_CACHE_REFERENCES | 3 | Event for cache hit
+| SBI_PMU_HW_CACHE_MISSES | 4 | Event for cache miss
+| SBI_PMU_HW_BRANCH_INSTRUCTIONS | 5 | Event for a branch instruction
+| SBI_PMU_HW_BRANCH_MISSES | 6 | Event for a branch misprediction
+| SBI_PMU_HW_BUS_CYCLES | 7 | Event for each BUS cycle
+| SBI_PMU_HW_STALLED_CYCLES_FRONTEND | 8 | Event for a stalled cycle in
+ microarchitecture frontend
+| SBI_PMU_HW_STALLED_CYCLES_BACKEND | 9 | Event for a stalled cycle in
+ microarchitecture backend
+| SBI_PMU_HW_REF_CPU_CYCLES | 10 | Event for each reference
+ CPU cycle
+|===
+
+*NOTE:* The `event_data` (i.e. *event data*) is unused for hardware
+general events and all non-zero values of `event_data` are reserved
+for future use.
+
+*NOTE:* The *SBI_PMU_HW_CPU_CYCLES* event counts CPU clock cycles as
+counted by the `cycle` CSR. These may be variable frequency cycles, and
+are not counted when the CPU clock is halted (i.e. WAIT state entered
+using WFI instruction or platform specific SUSPEND state entered using
+the SBI HSM HART suspend call).
+
+*NOTE:* The *SBI_PMU_HW_REF_CPU_CYCLES* event counts fixed-frequency clock
+cycles while the CPU clock is not halted (i.e. not in the WAIT state entered
+using the WFI instruction or a platform specific SUSPEND state entered using
+the SBI HSM HART suspend call). The fixed frequency of counting might, for
+example, be the same frequency at which the `time` CSR counts.
+
+*NOTE:* The *SBI_PMU_HW_BUS_CYCLES* event counts fixed-frequency clock cycles.
+The fixed frequency of counting might be the same frequency at which the
+`time` CSR counts, or may be the frequency of the clock at the boundary
+between the HART (and its private caches) and the rest of the system.
+
+=== Event: Hardware cache events (Type #1)
+
+The `event_idx.type` (i.e. *event type*) should be `0x1` for all hardware
+cache events and each hardware cache event is identified by a unique
+`event_idx.code` (i.e. *event code*) which is encoded as follows:
+
+[source, C]
+----
+ event_idx.code[15:3] = cache_id
+ event_idx.code[2:1] = op_id
+ event_idx.code[0:0] = result_id
+----
+
+The tables below show the possible values of `event_idx.code.cache_id` (i.e.
+*cache event id*), `event_idx.code.op_id` (i.e. *cache operation id*)
+and `event_idx.code.result_id` (i.e. *cache result id*).
+
+[#table_pmu_cache_event_id]
+.PMU Cache Event ID
+[cols="6,2,4", width=95%, align="center", options="header"]
+|===
+| Cache Event Name | Event ID | Description
+| SBI_PMU_HW_CACHE_L1D | 0 | Level1 data cache event
+| SBI_PMU_HW_CACHE_L1I | 1 | Level1 instruction cache event
+| SBI_PMU_HW_CACHE_LL | 2 | Last level cache event
+| SBI_PMU_HW_CACHE_DTLB | 3 | Data TLB event
+| SBI_PMU_HW_CACHE_ITLB | 4 | Instruction TLB event
+| SBI_PMU_HW_CACHE_BPU | 5 | Branch predictor unit event
+| SBI_PMU_HW_CACHE_NODE | 6 | NUMA node cache event
+|===
+
+[#table_pmu_cache_ops_id]
+.PMU Cache Operation ID
+[cols="6,2,4", width=95%, align="center", options="header"]
+|===
+| Cache Operation Name | Operation ID | Description
+| SBI_PMU_HW_CACHE_OP_READ | 0 | Read cache line
+| SBI_PMU_HW_CACHE_OP_WRITE | 1 | Write cache line
+| SBI_PMU_HW_CACHE_OP_PREFETCH | 2 | Prefetch cache line
+|===
+
+[#table_pmu_cache_result_id]
+.PMU Cache Operation Result ID
+[cols="6,2,4", width=95%, align="center", options="header"]
+|===
+| Cache Result Name | Result ID | Description
+| SBI_PMU_HW_CACHE_RESULT_ACCESS | 0 | Cache access
+| SBI_PMU_HW_CACHE_RESULT_MISS | 1 | Cache miss
+|===
+
+*NOTE:* The `event_data` (i.e. *event data*) is unused for hardware cache
+events and all non-zero values of `event_data` are reserved for future use.
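The cache event code layout and the three tables above can be combined as in the following sketch. The enumerator names and values are those listed in the tables; the helper function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the PMU Cache Event ID table. */
enum {
    SBI_PMU_HW_CACHE_L1D  = 0,
    SBI_PMU_HW_CACHE_L1I  = 1,
    SBI_PMU_HW_CACHE_LL   = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU  = 5,
    SBI_PMU_HW_CACHE_NODE = 6
};

/* Values copied from the PMU Cache Operation ID table. */
enum {
    SBI_PMU_HW_CACHE_OP_READ     = 0,
    SBI_PMU_HW_CACHE_OP_WRITE    = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2
};

/* Values copied from the PMU Cache Operation Result ID table. */
enum {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS   = 1
};

/* Hypothetical helper: code[15:3] = cache_id, code[2:1] = op_id,
 * code[0:0] = result_id. */
static inline uint32_t pmu_cache_event_code(uint32_t cache_id,
                                            uint32_t op_id,
                                            uint32_t result_id)
{
    assert(cache_id <= 0x1FFFu && op_id <= 0x3u && result_id <= 0x1u);
    return (cache_id << 3) | (op_id << 1) | result_id;
}

/* Hypothetical helper: full event_idx with type 0x1 in bits [19:16]. */
static inline uint32_t pmu_cache_event_idx(uint32_t cache_id,
                                           uint32_t op_id,
                                           uint32_t result_id)
{
    return (0x1u << 16) | pmu_cache_event_code(cache_id, op_id, result_id);
}
```

For instance, an L1 data cache read miss would be encoded as `event_idx` 0x10001, and a data TLB write access as 0x1001A.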
+
+=== Event: Hardware raw events (Type #2)
+
+The `event_idx.type` (i.e. *event type*) should be `0x2` for all hardware
+raw events and `event_idx.code` (i.e. *event code*) should be zero. The
+lower 56 bits of `event_data` should hold the non-zero raw event value to
+be programmed in the `mhpmeventX` CSR. The upper 8 bits of `event_data`
+are unused and all non-zero values of these bits are reserved for future
+use.
+
+*NOTE:* The platform may choose to define the expected value to be written
+to the `mhpmeventX` CSR for a hardware event. In the case of hardware
+general/cache events, the platform may use the zero-extended `event_idx` as
+the expected value for simplicity.
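A caller might sanity-check a raw event value before passing it as `event_data`. The sketch below only encodes the two constraints stated above (non-zero value, upper 8 bits clear); the names are hypothetical and an SBI implementation may apply further platform specific checks.

```c
#include <assert.h>
#include <stdint.h>

/* Lower 56 bits of event_data carry the raw event value. */
#define PMU_RAW_EVENT_DATA_MASK ((UINT64_C(1) << 56) - 1)

/* Returns 1 if event_data is usable for a raw event: non-zero and with
 * the reserved upper 8 bits all zero. */
static inline int pmu_raw_event_data_valid(uint64_t event_data)
{
    return event_data != 0 &&
           (event_data & ~PMU_RAW_EVENT_DATA_MASK) == 0;
}

/* event_idx for every raw event: type 0x2 in bits [19:16], code zero. */
static inline uint32_t pmu_raw_event_idx(void)
{
    return UINT32_C(0x2) << 16;
}

/* Hypothetical convenience wrapper that rejects invalid raw values early. */
static inline uint64_t pmu_raw_event_data(uint64_t raw)
{
    assert(pmu_raw_event_data_valid(raw));
    return raw;
}
```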
+
+=== Event: Firmware events (Type #15)
+
+The `event_idx.type` (i.e. *event type*) should be `0xf` for all firmware
+events and each firmware event is identified by a unique `event_idx.code`
+(i.e. *event code*) described in the <<table_pmu_firmware_events>> below.
+
+[#table_pmu_firmware_events]
+.PMU Firmware Events
+[cols="6,1,4", width=95%, align="center", options="header"]
+|===
+| Firmware Event Name | Code | Description
+| SBI_PMU_FW_MISALIGNED_LOAD | 0 | Misaligned load trap event
+| SBI_PMU_FW_MISALIGNED_STORE | 1 | Misaligned store trap event
+| SBI_PMU_FW_ACCESS_LOAD | 2 | Load access trap event
+| SBI_PMU_FW_ACCESS_STORE | 3 | Store access trap event
+| SBI_PMU_FW_ILLEGAL_INSN | 4 | Illegal instruction trap event
+| SBI_PMU_FW_SET_TIMER | 5 | Set timer event
+| SBI_PMU_FW_IPI_SENT | 6 | Sent IPI to other HART event
+| SBI_PMU_FW_IPI_RECEIVED | 7 | Received IPI from other
+ HART event
+| SBI_PMU_FW_FENCE_I_SENT | 8 | Sent FENCE.I request to
+ other HART event
+| SBI_PMU_FW_FENCE_I_RECEIVED | 9 | Received FENCE.I request
+ from other HART event
+| SBI_PMU_FW_SFENCE_VMA_SENT | 10 | Sent SFENCE.VMA request
+ to other HART event
+| SBI_PMU_FW_SFENCE_VMA_RECEIVED | 11 | Received SFENCE.VMA request
+ from other HART event
+| SBI_PMU_FW_SFENCE_VMA_ASID_SENT | 12 | Sent SFENCE.VMA with ASID
+ request to other HART event
+| SBI_PMU_FW_SFENCE_VMA_ASID_RECEIVED | 13 | Received SFENCE.VMA with ASID
+ request from other HART event
+| SBI_PMU_FW_HFENCE_GVMA_SENT | 14 | Sent HFENCE.GVMA request to
+ other HART event
+| SBI_PMU_FW_HFENCE_GVMA_RECEIVED | 15 | Received HFENCE.GVMA request
+ from other HART event
+| SBI_PMU_FW_HFENCE_GVMA_VMID_SENT | 16 | Sent HFENCE.GVMA with VMID
+ request to other HART event
+| SBI_PMU_FW_HFENCE_GVMA_VMID_RECEIVED | 17 | Received HFENCE.GVMA with VMID
+ request from other HART event
+| SBI_PMU_FW_HFENCE_VVMA_SENT | 18 | Sent HFENCE.VVMA request to
+ other HART event
+| SBI_PMU_FW_HFENCE_VVMA_RECEIVED | 19 | Received HFENCE.VVMA request
+ from other HART event
+| SBI_PMU_FW_HFENCE_VVMA_ASID_SENT | 20 | Sent HFENCE.VVMA with ASID
+ request to other HART event
+| SBI_PMU_FW_HFENCE_VVMA_ASID_RECEIVED | 21 | Received HFENCE.VVMA with ASID
+ request from other HART event
+|===
+
+*NOTE:* The `event_data` (i.e. *event data*) is unused for firmware events
+and all non-zero values of `event_data` are reserved for future use.
+
+=== Function: Get number of counters (FID #0)
+
+[source, C]
+----
+struct sbiret sbi_pmu_num_counters()
+----
+
+*Returns* the number of counters (both hardware and firmware) in
+`sbiret.value` and always returns `SBI_SUCCESS` in `sbiret.error`.
+
+=== Function: Get details of a counter (FID #1)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_get_info(unsigned long counter_idx)
+----
+
+Get details about the specified counter such as underlying CSR number,
+width of the counter, type of counter hardware/firmware, etc.
+
+The `counter_info` returned by this SBI call is encoded as follows:
+[source, C]
+----
+ counter_info[11:0] = CSR (12-bit CSR number)
+ counter_info[17:12] = Width (One less than number of bits in CSR)
+ counter_info[XLEN-2:18] = Reserved for future use
+ counter_info[XLEN-1] = Type (0 = hardware and 1 = firmware)
+----
+
+If `counter_info.type == 1` then `counter_info.csr` and `counter_info.width`
+should be ignored.
+
+*Returns* the `counter_info` described above in `sbiret.value`.
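Decoding the returned `counter_info` could look like the sketch below. It assumes that `unsigned long` is XLEN bits wide, as is the case for the SBI calling convention on RISC-V; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical decoded view of counter_info. */
struct pmu_counter_info {
    unsigned int csr;        /* counter_info[11:0], hardware only */
    unsigned int width_bits; /* counter_info[17:12] + 1, hardware only */
    int is_firmware;         /* counter_info[XLEN-1] */
};

static inline struct pmu_counter_info
pmu_decode_counter_info(unsigned long info)
{
    /* Assumption: unsigned long has XLEN bits (32 or 64). */
    const unsigned int xlen = (unsigned int)(sizeof(unsigned long) * CHAR_BIT);
    struct pmu_counter_info out = { 0u, 0u, 0 };

    assert(xlen == 32u || xlen == 64u);
    out.is_firmware = (int)((info >> (xlen - 1u)) & 1ul);
    if (!out.is_firmware) {
        /* CSR and width are only meaningful for hardware counters. */
        out.csr = (unsigned int)(info & 0xFFFul);
        out.width_bits = (unsigned int)((info >> 12) & 0x3Ful) + 1u;
    }
    return out;
}
```

As an example, a `counter_info` value of 0x3FC03 would describe a 64-bit hardware counter backed by CSR 0xC03 (`hpmcounter3`).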
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_get_info_errors>> below.
+
+[#table_pmu_counter_get_info_errors]
+.PMU Counter Get Info Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code | Description
+| SBI_SUCCESS | `counter_info` read successfully.
+| SBI_ERR_INVALID_PARAM | `counter_idx` points to an invalid counter.
+|===
+
+=== Function: Find and configure a matching counter (FID #2)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_config_matching(unsigned long counter_idx_base,
+ unsigned long counter_idx_mask,
+ unsigned long config_flags,
+ unsigned long event_idx,
+ uint64_t event_data)
+----
+
+Find and configure a counter from a set of counters which is not started
+(or enabled) and can monitor the specified event. The `counter_idx_base`
+and `counter_idx_mask` parameters represent the set of counters whereas
+the `event_idx` represents the event to be monitored and `event_data`
+represents any additional event configuration.
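The patch text does not spell out how `counter_idx_base` and `counter_idx_mask` select counters. The sketch below assumes the same convention SBI uses for hart masks: bit `i` of the mask selects counter `counter_idx_base + i`. This interpretation and the helper names are assumptions made for illustration only.

```c
#include <assert.h>
#include <limits.h>

/* Assumed convention: bit i of mask selects counter (base + i).
 * Returns 1 if counter idx is part of the set, 0 otherwise. */
static inline int pmu_counter_set_contains(unsigned long base,
                                           unsigned long mask,
                                           unsigned long idx)
{
    const unsigned long xlen = sizeof(unsigned long) * CHAR_BIT;

    if (idx < base || idx - base >= xlen)
        return 0;
    return (int)((mask >> (idx - base)) & 1ul);
}

/* Number of counters selected by a mask (population count). */
static inline unsigned int pmu_counter_set_size(unsigned long mask)
{
    unsigned int n = 0;

    while (mask != 0) {
        mask &= mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: mask selecting a single counter relative to base. */
static inline unsigned long pmu_counter_single_mask(unsigned long base,
                                                    unsigned long idx)
{
    assert(idx >= base && idx - base < sizeof(unsigned long) * CHAR_BIT);
    return 1ul << (idx - base);
}
```

Under this assumption, `counter_idx_base = 3` with `counter_idx_mask = 0x5` would name counters 3 and 5.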
+
+The `config_flags` parameter represents additional counter configuration
+and filter flags. The bit definitions of the `config_flags` parameter are
+shown in the <<table_pmu_counter_cfg_match_flags>> below.
+
+[#table_pmu_counter_cfg_match_flags]
+.PMU Counter Config Match Flags
+[cols="3,1,2", width=90%, align="center", options="header"]
+|===
+| Flag Name | Bits | Description
+| SBI_PMU_CFG_FLAG_SKIP_MATCH | 0:0 | Skip the counter matching
+| SBI_PMU_CFG_FLAG_CLEAR_VALUE| 1:1 | Clear (or zero) the counter
+ value in counter configuration
+| SBI_PMU_CFG_FLAG_AUTO_START | 2:2 | Start the counter after
+ configuring a matching counter
+| SBI_PMU_CFG_FLAG_SET_MINH | 3:3 | Event counting inhibited +
+ in M-mode
+| SBI_PMU_CFG_FLAG_SET_SINH | 4:4 | Event counting inhibited +
+ in S-mode
+| SBI_PMU_CFG_FLAG_SET_UINH | 5:5 | Event counting inhibited +
+ in U-mode
+| SBI_PMU_CFG_FLAG_SET_VSINH | 6:6 | Event counting inhibited +
+ in VS-mode
+| SBI_PMU_CFG_FLAG_SET_VUINH | 7:7 | Event counting inhibited +
+ in VU-mode
+| *RESERVED* | 8:(XLEN-1) | All non-zero values are
+ reserved for future use
+|===
+
+*NOTE:* The `config_flags[3:7]` bits are event filtering hints so these can
+be ignored or overridden by the SBI implementation for security concerns or
+due to lack of event filtering support in the underlying platform.
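The flag bits from the table above can be expressed as C masks. The flag names and bit positions are taken from the table; the two helper functions are hypothetical. Since the inhibit bits are hints, a caller should not rely on them being honoured.

```c
#include <assert.h>

/* Bit positions as listed in the PMU Counter Config Match Flags table. */
#define SBI_PMU_CFG_FLAG_SKIP_MATCH  (1ul << 0)
#define SBI_PMU_CFG_FLAG_CLEAR_VALUE (1ul << 1)
#define SBI_PMU_CFG_FLAG_AUTO_START  (1ul << 2)
#define SBI_PMU_CFG_FLAG_SET_MINH    (1ul << 3)
#define SBI_PMU_CFG_FLAG_SET_SINH    (1ul << 4)
#define SBI_PMU_CFG_FLAG_SET_UINH    (1ul << 5)
#define SBI_PMU_CFG_FLAG_SET_VSINH   (1ul << 6)
#define SBI_PMU_CFG_FLAG_SET_VUINH   (1ul << 7)

/* Bits 8 and above are reserved and must be zero. */
static inline int pmu_cfg_flags_valid(unsigned long flags)
{
    return (flags & ~0xFFul) == 0;
}

/* Hypothetical example: request counting in U-mode only by inhibiting
 * every other mode, and clear and start the counter in the same call. */
static inline unsigned long pmu_cfg_flags_user_only(void)
{
    unsigned long flags = SBI_PMU_CFG_FLAG_CLEAR_VALUE |
                          SBI_PMU_CFG_FLAG_AUTO_START |
                          SBI_PMU_CFG_FLAG_SET_MINH |
                          SBI_PMU_CFG_FLAG_SET_SINH |
                          SBI_PMU_CFG_FLAG_SET_VSINH |
                          SBI_PMU_CFG_FLAG_SET_VUINH;

    assert(pmu_cfg_flags_valid(flags));
    return flags;
}
```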
+
+*Returns* the `counter_idx` in `sbiret.value` upon success.
+
+In case of failure, the possible error codes returned in `sbiret.error` are
+shown in the <<table_pmu_counter_cfg_match_errors>> below.
+
+[#table_pmu_counter_cfg_match_errors]
+.PMU Counter Config Match Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code | Description
+| SBI_SUCCESS | counter found and configured successfully.
+| SBI_ERR_INVALID_PARAM | set of counters has an invalid counter.
+| SBI_ERR_NOT_SUPPORTED | none of the counters can monitor specified event.
+|===
+
+=== Function: Start a set of counters (FID #3)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_start(unsigned long counter_idx_base,
+ unsigned long counter_idx_mask,
+ unsigned long start_flags,
+ uint64_t initial_value)
+----
+
+Start or enable a set of counters on the calling HART with the specified
+initial value. The `counter_idx_base` and `counter_idx_mask` parameters
+represent the set of counters whereas the `initial_value` parameter
+specifies the initial value of the counters.
+
+The bit definitions of the `start_flags` parameter are shown in the
+<<table_pmu_counter_start_flags>> below.
+
+[#table_pmu_counter_start_flags]
+.PMU Counter Start Flags
+[cols="3,1,2", width=90%, align="center", options="header"]
+|===
+| Flag Name | Bits | Description
+| SBI_PMU_START_SET_INIT_VALUE | 0:0 | Set the initial value of
+ counters.
+| *RESERVED* | 1:(XLEN-1) | All non-zero values are
+ reserved for future use
+|===
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_start_errors>> below.
+
+[#table_pmu_counter_start_errors]
+.PMU Counter Start Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code | Description
+| SBI_SUCCESS | counter started successfully.
+| SBI_ERR_INVALID_PARAM | some of the counters specified in parameters
+ are invalid.
+| SBI_ERR_ALREADY_STARTED | some of the counters specified in parameters
+ are already started.
+|===
+
+=== Function: Stop a set of counters (FID #4)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_stop(unsigned long counter_idx_base,
+ unsigned long counter_idx_mask,
+ unsigned long stop_flags)
+----
+
+Stop or disable a set of counters on the calling HART. The `counter_idx_base`
+and `counter_idx_mask` parameters represent the set of counters. The bit
+definitions of the `stop_flags` parameter are shown in the
+<<table_pmu_counter_stop_flags>> below.
+
+[#table_pmu_counter_stop_flags]
+.PMU Counter Stop Flags
+[cols="3,1,2", width=90%, align="center", options="header"]
+|===
+| Flag Name | Bits | Description
+| SBI_PMU_STOP_FLAG_RESET | 0:0 | Reset the counter to event mapping.
+| *RESERVED* | 1:(XLEN-1) | All non-zero values are reserved
+ for future use
+|===
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_stop_errors>> below.
+
+[#table_pmu_counter_stop_errors]
+.PMU Counter Stop Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code | Description
+| SBI_SUCCESS | counter stopped successfully.
+| SBI_ERR_INVALID_PARAM | some of the counters specified in parameters
+ are invalid.
+| SBI_ERR_ALREADY_STOPPED | some of the counters specified in parameters
+ are already stopped.
+|===
+
+=== Function: Read a firmware counter (FID #5)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_fw_read(unsigned long counter_idx)
+----
+
+Provide the current value of a firmware counter in `sbiret.value`.
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_fw_read_errors>> below.
+
+[#table_pmu_counter_fw_read_errors]
+.PMU Counter Firmware Read Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code | Description
+| SBI_SUCCESS | firmware counter read successfully.
+| SBI_ERR_INVALID_PARAM | `counter_idx` points to a hardware counter
+ or an invalid counter.
+|===
+
+=== Function Listing
+
+[#table_pmu_function_list]
+.PMU Function List
+[cols="5,2,1,2", width=80%, align="center", options="header"]
+|===
+| Function Name | SBI Version | FID | EID
+| sbi_pmu_num_counters | 0.3 | 0 | 0x504D55
+| sbi_pmu_counter_get_info | 0.3 | 1 | 0x504D55
+| sbi_pmu_counter_config_matching | 0.3 | 2 | 0x504D55
+| sbi_pmu_counter_start | 0.3 | 3 | 0x504D55
+| sbi_pmu_counter_stop | 0.3 | 4 | 0x504D55
+| sbi_pmu_counter_fw_read | 0.3 | 5 | 0x504D55
+|===
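The function IDs in the list above, together with the extension ID, could be captured as constants along the following lines. The constant and function names are hypothetical; the numeric values are those in the table. The helper also shows that the EID is simply the ASCII encoding of the extension name, which holds for both "PMU" (0x504D55) and "SRST" (0x53525354).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the values in the PMU Function List table. */
#define SBI_EXT_PMU UINT32_C(0x504D55)

enum sbi_pmu_fid {
    SBI_PMU_FID_NUM_COUNTERS            = 0,
    SBI_PMU_FID_COUNTER_GET_INFO        = 1,
    SBI_PMU_FID_COUNTER_CONFIG_MATCHING = 2,
    SBI_PMU_FID_COUNTER_START           = 3,
    SBI_PMU_FID_COUNTER_STOP            = 4,
    SBI_PMU_FID_COUNTER_FW_READ         = 5
};

/* Build an EID from a short ASCII name (at most four characters),
 * most significant byte first. */
static inline uint32_t sbi_eid_from_name(const char *name)
{
    uint32_t eid = 0;

    for (int i = 0; name[i] != '\0'; i++) {
        assert(i < 4);
        eid = (eid << 8) | (unsigned char)name[i];
    }
    return eid;
}
```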
+
== Experimental SBI Extension Space (EIDs #0x08000000 - #0x08FFFFFF)

No management.
--
2.25.1


atishp@...
 

On Fri, 2021-05-14 at 13:07 +0530, Anup Patel wrote:
This patch adds SBI performance monitoring unit (PMU) extension which
allows S-mode (or VS-mode) software to configure hardware (or firmware)
performance counters with help of M-mode (or HS-mode) software.

Signed-off-by: Anup Patel <anup.patel@...>
Signed-off-by: Atish Patra <atish.patra@...>
---
Changes since v5:
- Improved NOTEs on SBI_PMU_HW_CPU_CYCLES, SBI_PMU_HW_REF_CPU_CYCLES,
  and SBI_PMU_HW_BUS_CYCLES events as suggested by Greg Favor
- Re-ordered parameters of sbi_pmu_counter_config_matching() so that
  uint64_t parameter is last one
- Added NOTEs for config_flags[3:7] bits which will be used for event
  filtering based on privilege mode
- Allow sbi_pmu_counter_start() and sbi_pmu_counter_stop() to work on
  a set of counters
- Re-ordered function numbering to make sbi_pmu_counter_fw_read() as
  last function
All the above changes look good. I have verified them in the
OpenSBI/Linux kernel implementation as well. I will send out the
patches soon.


Changes since v4:
- Simplified "NOTE" section for RAW events
- Resized tables
- Added "SBI version" column to function list table
- Added "NOTE" section for SBI_PMU_HW_BUS_CYCLES and
SBI_PMU_HW_REF_CPU_CYCLE
- Explicity mention that event_data is 64 bits wide
- Only lower 56 bits of event_data are used for RAW events to align
with
  upcoming Sscof extension
Changes since v3:
- The new "sscof" extension requires the event info to be 64 bit.
- Improves:
  1. the counter start/stop function ids with a appropriate error
codes
  2. An additional flag parameter to accomodate "sscof extension".
- Renames:
  a. software counter to firmware counter to avoid ambiguity with
     kernel software events.
  b. event_info to event_data to avoid ambiguity with counter info
Changes since v2:
 - Rebased on makefile and cleanup changes
---
 riscv-sbi.adoc | 446
+++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 446 insertions(+)

diff --git a/riscv-sbi.adoc b/riscv-sbi.adoc
index 16b7380..95dcc00 100644
--- a/riscv-sbi.adoc
+++ b/riscv-sbi.adoc
@@ -34,6 +34,7 @@ https://creativecommons.org/licenses/by/4.0/.
 * Improved SBI introduction secion
 * Improved documentation of SBI hart state managment extension
 * Added suspend function to SBI hart state managment extension
+* Added performance monitoring unit extension
 
 === Version 0.2
 
@@ -114,6 +115,8 @@ error codes.
 |  SBI_ERR_DENIED            | -4
 |  SBI_ERR_INVALID_ADDRESS   | -5
 |  SBI_ERR_ALREADY_AVAILABLE | -6
+|  SBI_ERR_ALREADY_STARTED   | -7
+|  SBI_ERR_ALREADY_STOPPED   | -8
 |===
 
 An `ECALL` with an unsupported SBI extension ID (*EID*) or an
unsupported SBI
@@ -1092,6 +1095,449 @@ The possible error codes returned in
`sbiret.error` are shown in the
 | sbi_system_reset    | 0.3         |   0 | 0x53525354
 |===
 
+== Performance Monitoring Unit Extension (EID #0x504D55 "PMU")
+
+The RISC-V hardware performance counters such as `mcycle`,
`minstret`, and
+`mhpmcounterX` CSRs are accessible as read-only from supervisor-mode
using
+`cycle`, `instret`, and `hpmcounterX` CSRs. The SBI performance
monitoring
+unit (PMU) extension is an interface for supervisor-mode to
configure and
+use the RISC-V hardware performance counters with assistance from
the
+machine-mode (or hypervisor-mode). These hardware performance
counters
+can only be started, stopped, or configured from machine-mode using
+`mcountinhibit` and `mhpmeventX` CSRs. Due to this, a machine-mode
SBI
+implementation may choose to disallow SBI PMU extension if
`mcountinhibit`
+CSR is not implemented by the RISC-V platform.
+
+A RISC-V platform generally supports monitoring of various hardware
events
+using a limited number of hardware performance counters which are up
to
+64 bits wide. In addition, a SBI implementation can also provide
firmware
+performance counters which can monitor firmware events such as
number of
+misaligned load/store instructions, number of RFENCEs, number of
IPIs, etc.
+The firmware counters are always 64 bits wide.
+
+The SBI PMU extension provides:
+
+1. An interface for supervisor-mode software to discover and
configure
+   per-HART hardware/firmware counters
+2. A typical
https://en.wikipedia.org/wiki/Perf_(Linux)[perf] compatible
+   interface for hardware/firmware performance counters and events
+3. Full access to microarchitecture's raw event encodings
+
+To define SBI PMU extension calls, we first define important
entities
+`counter_idx`, `event_idx`, and `event_data`. The `counter_idx` is a
+logical number assigned to each hardware/firmware counter. The
`event_idx`
+represents a hardware (or firmware) event whereas the `event_data`
is
+64 bits wide and represents additional configuration (or parameters)
for
+a hardware (or firmware) event.
+
+The event_idx is a 20 bits wide number encoded as follows:
+[source, C]
+----
+    event_idx[19:16] = type
+    event_idx[15:0] = code
+----
+
+=== Event: Hardware general events (Type #0)
+
+The `event_idx.type` (i.e. *event type*) should be `0x0` for all
hardware
+general events and each hardware general event is identified by an
unique
+`event_idx.code` (i.e. *event code*) described in the
+<<table_pmu_hardware_events>> below.
+
+[#table_pmu_hardware_events]
+.PMU Hardware Events
+[cols="6,1,4", width=95%, align="center", options="header"]
+|===
+| General Event Name                 | Code | Description
+| SBI_PMU_HW_NO_EVENT                |    0 | Unused event because
+                                              `event_idx` cannot be
zero
+| SBI_PMU_HW_CPU_CYCLES              |    1 | Event for each CPU
cycle
+| SBI_PMU_HW_INSTRUCTIONS            |    2 | Event for each
completed
+                                              instruction
+| SBI_PMU_HW_CACHE_REFERENCES        |    3 | Event for cache hit
+| SBI_PMU_HW_CACHE_MISSES            |    4 | Event for cache miss
+| SBI_PMU_HW_BRANCH_INSTRUCTIONS     |    5 | Event for a branch
instruction
+| SBI_PMU_HW_BRANCH_MISSES           |    6 | Event for a branch
misprediction
+| SBI_PMU_HW_BUS_CYCLES              |    7 | Event for each BUS
cycle
+| SBI_PMU_HW_STALLED_CYCLES_FRONTEND |    8 | Event for a stalled
cycle in
+                                              microarchitecture
frontend
+| SBI_PMU_HW_STALLED_CYCLES_BACKEND  |    9 | Event for a stalled
cycle in
+                                              microarchitecture
backend
+| SBI_PMU_HW_REF_CPU_CYCLES          |   10 | Event for each
reference
+                                              CPU cycle
+|===
+
+*NOTE:* The `event_data` (i.e. *event data*) is unused for hardware
+general events and all non-zero values of `event_data` are reserved
+for future use.
+
+*NOTE:* The *SBI_PMU_HW_CPU_CYCLES* event counts CPU clock cycles as
+counted by the `cycle` CSR. These may be variable frequency cycles,
and
+are not counted when the CPU clock is halted (i.e. WAIT state
entered
+using WFI instruction or platform specific SUSPEND state entered
using
+the SBI HSM HART suspend call).
+
+*NOTE:* The *SBI_PMU_HW_REF_CPU_CYCLES* counts fixed-frequency clock
+cycles while the CPU clock is not halted (i.e. WAIT state entered
using
+WFI instruction or platform specific SUSPEND state entered SBI HSM
HART
+suspend call). The fixed-frequency of counting might, for example,
be
+the same frequency at which the `mtime` CSR counts.
+
+*NOTE:* The *SBI_PMU_HW_BUS_CYCLES* counts fixed-frequency clock
cycles.
+The fixed-frequency of counting might be the same frequency at which
the
+`mtime` CSR counts, or may be the frequency of the clock at the
boundary
+between the HART (and it's private caches) and the rest of the
system.
+
+=== Event: Hardware cache events (Type #1)
+
+The `event_idx.type` (i.e. *event type*) should be `0x1` for all
hardware
+cache events and each hardware cache event is identified by an
unique
+`event_idx.code` (i.e. *event code*) which is encoded as follows:
+
+[source, C]
+----
+    event_idx.code[15:3] = cache_id
+    event_idx.code[2:1] = op_id
+    event_idx.code[0:0] = result_id
+----
+
+Below tables show possible values of: `event_idx.code.cache_id`
(i.e.
+*cache event id*), `event_idx.code.op_id` (i.e. *cache operation
id*)
+and `event_idx.code.result_id` (i.e. *cache result id*).
+
+[#table_pmu_cache_event_id]
+.PMU Cache Event ID
+[cols="6,2,4", width=95%, align="center", options="header"]
+|===
+| Cache Event Name      | Event ID | Description
+| SBI_PMU_HW_CACHE_L1D  |        0 | Level1 data cache event
+| SBI_PMU_HW_CACHE_L1I  |        1 | Level1 instruction cache event
+| SBI_PMU_HW_CACHE_LL   |        2 | Last level cache event
+| SBI_PMU_HW_CACHE_DTLB |        3 | Data TLB event
+| SBI_PMU_HW_CACHE_ITLB |        4 | Instruction TLB event
+| SBI_PMU_HW_CACHE_BPU  |        5 | Branch predictor unit event
+| SBI_PMU_HW_CACHE_NODE |        6 | NUMA node cache event
+|===
+
+[#table_pmu_cache_ops_id]
+.PMU Cache Operation ID
+[cols="6,2,4", width=95%, align="center", options="header"]
+|===
+| Cache Operation Name         | Operation ID | Description
+| SBI_PMU_HW_CACHE_OP_READ     |            0 | Read cache line
+| SBI_PMU_HW_CACHE_OP_WRITE    |            1 | Write cache line
+| SBI_PMU_HW_CACHE_OP_PREFETCH |            2 | Prefetch cache line
+|===
+
+[#table_pmu_cache_result_id]
+.PMU Cache Operation Result ID
+[cols="6,2,4", width=95%, align="center", options="header"]
+|===
+| Cache Result Name              | Result ID | Description
+| SBI_PMU_HW_CACHE_RESULT_ACCESS |         0 | Cache access
+| SBI_PMU_HW_CACHE_RESULT_MISS   |         1 | Cache miss
+|===
+
+*NOTE:* The `event_data` (i.e. *event data*) is unused for hardware
cache
+events and all non-zero values of `event_data` are reserved for
future use.
+
+=== Event: Hardware raw events (Type #2)
+
+The `event_idx.type` (i.e. *event type*) should be `0x2` for all
hardware
+raw events and `event_idx.code` (i.e. *event code*) should be zero.
The
+`event_data` configuration (or parameter) should have the 56 bits
wide
+non-zero event value to be programmed in the `mhpmeventX` CSR. The
upper
+8 bits of `event_data` are unused and all non-zero values of these
bits
+are reserved for future use.
+
+*Note:* Platform may choose to define the expected value to be
written to
+`mhpmeventX` CSR for a hardware event. In case of a hardware
general/cache
+events, platform may use the zero-extended `event_idx` as the
expected
+value for simplicity.
+
+=== Event: Firmware events (Type #15)
+
+The `event_idx.type` (i.e. *event type*) should be `0xf` for all
firmware
+events and each firmware event is identified by an unqiue
`event_idx.code`
+(i.e. *event code*) described in the <<table_pmu_firmware_events>>
below.
+
+[#table_pmu_firmware_events]
+.PMU Firmware Events
+[cols="6,1,4", width=95%, align="center", options="header"]
+|===
+| Firmware Event Name                  | Code | Description
+| SBI_PMU_FW_MISALIGNED_LOAD           |    0 | Misaligned load trap event
+| SBI_PMU_FW_MISALIGNED_STORE          |    1 | Misaligned store trap event
+| SBI_PMU_FW_ACCESS_LOAD               |    2 | Load access trap event
+| SBI_PMU_FW_ACCESS_STORE              |    3 | Store access trap event
+| SBI_PMU_FW_ILLEGAL_INSN              |    4 | Illegal instruction trap event
+| SBI_PMU_FW_SET_TIMER                 |    5 | Set timer event
+| SBI_PMU_FW_IPI_SENT                  |    6 | Sent IPI to other HART event
+| SBI_PMU_FW_IPI_RECEIVED              |    7 | Received IPI from other
+                                                HART event
+| SBI_PMU_FW_FENCE_I_SENT              |    8 | Sent FENCE.I request to
+                                                other HART event
+| SBI_PMU_FW_FENCE_I_RECEIVED          |    9 | Received FENCE.I request
+                                                from other HART event
+| SBI_PMU_FW_SFENCE_VMA_SENT           |   10 | Sent SFENCE.VMA request
+                                                to other HART event
+| SBI_PMU_FW_SFENCE_VMA_RECEIVED       |   11 | Received SFENCE.VMA request
+                                                from other HART event
+| SBI_PMU_FW_SFENCE_VMA_ASID_SENT      |   12 | Sent SFENCE.VMA with ASID
+                                                request to other HART event
+| SBI_PMU_FW_SFENCE_VMA_ASID_RECEIVED  |   13 | Received SFENCE.VMA with ASID
+                                                request from other HART event
+| SBI_PMU_FW_HFENCE_GVMA_SENT          |   14 | Sent HFENCE.GVMA request to
+                                                other HART event
+| SBI_PMU_FW_HFENCE_GVMA_RECEIVED      |   15 | Received HFENCE.GVMA request
+                                                from other HART event
+| SBI_PMU_FW_HFENCE_GVMA_VMID_SENT     |   16 | Sent HFENCE.GVMA with VMID
+                                                request to other HART event
+| SBI_PMU_FW_HFENCE_GVMA_VMID_RECEIVED |   17 | Received HFENCE.GVMA with VMID
+                                                request from other HART event
+| SBI_PMU_FW_HFENCE_VVMA_SENT          |   18 | Sent HFENCE.VVMA request to
+                                                other HART event
+| SBI_PMU_FW_HFENCE_VVMA_RECEIVED      |   19 | Received HFENCE.VVMA request
+                                                from other HART event
+| SBI_PMU_FW_HFENCE_VVMA_ASID_SENT     |   20 | Sent HFENCE.VVMA with ASID
+                                                request to other HART event
+| SBI_PMU_FW_HFENCE_VVMA_ASID_RECEIVED |   21 | Received HFENCE.VVMA with ASID
+                                                request from other HART event
+|===
+
+*NOTE:* The `event_data` (i.e. *event data*) is unused for firmware events
+and all non-zero values of `event_data` are reserved for future use.
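
For illustration, the firmware event codes in the table above could be
captured in a C header along these lines. This is only a sketch: the
`event_idx` layout used by the hypothetical sbi_pmu_fw_event_idx() helper
(type in bits [19:16], code in bits [15:0]) is assumed from the event index
description earlier in this patch, and the helper name is not part of the
specification.

```c
/* Event type shared by all firmware events (event_idx.type == 0xf). */
#define SBI_PMU_EVENT_TYPE_FW 0xfUL

/* Firmware event codes, as listed in the PMU Firmware Events table. */
enum sbi_pmu_fw_event_code {
	SBI_PMU_FW_MISALIGNED_LOAD = 0,
	SBI_PMU_FW_MISALIGNED_STORE = 1,
	SBI_PMU_FW_ACCESS_LOAD = 2,
	SBI_PMU_FW_ACCESS_STORE = 3,
	SBI_PMU_FW_ILLEGAL_INSN = 4,
	SBI_PMU_FW_SET_TIMER = 5,
	SBI_PMU_FW_IPI_SENT = 6,
	SBI_PMU_FW_IPI_RECEIVED = 7,
	SBI_PMU_FW_FENCE_I_SENT = 8,
	SBI_PMU_FW_FENCE_I_RECEIVED = 9,
	SBI_PMU_FW_SFENCE_VMA_SENT = 10,
	SBI_PMU_FW_SFENCE_VMA_RECEIVED = 11,
	SBI_PMU_FW_SFENCE_VMA_ASID_SENT = 12,
	SBI_PMU_FW_SFENCE_VMA_ASID_RECEIVED = 13,
	SBI_PMU_FW_HFENCE_GVMA_SENT = 14,
	SBI_PMU_FW_HFENCE_GVMA_RECEIVED = 15,
	SBI_PMU_FW_HFENCE_GVMA_VMID_SENT = 16,
	SBI_PMU_FW_HFENCE_GVMA_VMID_RECEIVED = 17,
	SBI_PMU_FW_HFENCE_VVMA_SENT = 18,
	SBI_PMU_FW_HFENCE_VVMA_RECEIVED = 19,
	SBI_PMU_FW_HFENCE_VVMA_ASID_SENT = 20,
	SBI_PMU_FW_HFENCE_VVMA_ASID_RECEIVED = 21,
};

/*
 * Hypothetical helper (not in the spec): build an event_idx for a firmware
 * event. ASSUMPTION: event_idx[19:16] holds the type and event_idx[15:0]
 * holds the code.
 */
static inline unsigned long
sbi_pmu_fw_event_idx(enum sbi_pmu_fw_event_code code)
{
	return (SBI_PMU_EVENT_TYPE_FW << 16) | ((unsigned long)code & 0xffffUL);
}
```

Since `event_data` is unused for firmware events, a caller following this
sketch would pass 0 for it alongside the computed `event_idx`.
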
+
+=== Function: Get number of counters (FID #0)
+
+[source, C]
+----
+struct sbiret sbi_pmu_num_counters()
+----
+
+*Returns* the number of counters (both hardware and firmware) in
+`sbiret.value` and always returns `SBI_SUCCESS` in `sbiret.error`.
+
+=== Function: Get details of a counter (FID #1)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_get_info(unsigned long counter_idx)
+----
+
+Get details about the specified counter such as the underlying CSR number,
+the width of the counter, the type of counter (hardware or firmware), etc.
+
+The `counter_info` returned by this SBI call is encoded as follows:
+[source, C]
+----
+    counter_info[11:0] = CSR (12bit CSR number)
+    counter_info[17:12] = Width (One less than number of bits in CSR)
+    counter_info[XLEN-2:18] = Reserved for future use
+    counter_info[XLEN-1] = Type (0 = hardware and 1 = firmware)
+----
+
+If `counter_info.type == 1` then `counter_info.csr` and `counter_info.width`
+should be ignored.
+
+*Returns* the `counter_info` described above in `sbiret.value`.
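
To make the encoding above concrete, a caller might unpack `sbiret.value`
roughly as follows. This is a sketch only: `struct pmu_counter_info` and
pmu_decode_counter_info() are hypothetical names, and XLEN is assumed to
equal the width of `unsigned long`.

```c
#include <limits.h>

/* Hypothetical decoded view of counter_info; not defined by the spec. */
struct pmu_counter_info {
	unsigned int csr;   /* counter_info[11:0], 12-bit CSR number */
	unsigned int width; /* counter_info[17:12], one less than the bit count */
	int is_firmware;    /* counter_info[XLEN-1], 1 = firmware counter */
};

/* ASSUMPTION: XLEN is the width of unsigned long on the target. */
static inline struct pmu_counter_info
pmu_decode_counter_info(unsigned long info)
{
	const unsigned int xlen = sizeof(unsigned long) * CHAR_BIT;
	struct pmu_counter_info ci;

	ci.csr = (unsigned int)(info & 0xfffUL);
	ci.width = (unsigned int)((info >> 12) & 0x3fUL);
	ci.is_firmware = (int)((info >> (xlen - 1)) & 1UL);
	return ci;
}
```

For a firmware counter (`is_firmware == 1`) the `csr` and `width` fields
carry no meaning and should be ignored, as stated above.
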
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_get_info_errors>> below.
+
+[#table_pmu_counter_get_info_errors]
+.PMU Counter Get Info Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code            | Description
+| SBI_SUCCESS           | `counter_info` read successfully.
+| SBI_ERR_INVALID_PARAM | `counter_idx` points to an invalid counter.
+|===
+
+=== Function: Find and configure a matching counter (FID #2)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_config_matching(unsigned long counter_idx_base,
+                                             unsigned long counter_idx_mask,
+                                             unsigned long config_flags,
+                                             unsigned long event_idx,
+                                             uint64_t event_data)
+----
+
+Find and configure a counter from a set of counters which is not started
+(or enabled) and can monitor the specified event. The `counter_idx_base`
+and `counter_idx_mask` parameters represent the set of counters whereas
+the `event_idx` represents the event to be monitored and `event_data`
+represents any additional event configuration.
+
+The `config_flags` parameter represents additional counter configuration
+and filter flags. The bit definitions of the `config_flags` parameter are
+shown in the <<table_pmu_counter_cfg_match_flags>> below.
+
+[#table_pmu_counter_cfg_match_flags]
+.PMU Counter Config Match Flags
+[cols="3,1,2", width=90%, align="center", options="header"]
+|===
+| Flag Name                   | Bits       | Description
+| SBI_PMU_CFG_FLAG_SKIP_MATCH | 0:0        | Skip the counter matching

IMO, some additional clarification along the following lines would be
helpful.

SBI_PMU_CFG_FLAG_SKIP_MATCH should only be used when the caller is
already aware of the event-to-counter mapping (from a previous call).
In that case, the counter id must be passed in `counter_idx_base` and
the `counter_idx_mask` should be set to 1.

+| SBI_PMU_CFG_FLAG_CLEAR_VALUE| 1:1        | Clear (or zero) the counter
+                                             value in counter configuration
+| SBI_PMU_CFG_FLAG_AUTO_START | 2:2        | Start the counter after
+                                             configuring a matching counter

We can be a bit more explicit here:
This flag doesn't modify the existing value of the counter.

+| SBI_PMU_CFG_FLAG_SET_MINH   | 3:3        | Event counting inhibited +
+                                             in M-mode
+| SBI_PMU_CFG_FLAG_SET_SINH   | 4:4        | Event counting inhibited +
+                                             in S-mode
+| SBI_PMU_CFG_FLAG_SET_UINH   | 5:5        | Event counting inhibited +
+                                             in U-mode
+| SBI_PMU_CFG_FLAG_SET_VSINH  | 6:6        | Event counting inhibited +
+                                             in VS-mode
+| SBI_PMU_CFG_FLAG_SET_VUINH  | 7:7        | Event counting inhibited +
+                                             in VU-mode
+| *RESERVED*                  | 8:(XLEN-1) | All non-zero values are
+                                             reserved for future use
+|===
+
+*NOTE:* The `config_flags[3:7]` bits are event filtering hints so these can
+be ignored or overridden by the SBI implementation for security concerns or
+due to lack of event filtering support in the underlying platform.
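
As a reading aid, the flag table above maps onto C bit definitions like the
following. The flag names and bit positions come from the table; the
reserved-bits mask and the pmu_cfg_flags_valid() check are hypothetical
additions, sketched on the assumption that an implementation would want to
reject non-zero reserved bits.

```c
/* config_flags bits, taken from the PMU Counter Config Match Flags table. */
#define SBI_PMU_CFG_FLAG_SKIP_MATCH  (1UL << 0)
#define SBI_PMU_CFG_FLAG_CLEAR_VALUE (1UL << 1)
#define SBI_PMU_CFG_FLAG_AUTO_START  (1UL << 2)
#define SBI_PMU_CFG_FLAG_SET_MINH    (1UL << 3)
#define SBI_PMU_CFG_FLAG_SET_SINH    (1UL << 4)
#define SBI_PMU_CFG_FLAG_SET_UINH    (1UL << 5)
#define SBI_PMU_CFG_FLAG_SET_VSINH   (1UL << 6)
#define SBI_PMU_CFG_FLAG_SET_VUINH   (1UL << 7)

/* Hypothetical: bits 8..(XLEN-1) are reserved and must be zero. */
#define PMU_CFG_FLAG_RESERVED_MASK   (~0xffUL)

/* Hypothetical check: returns 1 when no reserved bit is set. */
static inline int pmu_cfg_flags_valid(unsigned long config_flags)
{
	return (config_flags & PMU_CFG_FLAG_RESERVED_MASK) == 0;
}
```

A caller wanting to count only U-mode events on a fresh, running counter
might combine CLEAR_VALUE, AUTO_START, SET_MINH and SET_SINH, keeping in
mind that the inhibit bits are hints and may be ignored.
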
+
+*Returns* the `counter_idx` in `sbiret.value` upon success.
+
+In case of failure, the possible error codes returned in `sbiret.error` are
+shown in the <<table_pmu_counter_cfg_match_errors>> below.
+
+[#table_pmu_counter_cfg_match_errors]
+.PMU Counter Config Match Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code            | Description
+| SBI_SUCCESS           | counter found and configured successfully.
+| SBI_ERR_INVALID_PARAM | set of counters has an invalid counter.
+| SBI_ERR_NOT_SUPPORTED | none of the counters can monitor specified event.
+|===
+
+=== Function: Start a set of counters (FID #3)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_start(unsigned long counter_idx_base,
+                                   unsigned long counter_idx_mask,
+                                   unsigned long start_flags,
+                                   uint64_t initial_value)
+----
+
+Start or enable a set of counters on the calling HART with the specified
+initial value. The `counter_idx_base` and `counter_idx_mask` parameters
+represent the set of counters whereas the `initial_value` parameter
+specifies the initial value of the counter.
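
The text does not spell out how `counter_idx_base` and `counter_idx_mask`
combine into a set. One plausible reading, consistent with the
SBI_PMU_CFG_FLAG_SKIP_MATCH comment above (counter id in the base, mask set
to 1), is that bit i of the mask selects counter `counter_idx_base + i`.
The sketch below assumes exactly that; pmu_counter_in_set() is a
hypothetical helper, not something the patch defines.

```c
#include <limits.h>

/*
 * Hypothetical helper. ASSUMPTION: bit i of `mask` selects counter
 * `base + i`. Returns 1 if counter `idx` is in the set, 0 otherwise.
 */
static inline int pmu_counter_in_set(unsigned long idx,
				     unsigned long base,
				     unsigned long mask)
{
	const unsigned long xlen = sizeof(unsigned long) * CHAR_BIT;

	/* Counters below the base, or beyond the mask width, are not in the set. */
	if (idx < base || idx - base >= xlen)
		return 0;

	return (int)((mask >> (idx - base)) & 1UL);
}
```

Under this reading, base 3 with mask 0x5 would name counters 3 and 5, all
of which receive the same `initial_value` when the start flag requests it.
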
+
+The bit definitions of the `start_flags` parameter are shown in the
+<<table_pmu_counter_start_flags>> below.
+
+[#table_pmu_counter_start_flags]
+.PMU Counter Start Flags
+[cols="3,1,2", width=90%, align="center", options="header"]
+|===
+| Flag Name                    | Bits       | Description
+| SBI_PMU_START_SET_INIT_VALUE | 0:0        | Set the initial value.
We can be a bit more explicit here:

Set the initial value if true. The counter value is not modified if
false.

+                                              counters.
+| *RESERVED*                   | 1:(XLEN-1) | All non-zero values are
+                                              reserved for future use
+|===
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_start_errors>> below.
+
+[#table_pmu_counter_start_errors]
+.PMU Counter Start Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code              | Description
+| SBI_SUCCESS             | counter started successfully.
+| SBI_ERR_INVALID_PARAM   | some of the counters specified in parameters
+                            are invalid.
+| SBI_ERR_ALREADY_STARTED | some of the counters specified in parameters
+                            are already started.
+|===
+
+=== Function: Stop a set of counters (FID #4)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_stop(unsigned long counter_idx_base,
+                                   unsigned long counter_idx_mask,
+                                   unsigned long stop_flags)
+----
+
+Stop or disable a set of counters on the calling HART. The `counter_idx_base`
+and `counter_idx_mask` parameters represent the set of counters. The bit
+definitions of the `stop_flags` parameter are shown in the
+<<table_pmu_counter_stop_flags>> below.
+
+[#table_pmu_counter_stop_flags]
+.PMU Counter Stop Flags
+[cols="3,1,2", width=90%, align="center", options="header"]
+|===
+| Flag Name               | Bits       | Description
+| SBI_PMU_STOP_FLAG_RESET | 0:0        | Reset the counter to event mapping.
+| *RESERVED*              | 1:(XLEN-1) | All non-zero values are reserved
+                                         for future use
+|===
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_stop_errors>> below.
+
+[#table_pmu_counter_stop_errors]
+.PMU Counter Stop Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code              | Description
+| SBI_SUCCESS             | counter stopped successfully.
+| SBI_ERR_INVALID_PARAM   | some of the counters specified in parameters
+                            are invalid.
+| SBI_ERR_ALREADY_STOPPED | some of the counters specified in parameters
+                            are already stopped.
+|===
+
+=== Function: Read a firmware counter (FID #5)
+
+[source, C]
+----
+struct sbiret sbi_pmu_counter_fw_read(unsigned long counter_idx)
+----
+
+Provide the current value of a firmware counter in `sbiret.value`.
+
+The possible error codes returned in `sbiret.error` are shown in the
+<<table_pmu_counter_fw_read_errors>> below.
+
+[#table_pmu_counter_fw_read_errors]
+.PMU Counter Firmware Read Errors
+[cols="2,3", width=90%, align="center", options="header"]
+|===
+| Error code            | Description
+| SBI_SUCCESS           | firmware counter read successfully.
+| SBI_ERR_INVALID_PARAM | `counter_idx` points to a hardware counter
+                          or an invalid counter.
+|===
+
+=== Function Listing
+
+[#table_pmu_function_list]
+.PMU Function List
+[cols="5,2,1,2", width=80%, align="center", options="header"]
+|===
+| Function Name                   | SBI Version | FID | EID
+| sbi_pmu_num_counters            | 0.3         | 0   | 0x504D55
+| sbi_pmu_counter_get_info        | 0.3         | 1   | 0x504D55
+| sbi_pmu_counter_config_matching | 0.3         | 2   | 0x504D55
+| sbi_pmu_counter_start           | 0.3         | 3   | 0x504D55
+| sbi_pmu_counter_stop            | 0.3         | 4   | 0x504D55
+| sbi_pmu_counter_fw_read         | 0.3         | 5   | 0x504D55
+|===
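
For reference, the function list above could be mirrored in a header like
the one below. The FID and EID values are taken from the table; the macro
and enumerator names are illustrative only. The static assertion simply
checks the arithmetic fact that 0x504D55 is the ASCII encoding of "PMU".

```c
/* Extension ID from the PMU Function List table (name is illustrative). */
#define SBI_EXT_PMU 0x504D55UL

/* Function IDs from the PMU Function List table (names are illustrative). */
enum sbi_ext_pmu_fid {
	SBI_EXT_PMU_NUM_COUNTERS = 0,
	SBI_EXT_PMU_COUNTER_GET_INFO = 1,
	SBI_EXT_PMU_COUNTER_CFG_MATCH = 2,
	SBI_EXT_PMU_COUNTER_START = 3,
	SBI_EXT_PMU_COUNTER_STOP = 4,
	SBI_EXT_PMU_COUNTER_FW_READ = 5,
};

/* 0x50 = 'P', 0x4D = 'M', 0x55 = 'U'. */
_Static_assert(SBI_EXT_PMU == (('P' << 16) | ('M' << 8) | 'U'),
	       "PMU EID should spell \"PMU\" in ASCII");
```
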
+
 == Experimental SBI Extension Space (EIDs #0x08000000 - #0x08FFFFFF)
 
 No management.
--
Regards,
Atish