
Re: Proposal v3: SBI PMU Extension

Zong Li
 

On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:



-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 14 July 2020 09:02
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...> wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have HARDWARE
performance counters such as CYCLE CSR, INSTRET CSR, and HPMCOUNTER
CSRs. A RISC-V implementation can support monitoring various HARDWARE
events using a limited number of HPMCOUNTER CSRs.

In addition to HARDWARE performance counters, an SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc.) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.

We propose an SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters, which include CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs and
SOFTWARE counters provided by the SBI implementation.

Using the SBI PMU extension, an SBI implementation (OpenSBI, KVM, or
Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters
and events to S-mode (or VS-mode) software.

To define the SBI PMU extension, we first define counter_idx, which is a
logical number assigned to a counter, and event_idx, which is an encoded
number representing the HARDWARE/SOFTWARE event to be monitored. A
HARDWARE/SOFTWARE event can also have additional configuration/details
referred to as event_info.

The SBI PMU event_idx is a 20-bit wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE
event, the event_info is not required whereas the event_idx.code can
be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is not required whereas the event_idx.code
is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
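
To illustrate the encoding above, here is a minimal C sketch that packs type and code into an event_idx and builds the code field of a HARDWARE CACHE event. The macro and helper names are hypothetical and not part of the proposal; only the bit positions come from the text above.

```c
#include <stdint.h>

/* Bit layout from the proposal: event_idx[19:16] = type, [15:0] = code. */
#define SBI_PMU_EVENT_TYPE_SHIFT    16
#define SBI_PMU_EVENT_CODE_MASK     0xffffu
#define SBI_PMU_EVENT_TYPE_HW_CACHE 0x1u

/* Hypothetical helper: pack a 4-bit type and 16-bit code into event_idx. */
static inline uint32_t sbi_pmu_event_idx(uint32_t type, uint32_t code)
{
    return ((type & 0xfu) << SBI_PMU_EVENT_TYPE_SHIFT) |
           (code & SBI_PMU_EVENT_CODE_MASK);
}

/* Hypothetical helper: code[15:3] = cache_id, [2:1] = op_id, [0] = result_id. */
static inline uint32_t sbi_pmu_cache_code(uint32_t cache_id, uint32_t op_id,
                                          uint32_t result_id)
{
    return ((cache_id & 0x1fffu) << 3) | ((op_id & 0x3u) << 1) |
           (result_id & 0x1u);
}
```

For example, an L1D read miss (cache_id 0, op_id 0, result_id 1) would encode as event_idx 0x10001 under this layout.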

If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_SET_EVENT call (described below)
will have the RAW event value to be programmed in MHPMEVENT CSR (i.e.
the SBI implementation will not derive MHPMEVENT CSR value from
event_idx + event_info).

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is not required whereas the event_idx.code can
be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};

In future, more events can be defined without breaking compatibility
of SBI calls.

Using definition of counter_idx and event_idx, we can potentially have
the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
It will provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
HARDWARE counter. This SBI call will fail for counters which are not
present.
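
A small sketch of how S-mode software might decode this return value, assuming the field layout stated above. The struct and function names are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value that marks a SOFTWARE counter, per the text above. */
#define SBI_PMU_CSR_SOFTWARE 0xfffu

/* Hypothetical container for the decoded fields. */
struct sbi_pmu_csr_info {
    uint16_t csr_number;  /* return_value[11:0]  */
    uint8_t  csr_width;   /* return_value[19:12] */
    bool     is_software; /* true when csr_number == 0xfff */
};

static inline struct sbi_pmu_csr_info sbi_pmu_decode_csr(unsigned long ret)
{
    struct sbi_pmu_csr_info info;

    info.csr_number = (uint16_t)(ret & 0xfffu);
    info.csr_width = (uint8_t)((ret >> 12) & 0xffu);
    info.is_software = (info.csr_number == SBI_PMU_CSR_SOFTWARE);
    return info;
}
```

With this layout, a 64-bit wide counter backed by the CYCLE CSR (number 0xC00) would be reported as 0x40c00.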

3. SBI_PMU_COUNTER_SET_EVENT
This call takes three parameters:
1) counter_idx
2) event_idx
3) event_info
It will select an event to be monitored by given counter. If this
SBI call is not used for a counter to select an event then the
counter will monitor the default event selected for it at boot-time.
This SBI call will fail for counters which are not present. It will
also fail if specified event_idx + event_info combination is not
supported by given counter.

It also seems to fail if the specified event is not supported by the given
counter, right? Then the Linux driver could try to allocate the next free
counter when this SBI call returns failure.

Yes, this call will fail if the event_idx + event_info combination is not
supported by the given counter_idx. It is expected that the Linux driver will
try another free counter if the SBI_PMU_COUNTER_SET_EVENT call fails. I have
suggested a few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT calls by
looking at the CSR number assigned to a counter.

Could it put the bitmap of counters which support the given event into
ret.value ($a1) if it fails for the given counter? Then S-mode software can
conveniently find which counter to try next for the event. But there is a
constraint that the maximum number of counters would have to be assumed to be
less than XLEN. Do you think it is feasible?
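
To make the bitmap idea concrete, here is a rough sketch of what the S-mode side could do with such a bitmap. It assumes bit i of the returned value stands for counter_idx i and that the driver keeps its own in-use bitmap; both the function name and that bookkeeping are hypothetical.

```c
#include <limits.h>

/*
 * Hypothetical helper: return the lowest counter_idx that supports the
 * event (bit set in 'supported') and is not yet allocated (bit clear in
 * 'in_use'), or -1 if there is none.
 */
static inline int pmu_pick_counter(unsigned long supported,
                                   unsigned long in_use)
{
    unsigned long candidates = supported & ~in_use;
    int bits = (int)(sizeof(unsigned long) * CHAR_BIT);

    for (int i = 0; i < bits; i++) {
        if (candidates & (1UL << i))
            return i;
    }
    return -1;
}
```

The driver would then retry the call once with the returned index instead of probing each counter in turn.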



Apart from the question above, this version of the proposal looks great to me.

Cool

Regards,
Anup


Thanks,
Zong


4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8-byte aligned physical address
It will set the physical address of memory location where the SBI
implementation will write the 64-bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform SBI implementation to start/enable specified counter
with specified initial value. This SBI call will fail for counters
which are not present.

6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform SBI implementation to stop/disable specified counters
on the calling HART. This SBI call will fail for counters which are
not present.

The M-mode runtime firmware (OpenSBI) Development Notes:

1. The M-mode runtime firmware will have to translate SBI PMU
event_idx and event_info into platform dependent MHPMEVENT CSR
value before starting/enabling a HARDWARE counter.

2. The M-mode runtime firmware (OpenSBI) will need to know following
platform dependent information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. HPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional for platform. By default, OpenSBI will
write a value <xyz> to MHPMEVENT CSR where lower 20 bits of <xyz>
are event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20
bits of event_info
C) Additional platform-specific programming required for selecting
event_idx + event_info combination. This is also optional for
platform.

3. All platform dependent information mentioned above, can be obtained
by M-mode runtime firmware (OpenSBI) from platform specific code.
The DT/ACPI can also be used to describe 2.A and 2.B mentioned above
but 2.C will always require platform specific code.
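
The default mapping described in 2.B can be sketched as below. This assumes XLEN = 64 and uses a hypothetical function name; it is not taken from OpenSBI sources.

```c
#include <stdint.h>

#define SBI_PMU_EVENT_IDX_BITS 20

/*
 * Hypothetical default mapping for XLEN = 64: the lower 20 bits of the
 * result are event_idx, the upper 44 bits are the lower 44 bits of
 * event_info. Higher bits of event_info are shifted out and dropped.
 */
static inline uint64_t pmu_default_mhpmevent(uint32_t event_idx,
                                             uint64_t event_info)
{
    uint64_t idx_mask = (UINT64_C(1) << SBI_PMU_EVENT_IDX_BITS) - 1;

    return (event_info << SBI_PMU_EVENT_IDX_BITS) | (event_idx & idx_mask);
}
```

A platform that needs a different encoding would replace this with its own translation, as noted in 2.B and 2.C.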

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
The Linux RISC-V driver can be platform driver with "riscv,sbi-pmu"
as DT compatible string and optional "interrupts" DT property. The
"interrupts" DT property if available should specify an edge-triggered
overflow interrupt for each HART. When "interrupts" DT property is
present, we might also need another DT property for mapping HARTID
to entries in "interrupts" DT property. The platform driver probe
will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. If the counter is a SOFTWARE counter then use the
SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
of counter. The driver can skip this in driver probe and instead
do this lazily in add() callback mentioned below.

2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id
based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by add())
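
A possible shape for this translation is sketched below. The PERF_TYPE_* values and the HW_CACHE config layout (cache id in bits 7:0, op in bits 15:8, result in bits 23:16) follow include/uapi/linux/perf_event.h. The struct, the function name, and the choice to take the SOFTWARE event id directly from the low bits of config are assumptions made only for this sketch.

```c
#include <stdint.h>

/* Values as in include/uapi/linux/perf_event.h, repeated to stay standalone. */
#define PERF_TYPE_HARDWARE 0
#define PERF_TYPE_HW_CACHE 3
#define PERF_TYPE_RAW      4

/* Hypothetical result container. */
struct sbi_pmu_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns 0 on success, -1 when the attr cannot be translated. */
static inline int sbi_pmu_translate(uint32_t type, uint64_t config,
                                    struct sbi_pmu_event *out)
{
    out->event_info = 0;
    switch (type) {
    case PERF_TYPE_HARDWARE:
        if (config > 9) /* SBI_PMU_HW_REF_CPU_CYCLES is the last id */
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        return 0;
    case PERF_TYPE_HW_CACHE: {
        uint32_t cache_id = (uint32_t)(config & 0xff);
        uint32_t op_id = (uint32_t)((config >> 8) & 0xff);
        uint32_t result_id = (uint32_t)((config >> 16) & 0xff);

        if (cache_id > 6 || op_id > 2 || result_id > 1)
            return -1;
        out->event_idx = (0x1u << 16) | (cache_id << 3) | (op_id << 1) |
                         result_id;
        return 0;
    }
    case PERF_TYPE_RAW:
        if (config >> 63) {
            /* Assumption: low bits of config carry the sbi_pmu_sw_id. */
            uint64_t sw_id = config & ~(UINT64_C(1) << 63);

            if (sw_id > 11)
                return -1;
            out->event_idx = (0xfu << 16) | (uint32_t)sw_id;
        } else {
            out->event_idx = 0x2u << 16;
            out->event_info = config; /* bit 63 is zero, so config[62:0] */
        }
        return 0;
    default:
        return -1;
    }
}
```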

3. add() callback
The add() callback of Linux RISC-V PMU driver will find a
free counter on current CPU/HART such that the perf_event
event_idx + event_info combination is supported by the counter.
To check-and-set event_idx + event_info combination for a
counter, we will use the SBI_PMU_COUNTER_SET_EVENT call.
The counter allocation and SBI_PMU_COUNTER_SET_EVENT call
can be further optimized by looking at CSR details.
For example:
A) For event_idx.type == 0 and
event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should
prefer counter mapping to CYCLE CSR and skip doing
SBI_PMU_COUNTER_SET_EVENT call.
B) For event_idx.type == 0 and
event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should
prefer counter mapping to INSTRET CSR and skip doing
SBI_PMU_COUNTER_SET_EVENT call.
C) For event_idx.type == 0xf, only prefer counters mapping
to 0xfff CSR (i.e. SOFTWARE counters).
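
The basic check-and-set loop (without the CSR-based shortcuts A to C) could look roughly like this. The names are hypothetical, and demo_set_event() is only a stand-in for a wrapper around the SBI_PMU_COUNTER_SET_EVENT ecall so that the sketch can be exercised without firmware.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical wrapper type: returns 0 on success, non-zero on failure. */
typedef int (*set_event_fn)(unsigned int counter_idx, uint32_t event_idx,
                            uint64_t event_info);

/*
 * Stand-in for the real SBI call: counter 0 only counts cycles
 * (event_idx 0), counter 1 only counts instructions (event_idx 1),
 * every other counter accepts any event.
 */
static int demo_set_event(unsigned int counter_idx, uint32_t event_idx,
                          uint64_t event_info)
{
    (void)event_info;
    if (counter_idx == 0)
        return event_idx == 0x0 ? 0 : -1;
    if (counter_idx == 1)
        return event_idx == 0x1 ? 0 : -1;
    return 0;
}

/*
 * Try each free counter in turn; on the first successful set_event call,
 * mark the counter as used and return its index. Returns -1 if no free
 * counter accepts the event.
 */
static int pmu_alloc_counter(unsigned int num_counters, unsigned long *in_use,
                             uint32_t event_idx, uint64_t event_info,
                             set_event_fn set_event)
{
    const unsigned int max_bits = sizeof(unsigned long) * CHAR_BIT;

    for (unsigned int i = 0; i < num_counters && i < max_bits; i++) {
        if (*in_use & (1UL << i))
            continue;
        if (set_event(i, event_idx, event_info) == 0) {
            *in_use |= 1UL << i;
            return (int)i;
        }
    }
    return -1;
}
```

The del() callback would clear the corresponding bit in in_use again.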

4. del() callback
The del() callback of Linux RISC-V PMU driver will release
or free the counter.

5. start() callback
The start() callback of Linux RISC-V PMU driver will start
the counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop
the counter using the SBI_PMU_COUNTER_STOP call.

Regards,
Anup


Re: Proposal v3: SBI PMU Extension

Anup Patel
 

-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 15 July 2020 07:40
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:



-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 14 July 2020 09:02
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...;
gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...>
wrote:

Could it put the bitmap of counters which support the given event into
ret.value ($a1) if it fails for the given counter? Then S-mode software can
conveniently find which counter to try next for the event. But there
is a constraint that the maximum number of counters would have to be assumed
to be less than XLEN. Do you think it is feasible?

Yes, this is feasible and can further reduce SBI calls, but if we go this
route then the SBI_PMU_COUNTER_SET_EVENT name is not appropriate.

How about this?

3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter from a set of counters which can
monitor the specified event. The counter_idx_base and counter_idx_mask
parameters represent the set of counters whereas the event_idx and
event_info represent the event to monitor. Upon success the SBI call
will return the counter_idx of the counter which has been configured
to monitor the specified event. This SBI call will fail if it is unable to find
a counter which can monitor the specified event. It will also fail if the set of
counters specified via counter_idx_base and counter_idx_mask
has an invalid counter.
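
A rough firmware-side sketch of the matching step follows. It assumes that bit i of counter_idx_mask selects counter (counter_idx_base + i), which the text above does not spell out. The function and callback names are hypothetical, demo_can_monitor() is a made-up capability table, and the actual programming of the chosen counter is left out. The error values follow the standard SBI error codes.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define SBI_ERR_NOT_SUPPORTED (-2)
#define SBI_ERR_INVALID_PARAM (-3)

/* Hypothetical platform hook: can this counter monitor this event? */
typedef bool (*can_monitor_fn)(unsigned long counter_idx, uint32_t event_idx,
                               uint64_t event_info);

/* Made-up capabilities: counter 0 = cycles only, 2 = instructions only,
 * 1 = nothing, counters 3 and up = any event. */
static bool demo_can_monitor(unsigned long counter_idx, uint32_t event_idx,
                             uint64_t event_info)
{
    (void)event_info;
    if (counter_idx == 0)
        return event_idx == 0x0;
    if (counter_idx == 1)
        return false;
    if (counter_idx == 2)
        return event_idx == 0x1;
    return true;
}

static long pmu_config_matching(unsigned long num_counters,
                                unsigned long base, unsigned long mask,
                                uint32_t event_idx, uint64_t event_info,
                                can_monitor_fn can_monitor)
{
    const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;

    /* First pass: fail if the set names a counter that does not exist. */
    for (unsigned long bit = 0; bit < bits; bit++) {
        if (!(mask & (1UL << bit)))
            continue;
        if (base >= num_counters || bit >= num_counters - base)
            return SBI_ERR_INVALID_PARAM;
    }
    /* Second pass: return the first counter in the set that matches. */
    for (unsigned long bit = 0; bit < bits; bit++) {
        if ((mask & (1UL << bit)) &&
            can_monitor(base + bit, event_idx, event_info))
            return (long)(base + bit);
    }
    return SBI_ERR_NOT_SUPPORTED;
}
```

With this shape the Linux add() callback needs a single SBI call per event, passing the set of currently free counters.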




Apart from this question above, this version of the proposal is great to
me.

Cool

Regards,
Anup


Thanks,
Zong



Regards,
Anup


Re: Proposal v3: SBI PMU Extension

Greg Favor
 

On Tue, Jul 14, 2020 at 9:32 PM Anup Patel <Anup.Patel@...> wrote:
Yes, this is feasible and can further reduce SBI calls but if we go this
route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.

Why not have SBI_PMU_COUNTER_SET_EVENT return a 32b hpmcounter bit mask when it fails (that - as Zong suggested - identifies hardware counters that do support the requested event)?  Then SBI_PMU_COUNTER_SET_EVENT doesn't need to change.  The caller can simply call it again with a different counter_idx that it knows should succeed.

Or have SBI_PMU_COUNTER_SET_EVENT take a boolean argument indicating whether to try and set up just the specified counter, or to examine all counters and try to pick one that supports the requested event?  Then no bit mask ever needs to be returned.

Also, in any case, I assume the SBI routine needs to take the 'mcounteren' CSR into account and only set up a counter that mcounteren makes available to lower privilege modes?  (And any returned bit mask would also reflect only counters that have their mcounteren bits set to '1'.)
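
As a small sketch of that filtering, assuming the standard mcounteren layout (bit 0 = CY, bit 1 = TM, bit 2 = IR, bits 3 to 31 = HPM3 to HPM31) and a hypothetical helper name:

```c
#include <stdint.h>

/* Bits 3..31 of mcounteren correspond to hpmcounter3..hpmcounter31. */
#define MCOUNTEREN_HPM_MASK 0xfffffff8u

/*
 * Hypothetical helper: from the hpmcounters that support the requested
 * event, keep only those that mcounteren exposes to lower privilege modes.
 */
static inline uint32_t pmu_usable_hpm_mask(uint32_t supports_event,
                                           uint32_t mcounteren)
{
    return supports_event & mcounteren & MCOUNTEREN_HPM_MASK;
}
```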

Greg
 


Re: Proposal v3: SBI PMU Extension

Greg Favor
 

Anup,

What is the plan with regards to scounteren and hcounteren?  Is the caller (whether an OS or a hypervisor) supposed to take into account the relevant *counteren CSR's when specifying counter_idx in the call to SBI_PMU_COUNTER_SET_EVENT?  And the M-mode SBI_PMU_COUNTER_SET_EVENT routine only worries about mcounteren?

It seems like this is necessary since the M-mode SBI_PMU_COUNTER_SET_EVENT routine won't know which privilege mode was the original requester (and hence which other *counteren CSR's are relevant).

Greg




Re: Proposal v3: SBI PMU Extension

Zong Li
 

On Wed, Jul 15, 2020 at 12:32 PM Anup Patel <Anup.Patel@...> wrote:



-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 15 July 2020 07:40
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...> wrote:



-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 14 July 2020 09:02
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...;
gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...>
wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension but we do have
HARDWARE
performance counters such as CYCLE CSR, INSTRET CSR, and
HPMCOUNTER CSRs. A RISC-V implementation can support monitoring
various HARDWARE events using limited number of HPMCOUNTER
CSRs.

In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of
misaligned load/store instructions, number of illegal instructions, etc.

We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters.
The
SBI
PMU extension will only manage per-HART (or per-CPU)
HARDWARE/SOFTWARE
counters which include CYCLE CSR, INSTRET CSR, HPMCOUNTER CSRs
and
SOFTWARE counters provided by SBI implementation.

Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or
Xvisor) will provide a standardized view of HARDWARE/SOFTWARE
counters
and events to S-mode (or VS-mode) software.

To define SBI PMU extension, we first define counter_idx which is
a logical number assigned to a counter and event_idx which is an
encoded number representing the HARDWARE/SOFTWARE event to be
monitored.
A
HARDWARE/SOFTWARE event can also have additional
configuration/details
referred to as event_info.

The SBI PMU event_idx is a 20bits wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE
event, the event_info is not required whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
SBI_PMU_HW_CPU_CYCLES = 0,
SBI_PMU_HW_INSTRUCTIONS = 1,
SBI_PMU_HW_CACHE_REFERENCES = 2,
SBI_PMU_HW_CACHE_MISSES = 3,
SBI_PMU_HW_BRANCH_INSTRUCTIONS = 4,
SBI_PMU_HW_BRANCH_MISSES = 5,
SBI_PMU_HW_BUS_CYCLES = 6,
SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
SBI_PMU_HW_STALLED_CYCLES_BACKEND = 8,
SBI_PMU_HW_REF_CPU_CYCLES = 9,
SBI_PMU_HW_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For
HARDWARE CACHE event, the event_info is not required whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
SBI_PMU_HW_CACHE_L1D = 0,
SBI_PMU_HW_CACHE_L1I = 1,
SBI_PMU_HW_CACHE_LL = 2,
SBI_PMU_HW_CACHE_DTLB = 3,
SBI_PMU_HW_CACHE_ITLB = 4,
SBI_PMU_HW_CACHE_BPU = 5,
SBI_PMU_HW_CACHE_NODE = 6,
SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
SBI_PMU_HW_CACHE_OP_READ = 0,
SBI_PMU_HW_CACHE_OP_WRITE = 1,
SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
SBI_PMU_HW_CACHE_OP_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
SBI_PMU_HW_CACHE_RESULT_MISS = 1,
SBI_PMU_HW_CACHE_RESULT_MAX, /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
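
To make the HARDWARE CACHE code layout concrete, here is a small C
sketch. The function name is hypothetical; the field positions
(cache_id in [15:3], op_id in [2:1], result_id in [0:0]) are taken
from the text above.

```c
#include <stdint.h>

/* Hypothetical helper: build event_idx.code for a HARDWARE CACHE event. */
static inline uint32_t sbi_pmu_hw_cache_code(uint32_t cache_id,
                                             uint32_t op_id,
                                             uint32_t result_id)
{
    return ((cache_id & 0x1fffu) << 3) |   /* code[15:3] */
           ((op_id & 0x3u) << 1) |         /* code[2:1]  */
           (result_id & 0x1u);             /* code[0:0]  */
}
```

With the enum values listed above, DTLB (3) + OP_WRITE (1) +
RESULT_MISS (1) would give code 27 (0x1b), i.e. event_idx 0x1001b.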

If event_idx.type == 0x2 then it is HARDWARE RAW event. For
HARDWARE RAW event, the event_idx.code should be zero and the
event_info parameter passed to SBI_PMU_COUNTER_SET_EVENT call
(described below) will have the RAW event value to be programmed in
MHPMEVENT CSR (i.e. the SBI implementation will not derive MHPMEVENT
CSR value from event_idx + event_info).

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is not required whereas the event_idx.code
can be one of the following:
enum sbi_pmu_sw_id {
SBI_PMU_SW_MISALIGNED_LOAD = 0,
SBI_PMU_SW_MISALIGNED_STORE = 1,
SBI_PMU_SW_ILLEGAL_INSN = 2,
SBI_PMU_SW_LOCAL_SET_TIMER = 3,
SBI_PMU_SW_LOCAL_IPI = 4,
SBI_PMU_SW_LOCAL_FENCE_I = 5,
SBI_PMU_SW_LOCAL_SFENCE_VMA = 6,
SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID = 7,
SBI_PMU_SW_LOCAL_HFENCE_GVMA = 8,
SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
SBI_PMU_SW_LOCAL_HFENCE_VVMA = 10,
SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
SBI_PMU_SW_MAX, /* non-ABI */
};

In future, more events can be defined without breaking compatibility
of the SBI calls.

Using definition of counter_idx and event_idx, we can potentially
have the following SBI calls:

1. SBI_PMU_NUM_COUNTERS
This call will return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
It will provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
HARDWARE counter. This SBI call will fail for counters which are not
present.

3. SBI_PMU_COUNTER_SET_EVENT
This call takes three parameters:
1) counter_idx
2) event_idx
3) event_info
It will select an event to be monitored by given counter. If this
SBI call is not used for a counter to select an event then the
counter will monitor default event selected for it at boot-time.
This SBI call will fail for counters which are not present. It will
also fail if specified event_idx + event_info combination is not
supported by given counter.
It also seems to fail if the specified event is not supported by the
given counter, right? Then the Linux driver could try to allocate the
next free counter when this SBI call returns failure.

Yes, this call will fail if event_idx + event_info combination is not
supported by given counter_idx. It is expected that Linux driver will
try another free counter if SBI_PMU_COUNTER_SET_EVENT call fails. I
have suggested a few ideas on how to reduce SBI_PMU_COUNTER_SET_EVENT
calls by looking at the CSR number assigned to a counter.

Could it put a bitmap of the counters which support the given event into
ret.value ($a1) if it fails for the given counter? Then S-mode software can
conveniently find which counter to try next for the event. But there
is a constraint: the maximum number of counters would have to be assumed to be
less than XLEN. Do you think it is feasible?

Yes, this is feasible and can further reduce SBI calls but if we go this
route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.

How about this ??

3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter from a set of counters which can
monitor specified event. The counter_idx_base and counter_idx_mask
parameters represent the set of counters whereas the event_idx and
event_info represent the event to monitor. Upon success the SBI call
will return the counter_idx of the counter which has been configured
to monitor specified event. This SBI call will fail if it is unable to find
a counter which can monitor specified event. It will also fail if the set of
counters specified via counter_idx_base and counter_idx_mask
contains an invalid counter.
It looks like the m-mode firmware would be responsible for selecting
a suitable counter and taking the allocation work for s-mode software?
Could you elaborate or give an example to show how counter_idx_base
and counter_idx_mask represent the set of counters? It seems like
the set of counters can be represented by one parameter, with each bit
corresponding to one counter_idx.




Apart from this question above, this version of the proposal is great to
me.

Cool

Regards,
Anup


Thanks,
Zong


4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8byte aligned physical address
It will set the physical address of memory location where the SBI
implementation will write the 64bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform SBI implementation to start/enable specified counter
with specified initial value. This SBI call will fail for counters
which are not present.

6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform SBI implementation to stop/disable specified counters
on the calling HART. This SBI call will fail for counters which are
not present.
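
The return-value encoding of SBI_PMU_COUNTER_GET_CSR described above
could be decoded by S-mode software roughly as follows. This is only
a sketch: the struct and function names are hypothetical, while the
field positions and the 0xfff marker for SOFTWARE counters come from
the proposal text.

```c
#include <stdint.h>

/* Hypothetical decoded view of the SBI_PMU_COUNTER_GET_CSR return value. */
struct sbi_pmu_csr_info {
    uint32_t csr_number;  /* return_value[11:0]  */
    uint32_t csr_width;   /* return_value[19:12], bits implemented in HW */
    int      is_software; /* 1 if CSR_Number == 0xfff */
};

static inline struct sbi_pmu_csr_info
sbi_pmu_decode_csr(unsigned long return_value)
{
    struct sbi_pmu_csr_info info;

    info.csr_number  = (uint32_t)(return_value & 0xfffu);
    info.csr_width   = (uint32_t)((return_value >> 12) & 0xffu);
    info.is_software = (info.csr_number == 0xfffu);
    return info;
}
```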

The M-mode runtime firmware (OpenSBI) Development Notes:

1. The M-mode runtime firmware will have to translate SBI PMU
event_idx and event_info into platform dependent MHPMEVENT CSR
value before starting/enabling a HARDWARE counter.

2. The M-mode runtime firmware (OpenSBI) will need to know following
platform dependent information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. HPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional for platform. By default, OpenSBI will
write a value <xyz> to MHPMEVENT CSR where lower 20bits of <xyz>
are event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20
bits of event_info
C) Additional platform-specific programming required for selecting
event_idx + event_info combination. This is also optional for
platform.

3. All platform dependent information mentioned above, can be obtained
by M-mode runtime firmware (OpenSBI) from platform specific code.
The DT/ACPI can also be used to describe 2.A and 2.B mentioned above
but 2.C will always require platform specific code.
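
The default MHPMEVENT value described in 2.B above (when the platform
provides no mapping) can be sketched as below. The function name and
the explicit xlen argument are assumptions made for illustration; a
real firmware would know XLEN at build time. This applies to
HARDWARE/CACHE events only, since for RAW events the proposal says
event_info is used directly.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compose the default <xyz> value.
 * Lower 20 bits   = event_idx
 * Upper XLEN-20   = lower XLEN-20 bits of event_info
 * xlen is assumed to be 32 or 64.
 */
static inline uint64_t sbi_pmu_default_mhpmevent(unsigned xlen,
                                                 uint32_t event_idx,
                                                 uint64_t event_info)
{
    uint64_t info_mask = (1ULL << (xlen - 20)) - 1;

    return ((event_info & info_mask) << 20) | (event_idx & 0xfffffu);
}
```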

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
The Linux RISC-V driver can be platform driver with "riscv,sbi-pmu"
as DT compatible string and optional "interrupts" DT property. The
"interrupts" DT property if available should specify an edge-triggered
overflow interrupt for each HART. When "interrupts" DT property is
present, we might also need another DT property for mapping HARTID
to entries in "interrupts" DT property. The platform driver probe
will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. If the counter is a SOFTWARE counter then use the
SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
of counter. The driver can skip this in driver probe and instead
do this lazily in add() callback mentioned below.

2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum
sbi_pmu_hw_cache_op_result_id based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by the add() callback)

3. add() callback
The add() callback of Linux RISC-V PMU driver will find a
free counter on current CPU/HART such that the perf_event
event_idx + event_info combination is supported by the counter.
To check-and-set event_idx + event_info combination for a
counter, we will use the SBI_PMU_COUNTER_SET_EVENT call.
The counter allocation and SBI_PMU_COUNTER_SET_EVENT call
can be further optimized by looking at CSR details.
For example:
A) For event_idx.type == 0 and
event_idx.code == SBI_PMU_HW_CPU_CYCLES, we should
prefer counter mapping to CYCLE CSR and skip doing
SBI_PMU_COUNTER_SET_EVENT call.
B) For event_idx.type == 0 and
event_idx.code == SBI_PMU_HW_INSTRUCTIONS, we should
prefer counter mapping to INSTRET CSR and skip doing
SBI_PMU_COUNTER_SET_EVENT call.
C) For event_idx.type == 0xf, only prefer counters mapping
to 0xfff CSR (i.e. SOFTWARE counters).

4. del() callback
The del() callback of Linux RISC-V PMU driver will release
or free the counter.

5. start() callback
The start() callback of Linux RISC-V PMU driver will start
the counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop
the counter using the SBI_PMU_COUNTER_STOP call.
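
The event_init() translation in item 2 above could look roughly like
the following C sketch. It is not driver code: the EX_/ex_ names are
hypothetical, the perf type values are copied from
<linux_source>/include/uapi/linux/perf_event.h so the sketch stands
alone, and the HW_CACHE config layout (cache_id | op_id << 8 |
result_id << 16) is the usual perf one. Taking the SOFTWARE code from
config[15:0] is an assumption; the proposal does not say which bits
carry it.

```c
#include <stdint.h>

/* Same values as the perf uapi header, under hypothetical EX_ names. */
enum {
    EX_PERF_TYPE_HARDWARE = 0,
    EX_PERF_TYPE_HW_CACHE = 3,
    EX_PERF_TYPE_RAW      = 4,
};

struct ex_sbi_pmu_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns 0 on success, -1 if no event_idx/event_info can be derived. */
static int ex_event_init_translate(uint32_t type, uint64_t config,
                                   struct ex_sbi_pmu_event *out)
{
    uint64_t cache_id, op_id, result_id, code;

    switch (type) {
    case EX_PERF_TYPE_HARDWARE:            /* case A: type 0x0 */
        if (config >= 10)                  /* SBI_PMU_HW_MAX */
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        out->event_info = 0;
        return 0;
    case EX_PERF_TYPE_HW_CACHE:            /* case B: type 0x1 */
        cache_id  = config & 0xff;
        op_id     = (config >> 8) & 0xff;
        result_id = (config >> 16) & 0xff;
        if (cache_id >= 7 || op_id >= 3 || result_id >= 2)
            return -1;
        out->event_idx = (0x1u << 16) |
            (uint32_t)((cache_id << 3) | (op_id << 1) | result_id);
        out->event_info = 0;
        return 0;
    case EX_PERF_TYPE_RAW:
        if (config >> 63) {                /* case D: SOFTWARE, type 0xf */
            code = config & 0xffff;        /* assumption: code in [15:0] */
            if (code >= 12)                /* SBI_PMU_SW_MAX */
                return -1;
            out->event_idx = (0xfu << 16) | (uint32_t)code;
            out->event_info = 0;
        } else {                           /* case C: HARDWARE RAW, type 0x2 */
            out->event_idx = 0x2u << 16;
            out->event_info = config;      /* config[62:0]; bit 63 is 0 */
        }
        return 0;
    default:
        return -1;
    }
}
```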

Regards,
Anup


Re: Proposal v3: SBI PMU Extension

Anup Patel
 

Hi Greg,

 

The SBI PMU extension provider for HS-mode is M-mode runtime firmware (OpenSBI) and for VS-mode the provider is HS-mode (Hypervisor).

 

We will enable HARDWARE counters in HCOUNTEREN CSR when hypervisor receives SBI_PMU_COUNTER_START call from VS-mode.

 

The standard interface between Linux user-space and Linux kernel-space is perf SYSCALLs/IOCTLs. Despite this, we can support apps that want direct HARDWARE CSR access by enabling the HARDWARE counter in SCOUNTEREN CSR. The Linux RISC-V PMU driver can do this in the start() callback.
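
As a rough illustration of the *COUNTEREN handling mentioned above:
in the RISC-V privileged spec, bit N of HCOUNTEREN/SCOUNTEREN gates
access to the user-level counter CSR at address 0xC00 + N (CYCLE =
0xC00, INSTRET = 0xC02, HPMCOUNTER3..31 = 0xC03..0xC1F). The helper
below is hypothetical and assumes the CSR_Number reported by
SBI_PMU_COUNTER_GET_CSR is such a user-level CSR address.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the *COUNTEREN bit mask to set for a
 * counter, or 0 if the counter is not backed by a user-level counter
 * CSR (e.g. a SOFTWARE counter reported as 0xfff).
 */
static inline uint32_t ex_counteren_mask(uint32_t csr_number)
{
    if (csr_number < 0xC00u || csr_number > 0xC1Fu)
        return 0;
    return 1u << (csr_number - 0xC00u);
}
```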

 

Regards,

Anup

 

From: tech-unixplatformspec@... <tech-unixplatformspec@...> On Behalf Of Greg Favor
Sent: 15 July 2020 11:08
To: Greg Favor <gfavor@...>
Cc: Anup Patel <Anup.Patel@...>; Zong Li <zong.li@...>; tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; andrew@...
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU Extension

 

Anup,

 

What is the plan with regards to scounteren and hcounteren?  Is the caller (whether an OS or a hypervisor) supposed to take into account the relevant *counteren CSR's when specifying counter_idx in the call to SBI_PMU_COUNTER_SET_EVENT?  And the M-mode SBI_PMU_COUNTER_SET_EVENT routine only worries about mcounteren?

 

It seems like this is necessary since the M-mode SBI_PMU_COUNTER_SET_EVENT routine won't know which privilege mode was the original requester (and hence which other *counteren CSR's are relevant).

 

Greg

 

 

On Tue, Jul 14, 2020 at 10:31 PM Greg Favor via lists.riscv.org <gfavor=ventanamicro.com@...> wrote:

On Tue, Jul 14, 2020 at 9:32 PM Anup Patel <Anup.Patel@...> wrote:

Yes, this is feasible and can further reduce SBI calls but if we go this
route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.

 

Why not have SBI_PMU_COUNTER_SET_EVENT return a 32b hpmcounter bit mask when it fails (that - as Zong suggested - identifies hardware counters that do support the requested event)?  Then SBI_PMU_COUNTER_SET_EVENT doesn't need to change.  The caller can simply call it again with a different counter_idx that it knows should succeed.

 

Or have SBI_PMU_COUNTER_SET_EVENT take a boolean argument indicating whether to try and set up just the specified counter, or to examine all counters and try to pick one that supports the requested event?  Then no bit mask ever needs to be returned.

 

Also, in any case, I assume the SBI routine needs to take the 'mcounteren' CSR into account and only set up a counter that mcounteren makes available to lower privilege modes?  (And any returned bit mask would also reflect only counters that have their mcounteren bits set to '1'.)

 

Greg

 

How about this ??

3. SBI_PMU_COUNTER_CONFIG_MATCHING
   This call takes four parameters:
      1) counter_idx_base
      2) counter_idx_mask
      3) event_idx
      4) event_info
   It will find and configure a counter from a set of counters which can
   monitor specified event. The counter_idx_base and counter_idx_mask
   parameters represent the set of counters whereas the event_idx and
   event_info represent the event to monitor. Upon success the SBI call
   will return the counter_idx of the counter which has been configured
   to monitor specified event.  This SBI call will fail if it is unable to find
   a counter which can monitor specified event. It will also fail if the set of
   counters specified via counter_idx_base and counter_idx_mask
   contains an invalid counter.


Re: Proposal v3: SBI PMU Extension

Anup Patel
 

-----Original Message-----
From: tech-unixplatformspec@... <tech-
unixplatformspec@...> On Behalf Of Zong Li
Sent: 15 July 2020 13:36
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...; gfavor@...
Subject: Re: [RISC-V] [tech-unixplatformspec] Proposal v3: SBI PMU
Extension

On Wed, Jul 15, 2020 at 12:32 PM Anup Patel <Anup.Patel@...> wrote:



-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 15 July 2020 07:40
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...;
gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Tue, Jul 14, 2020 at 11:40 AM Anup Patel <Anup.Patel@...>
wrote:



-----Original Message-----
From: Zong Li <zong.li@...>
Sent: 14 July 2020 09:02
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra
<Atish.Patra@...>; andrew@...;
gfavor@...
Subject: Re: Proposal v3: SBI PMU Extension

On Mon, Jul 13, 2020 at 9:12 PM Anup Patel <Anup.Patel@...>
wrote:

3. SBI_PMU_COUNTER_SET_EVENT
This call takes three parameters:
1) counter_idx
2) event_idx
3) event_info
It will select an event to be monitored by given counter. If this
SBI call is not used for a counter to select an event then the
counter will monitor default event selected for it at boot-time.
This SBI call will fail for counters which are not present. It will
also fail if specified event_idx + event_info combination is not
supported by given counter.
It also seems to fail if the specified event is not supported by
the given counter, right? Then the Linux driver could try to
allocate the next free counter when this SBI call returns failure.

Yes, this call will fail if event_idx + event_info combination is
not supported by given counter_idx. It is expected that Linux
driver will try another free counter if SBI_PMU_COUNTER_SET_EVENT
call fails. I have suggested a few ideas on how to reduce
SBI_PMU_COUNTER_SET_EVENT calls by looking at the CSR number
assigned to a counter.

Could it put a bitmap of the counters which support the given event
into ret.value ($a1) if it fails for the given counter? Then S-mode
software can conveniently find which counter to try next for
the event. But there is a constraint: the maximum number of
counters would have to be assumed to be less than XLEN. Do you think it is
feasible?

Yes, this is feasible and can further reduce SBI calls but if we go
this route then SBI_PMU_COUNTER_SET_EVENT name is not appropriate.

How about this ??

3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
It will find and configure a counter from a set of counters which can
monitor specified event. The counter_idx_base and counter_idx_mask
parameters represent the set of counters whereas the event_idx and
event_info represent the event to monitor. Upon success the SBI call
will return the counter_idx of the counter which has been configured
to monitor specified event. This SBI call will fail if it is unable to find
a counter which can monitor specified event. It will also fail if the set of
counters specified via counter_idx_base and counter_idx_mask
contains an invalid counter.
It looks like the m-mode firmware would be responsible for selecting a
suitable counter and taking the allocation work for s-mode software?
The allocation work of counters will still be owned by S-mode software.

Only selecting a counter from a set of available counters based on event
to be monitored will be done by SBI_PMU_COUNTER_CONFIG_MATCHING.

In fact, SBI_PMU_COUNTER_CONFIG_MATCHING is equivalent to
SBI_PMU_COUNTER_SET_EVENT if we pass "counter_idx_mask = 0x1"
and "counter_idx_base = counter_idx"

Could you elaborate or give an example to show how counter_idx_base and
counter_idx_mask represent the set of counters? It seems like the set of
Let's say Linux RISC-V PMU driver is tracking free/available counters
for each HART using a per-HART bitmap. Now let's assume that on
HART X we have free counters 3,6,8,9,13,20, ... and some user-space
app creates perf_event on HART X.

In this case, with SBI_PMU_COUNTER_SET_EVENT call we will have to
try each available counter one-by-one to find counter that supports
required event_idx + event_info combination in add() callback of
Linux RISC-V PMU driver.

Instead of this, using SBI_PMU_COUNTER_CONFIG_MATCHING call
we let SBI implementation select a matching counter from a set of
available counters.

counters can be represented by one parameter, with each bit corresponding to
one counter_idx.
We have to consider both RV32 and RV64 here. On RV32, XLEN = 32 so
having only one parameter will limit us to 32 counters. This is too strict
for RV32 and we don't have any room for SOFTWARE counters.

Let's not limit number of counters by XLEN.

The counter_idx_base and counter_idx_mask approach is similar to
"hart_mask_base" and "hart_mask" parameters of SBI_SEND_IPI call.
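
To illustrate the base + mask representation, here is a minimal C
sketch. It assumes that bit i of counter_idx_mask selects counter
counter_idx_base + i (analogous to the hart mask convention mentioned
above) and uses a 32-bit mask as on RV32; the function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a 32-bit counter_idx_mask relative to
 * `base` from a list of counter indices. Counters outside the window
 * [base, base + 31] are skipped and need another call with a
 * different base.
 */
static uint32_t ex_counter_idx_mask32(const unsigned *counters, size_t n,
                                      unsigned base)
{
    uint32_t mask = 0;

    for (size_t i = 0; i < n; i++) {
        if (counters[i] >= base && counters[i] - base < 32)
            mask |= (uint32_t)1 << (counters[i] - base);
    }
    return mask;
}

/* Check whether counter_idx belongs to the set described by base + mask. */
static int ex_counter_in_set(unsigned base, uint32_t mask,
                             unsigned counter_idx)
{
    return counter_idx >= base && counter_idx - base < 32 &&
           ((mask >> (counter_idx - base)) & 1u);
}
```

For the free counters 3, 6, 8, 9, 13, 20 from the example above,
choosing counter_idx_base = 3 gives counter_idx_mask = 0x20469. A
single counter is selected with counter_idx_mask = 0x1 and
counter_idx_base = counter_idx.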

Regards,
Anup





Apart from this question above, this version of the proposal is
great to
me.

Cool

Regards,
Anup


Thanks,
Zong




Re: [RISC-V] [tech-config] Profiles and Config and Device Tree

Arun Thomas
 

[Adding Software and Platform]

 

I agree the Profiles TG will need to work closely with the Config and Platform TGs.

 

Mark, I think it might be helpful if you (and maybe Krste also) could create a short description of the high-level goals/use cases for Profiles for a future TSC/TC discussion. Profiles cut across many of the TGs, and I think many are still hazy on what’s involved. I expect we will have an easier time recruiting a Profiles TG Chair if the goals are clearly defined.

 

Best,

Arun

 

From: <tech-config@...> on behalf of mark <markhimelstein@...>
Reply-To: "tech-config@..." <tech-config@...>
Date: Wednesday, July 29, 2020 at 10:40 PM
To: config <tech-config@...>, RISC-V SW Dev <sw-dev@...>, Al Stone <ahs3@...>
Subject: [RISC-V] [tech-config] Profiles and Config and Device Tree

 

All,

 

I feel like we could be more coordinated in our efforts here.

 

From what I know Profiles will overlap in content with Config and with Device Tree (and friends -- using Device Tree to represent the class of configuration files that drive how we build SW from OS, Tool Chain, etc.).

 

Both Config and Device Tree must be consistent with the appropriate Profile. Profiles will have some information that Config won't need. Profiles will also have some information that Device Tree won't need. Both Config and Device Tree will contain information that does not have to be in a Profile.

 

All constituents must have representation in the Profiles TG. The Profiles TG will own driving Profiles to closure.

 

This may be obvious to some of you but it was not to me from the email chains. 

 

Does this make sense?

 

Thanks

Mark




Re: [RISC-V] [tech-config] Profiles and Config and Device Tree

mark
 

I suggest the charter include but not be limited to:

1. define the syntactic format for the content that is human readable. including things like macros etc.
2. define the globally unique record & field (or equivalent) definitions and semantics. this must include rules for org specific field and a clearinghouse for field and record names (or equivalent).
3. define the organization unique naming scheme for profiles themselves (e.g. RISC-V_identifier/URL/profile_name/version_number). also consider if a RISC-V_identifier (like "RISC-V_PROFILE") is part of the string and whether it has branding or trademark implications (with Kim the marketing director's help).
4. define upwards and backwards compatibility rules for different versions of a profile. for example you may end up saying that later profile versions can only be additive for a profile.
5. define the relationship between profiles and other descriptive items in the RISC-V ecosystem including but not limited to: config, ABI and other *BIs, OS config files like device tree, unpriv spec chapter 27 RV strings. provide guidance on how to generate profile relevant portions of those other descriptions from a profile (whether it is manual or generated).
6. define the name and structure of a profile file or files. for structure, as an example: is it one big file or can it be a bunch of small files that an umbrella file can include (even nested) for manageability and to allow multiple profiles to share common portions even across orgs? is it tool compatible with something that will turn it into one big file like cpp? can it be built? will it be built every night and be made available? if so, will all versions of the big file be made available in combined form somewhere?
7. define where profiles are stored and accessible: github? riscv.org website? multiple places? something different? are all orgs required to post their profiles or just RISC-V? org specific profiles may affect the profile file names so that the profile file URLs have some relationship to the URL (if it is included) from the profile name above.
8. define the copyright requirements.
9. define the rules for organizations other than RISC-V to create, potentially hide, publish, and maintain their own profiles, potentially including the definition of the types of profiles they can create. examples of types: they could have a derived profile, meaning it only adds things to a riscv.org profile, or they may create a custom profile which changes field values from some base riscv.org profile.
10. specify the relationship between profiles and app or os compatibility. for example if you adopt and comply with the riscv.org linux profile or create a derived profile from it (see sample definition of derived above) then you are guaranteed that all linux apps (and libs) will work on your product. one ring to bind them instead of having to say all the other identifiers listed above (see compliance item below).
11. what does profile compliance or compatibility mean? what is the relationship between, for example, RV or *BI compliance and profile compliance?
12. create the riscv.org base profiles: e.g. linux, rtos, bare metal. make sure they can manually or automatically generate the portions of the other descriptions (e.g. device tree) as listed above.

I am sure I missed many things and got some things wrong and I suggest we need Krste to weigh in. I derived the above from my discussions with him and my own thoughts, but it is his concept.

I also expect the TG will correct and add many things.

Is this what you were looking for?

Mark

On Thu, Jul 30, 2020 at 5:47 AM <arun@...> wrote:

[Adding Software and Platform]

 

I agree the Profiles TG will need to work closely with the Config and Platform TGs.

 

Mark, I think it might be helpful if you (and maybe Krste also) could create a short description of the high-level goals/use cases for Profiles for a future TSC/TC discussion. Profiles cut across many of the TGs, and I think many are still hazy on what’s involved. I expect we will have an easier time recruiting a Profiles TG Chair if the goals are clearly defined.

 

Best,

Arun

 

From: <tech-config@...> on behalf of mark <markhimelstein@...>
Reply-To: "tech-config@..." <tech-config@...>
Date: Wednesday, July 29, 2020 at 10:40 PM
To: config <tech-config@...>, RISC-V SW Dev <sw-dev@...>, Al Stone <ahs3@...>
Subject: [RISC-V] [tech-config] Profiles and Config and Device Tree

 

All,

 

I feel like we could be more coordinated in our efforts here.

 

From what I know Profiles will overlap in content with Config and with Device Tree (and friends -- using Device Tree to represent the class of configuration files that drive how we build SW from OS, Tool Chain, etc.).

 

Both Config and Device Tree must be consistent with the appropriate Profile. Profiles will have some information that Config won't need. Profiles will also have some information that Device Tree won't need. Both Config and Device Tree will contain information that does not have to be in a Profile.

 

All constituents must have representation in the Profiles TG. The Profiles TG will own driving Profiles to closure.

 

This may be obvious to some of you but it was not to me from the email chains. 

 

Does this make sense?

 

Thanks

Mark



--
Mark I Himelstein
CTO RISC-V International
+1-408-250-6611
twitter @mark_riscv


Re: [RISC-V] [software] [RISC-V] [tech-config] Profiles and Config and Device Tree

Arun Thomas
 

Thanks, Mark. This is very helpful.

 

Krste, it would be good to get your take on Profiles as well.

 

Arun

From: <software@...> on behalf of mark <markhimelstein@...>
Date: Thursday, July 30, 2020 at 10:23 AM
To: "tech-config@..." <tech-config@...>
Cc: Al Stone <ahs3@...>, "software@..." <software@...>, "tech-unixplatformspec@..." <tech-unixplatformspec@...>
Subject: Re: [RISC-V] [software] [RISC-V] [tech-config] Profiles and Config and Device Tree

 

I suggest the charter include but not be limited to:

 

1. define the syntactic format for the content that is human readable. including things like macros etc.

2. define the globally unique record & field (or equivalent) definitions and semantics. this must include rules for org specific field and a clearinghouse for field and record names (or equivalent).

3. define the organization unique naming scheme for profiles themselves (e.g. RISC-V_identifier/URL/profile_name/version_number). also consider if a RISC-V_identifier (like "RISC-V_PROFILE") is part of the string and whether it has branding or trademark implications (with Kim the marketing director's help).

4. define upwards and backwards compatibility rules for different versions of a profile. for example you may end up saying that later profile versions can only be additive for a profile.

5. define the relationship between profiles and other descriptive items in the RISC-V ecosystem including but not limited to: config, ABI and other *BIs, OS config files like device tree, unpriv spec chapter 27 RV strings. provide guidance on how to generate profile relevant portions of those other descriptions from a profile (whether it is manual or generated).

6. define the name and structure of a profile file or files. for structure, as an example: is it one big file or can it be a bunch of small files that an umbrella file can include (even nested) for manageability and to allow multiple profiles to share common portions even across orgs? is it tool compatible with something that will turn it into one big file like cpp? can it be built? will it be built every night and be made available? if so, will all versions of the big file be made available in combined form somewhere?

7. define where profiles are stored and accessible: github? riscv.org website? multiple places? something different? are all orgs required to post their profiles or just RISC-V? org specific profiles may affect the profile file names so that the profile file URLs have some relationship to the URL (if it is included) from the profile name above.

8. define the copyright requirements.

9. define the rules for organizations other than RISC-V to create, potentially hide, publish, and maintain their own profiles potentially including the definition of the types of profiles they can create. for examples of types: they could have a derived profile meaning it only adds things to a riscv.org profile or they may create a custom profile which changes field values from some base riscv.org profile

10. specify the relationship between profiles and app or os compatibility. for example if you adopt and comply with the riscv.org linux profile or create a derived profile from it (see sample definition of derived above) then you are guaranteed that all linux apps (and libs) will work on your product. one ring to bind them instead of having to say all the other identifiers listed above (see compliance item below).

11. what does profile compliance or compatibility mean? what is the relationship between, for example, RV or *BI compliance and profile compliance?

12. create the riscv.org base profiles: e.g. linux, rtos, bare metal. make sure they can manually or automatically generate the portions of the other descriptions (e.g. device tree) as listed above.

 

I am sure I missed many things and got some things wrong and I suggest we need Krste to weigh in. I derived the above from my discussions with him and my own thoughts, but it is his concept.

 

I also expect the TG will correct and add many things.

 

Is this what you were looking for?

 

Mark

 

On Thu, Jul 30, 2020 at 5:47 AM <arun@...> wrote:

[Adding Software and Platform]

 

I agree the Profiles TG will need to work closely with the Config and Platform TGs.

 

Mark, I think it might be helpful if you (and maybe Krste also) could create a short description of the high-level goals/use cases for Profiles for a future TSC/TC discussion. Profiles cut across many of the TGs, and I think many are still hazy on what’s involved. I expect we will have an easier time recruiting a Profiles TG Chair if the goals are clearly defined.

 

Best,

Arun

 

From: <tech-config@...> on behalf of mark <markhimelstein@...>
Reply-To: "tech-config@..." <tech-config@...>
Date: Wednesday, July 29, 2020 at 10:40 PM
To: config <tech-config@...>, RISC-V SW Dev <sw-dev@...>, Al Stone <ahs3@...>
Subject: [RISC-V] [tech-config] Profiles and Config and Device Tree

 

All,

 

I feel like we could be more coordinated in our efforts here.

 

From what I know Profiles will overlap in content with Config and with Device Tree (and friends -- using Device Tree to represent the class of configuration files that drive how we build SW from OS, Tool Chain, etc.).

 

Both Config and Device Tree must be consistent with the appropriate Profile. Profiles will have some information that Config won't need. Profiles will also have some information that Device Tree won't need. Both Config and Device Tree will contain information that does not have to be in a Profile.

 

All constituents must have representation in the Profiles TG. The Profiles TG will own driving Profiles to closure.

 

This may be obvious to some of you but it was not to me from the email chains. 

 

Does this make sense?

 

Thanks

Mark




--

Mark I Himelstein
CTO RISC-V International
+1-408-250-6611
twitter @mark_riscv




I'm Resigning

Palmer Dabbelt
 

I'm resigning from my posts at the RISC-V Foundation (vice chair of the
software standing committee and chair of the UNIX platform specification
working group). I know it might seem a bit sudden, but I've thought about this
for a while and this just isn't my sort of thing -- I ended up in these
positions as a result of working on the open source software side of things,
and I think I just fit in better over there. With any luck this should let me
spend more time on the software side of things, so long term I think it will be
better for everyone.

We don't have any pressing business in either of the groups, so I'm just going
to skip the lame duck period by making this effective immediately.


Re: I'm Resigning

mark
 

Palmer,

We are sad to see you go and you are always welcome back.

I know I speak for everyone in thanking you for all you have contributed to the RISC-V SW SC and Unix Platform TG. I know we have only worked together for a short period of time but I am grateful for your honesty and knowledgeable advice.

Warmest regards and good luck in what you are pursuing.

Thank you again,
Mark



On Tue, Aug 4, 2020 at 2:50 PM Palmer Dabbelt <palmer@...> wrote:
I'm resigning from my posts at the RISC-V Foundation (vice chair of the
software standing committee and chair of the UNIX platform specification
working group).  I know it might seem a bit sudden, but I've thought about this
for a while and this just isn't my sort of thing -- I ended up in these
positions as a result of working on the open source software side of things,
and I think I just fit in better over there.  With any luck this should let me
spend more time on the software side of things, so long term I think it will be
better for everyone.

We don't have any pressing business in either of the groups, so I'm just going
to skip the lame duck period by making this effective immediately.



--
Mark I Himelstein
CTO RISC-V International
+1-408-250-6611
twitter @mark_riscv


Re: [RISC-V] [TSC] [RISC-V] [tech-unixplatformspec] I'm Resigning

Stephano Cetola <scetola@...>
 

Indeed, it was great working with you, and you will be missed.

Thank you for all your hard work and know you’re always welcome in the RISC-V community!

All the best,
Stephano
--
Stephano Cetola
Program Manager, The Linux Foundation

On Aug 4, 2020, at 6:33 PM, mark <markhimelstein@...> wrote:

Palmer,

We are sad to see you go and you are always welcome back.

I know I speak for everyone in thanking you for all you have contributed to the RISC-V SW SC and Unix Platform TG. I know we have only worked together for a short period of time but I am grateful for your honesty and knowledgeable advice.

Warmest regards and good luck in what you are pursuing.

Thank you again,
Mark



On Tue, Aug 4, 2020 at 2:50 PM Palmer Dabbelt <palmer@...> wrote:
I'm resigning from my posts at the RISC-V Foundation (vice chair of the
software standing committee and chair of the UNIX platform specification
working group). I know it might seem a bit sudden, but I've thought about this
for a while and this just isn't my sort of thing -- I ended up in these
positions as a result of working on the open source software side of things,
and I think I just fit in better over there. With any luck this should let me
spend more time on the software side of things, so long term I think it will be
better for everyone.

We don't have any pressing business in either of the groups, so I'm just going
to skip the lame duck period by making this effective immediately.



--
Mark I Himelstein
CTO RISC-V International
+1-408-250-6611
twitter @mark_riscv


Proposal v4: SBI PMU Extension

Anup Patel
 

Hi All,

We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs, which S-mode and U-mode can only
read (through the CYCLE, INSTRET, and HPMCOUNTER CSRs). A RISC-V
implementation can support monitoring of various HARDWARE events using
a limited number of HARDWARE performance counters.

In addition to HARDWARE performance counters, an SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc.) can provide SOFTWARE counters for
events such as the number of RFENCEs, IPIs, misaligned load/store
instructions, illegal instructions, etc.

We propose an SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.

Using the SBI PMU extension, an SBI implementation (OpenSBI, KVM, or
Xvisor) will provide a standardized view of HARDWARE/SOFTWARE counters
and events to S-mode (or VS-mode) software.

Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.

The event_idx is a 20-bit wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
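
To make the encoding above concrete, here is a minimal C sketch of
packing and unpacking event_idx. The helper and macro names are
hypothetical (they are not part of the proposal); only the bit
positions come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout from the proposal: type in bits [19:16], code in [15:0]. */
#define SBI_PMU_EVENT_IDX_CODE_MASK  0xffffu
#define SBI_PMU_EVENT_IDX_TYPE_SHIFT 16
#define SBI_PMU_EVENT_IDX_TYPE_MASK  0xfu

/* Hypothetical helper: pack a type and a code into a 20-bit event_idx. */
static inline uint32_t sbi_pmu_make_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= SBI_PMU_EVENT_IDX_TYPE_MASK);
    assert(code <= SBI_PMU_EVENT_IDX_CODE_MASK);
    return (type << SBI_PMU_EVENT_IDX_TYPE_SHIFT) | code;
}

/* Hypothetical helper: extract event_idx.type (bits [19:16]). */
static inline uint32_t sbi_pmu_event_idx_type(uint32_t event_idx)
{
    return (event_idx >> SBI_PMU_EVENT_IDX_TYPE_SHIFT) &
           SBI_PMU_EVENT_IDX_TYPE_MASK;
}

/* Hypothetical helper: extract event_idx.code (bits [15:0]). */
static inline uint32_t sbi_pmu_event_idx_code(uint32_t event_idx)
{
    return event_idx & SBI_PMU_EVENT_IDX_CODE_MASK;
}
```

For example, type 0xf with code 2 packs to 0xf0002.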

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
    SBI_PMU_HW_CPU_CYCLES              = 0,
    SBI_PMU_HW_INSTRUCTIONS            = 1,
    SBI_PMU_HW_CACHE_REFERENCES        = 2,
    SBI_PMU_HW_CACHE_MISSES            = 3,
    SBI_PMU_HW_BRANCH_INSTRUCTIONS     = 4,
    SBI_PMU_HW_BRANCH_MISSES           = 5,
    SBI_PMU_HW_BUS_CYCLES              = 6,
    SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
    SBI_PMU_HW_STALLED_CYCLES_BACKEND  = 8,
    SBI_PMU_HW_REF_CPU_CYCLES          = 9,
    SBI_PMU_HW_MAX,                    /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    SBI_PMU_HW_CACHE_L1D  = 0,
    SBI_PMU_HW_CACHE_L1I  = 1,
    SBI_PMU_HW_CACHE_LL   = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU  = 5,
    SBI_PMU_HW_CACHE_NODE = 6,
    SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    SBI_PMU_HW_CACHE_OP_READ     = 0,
    SBI_PMU_HW_CACHE_OP_WRITE    = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
    SBI_PMU_HW_CACHE_OP_MAX,     /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS   = 1,
    SBI_PMU_HW_CACHE_RESULT_MAX,   /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
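
As a rough illustration of the HARDWARE CACHE encoding, the sketch
below builds event_idx.code from cache_id, op_id, and result_id. The
function names are hypothetical; the bit positions and the type value
0x1 are taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from the proposal: cache_id [15:3], op_id [2:1], result_id [0]. */
static inline uint32_t sbi_pmu_hw_cache_code(uint32_t cache_id,
                                             uint32_t op_id,
                                             uint32_t result_id)
{
    assert(cache_id <= 0x1fffu); /* 13 bits available */
    assert(op_id <= 0x3u);       /* 2 bits available */
    assert(result_id <= 0x1u);   /* 1 bit available */
    return (cache_id << 3) | (op_id << 1) | result_id;
}

/* Full event_idx for a HARDWARE CACHE event (event_idx.type == 0x1). */
static inline uint32_t sbi_pmu_hw_cache_event_idx(uint32_t cache_id,
                                                  uint32_t op_id,
                                                  uint32_t result_id)
{
    return (0x1u << 16) | sbi_pmu_hw_cache_code(cache_id, op_id, result_id);
}
```

For example, a DTLB (3) write (1) miss (1) gives code 0x1b and
event_idx 0x1001b.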

If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
    SBI_PMU_SW_MISALIGNED_LOAD        = 0,
    SBI_PMU_SW_MISALIGNED_STORE       = 1,
    SBI_PMU_SW_ILLEGAL_INSN           = 2,
    SBI_PMU_SW_LOCAL_SET_TIMER        = 3,
    SBI_PMU_SW_LOCAL_IPI              = 4,
    SBI_PMU_SW_LOCAL_FENCE_I          = 5,
    SBI_PMU_SW_LOCAL_SFENCE_VMA       = 6,
    SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID  = 7,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA      = 8,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA      = 10,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
    SBI_PMU_SW_MAX,                   /* non-ABI */
};

In future, more events can be defined without breaking SBI call
backward-compatibility.

Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:

1. SBI_PMU_NUM_COUNTERS
Return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
This call takes one parameter:
1) counter_idx
Provide the CSR_Number and CSR_Width of underlying counter.
The value returned by SBI call is encoded as follows:
return_value[11:0] = CSR_Number
return_value[19:12] = CSR_Width (Number of bits implemented in HW)
return_value[XLEN-1:20] = Reserved
If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
HARDWARE counter. This SBI call will fail for counters which are not
present.
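
A caller of SBI_PMU_COUNTER_GET_CSR would need to unpack the return
value. The following is only a sketch of that decoding, with a
matching encoder as an SBI implementation might use; the struct and
function names are hypothetical, and only the bit layout and the 0xfff
marker come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSR_Number value the proposal reserves for SOFTWARE counters. */
#define SBI_PMU_CSR_NUM_SW 0xfffu

/* Hypothetical decoded view of the SBI_PMU_COUNTER_GET_CSR return value. */
struct sbi_pmu_csr_info {
    uint16_t csr_number;  /* return_value[11:0] */
    uint8_t  csr_width;   /* return_value[19:12], bits implemented in HW */
    bool     is_software; /* true when csr_number == 0xfff */
};

/* Encoder side, as an SBI implementation might build the value. */
static inline unsigned long sbi_pmu_encode_csr(unsigned long csr_number,
                                               unsigned long csr_width)
{
    assert(csr_number <= 0xfffUL);
    assert(csr_width <= 0xffUL);
    return (csr_width << 12) | csr_number;
}

/* Decoder side, as S-mode software might unpack it. Reserved bits are ignored. */
static inline struct sbi_pmu_csr_info sbi_pmu_decode_csr(unsigned long ret)
{
    struct sbi_pmu_csr_info info;

    info.csr_number = (uint16_t)(ret & 0xfffUL);
    info.csr_width = (uint8_t)((ret >> 12) & 0xffUL);
    info.is_software = (info.csr_number == SBI_PMU_CSR_NUM_SW);
    return info;
}
```

For example, a 64-bit wide counter backed by HPMCOUNTER3 (CSR 0xC03)
would be reported as 0x40C03.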

3. SBI_PMU_COUNTER_CONFIG_MATCHING
This call takes four parameters:
1) counter_idx_base
2) counter_idx_mask
3) event_idx
4) event_info
Find and configure a counter from a set of counters which can monitor
specified event. The counter_idx_base and counter_idx_mask parameters
represent the set of counters whereas the event_idx and event_info
represent the event to monitor. Upon success the SBI call will return
the counter_idx of the counter which has been configured to monitor
specified event. This SBI call will fail if it is unable to find a
counter which can monitor specified event or the set of counters
specified via counter_idx_base and counter_idx_mask has an invalid
counter.
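
The text above does not spell out how counter_idx_base and
counter_idx_mask describe the set of counters. The sketch below
ASSUMES that bit i of counter_idx_mask selects counter
(counter_idx_base + i), and models "which counters can monitor the
event" as a plain bitmap named supported. Both the function name and
the supported bitmap are hypothetical, and the sketch does not model
the failure case for invalid counters in the set.

```c
#include <assert.h>

#define DEMO_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * 8))

/*
 * ASSUMPTION: bit i of counter_idx_mask selects counter_idx_base + i.
 * 'supported' is a hypothetical bitmap, indexed by absolute counter_idx,
 * of the counters able to monitor the requested event.
 * Returns the lowest matching counter_idx, or -1 if there is none.
 */
static inline long sbi_pmu_pick_counter(unsigned long counter_idx_base,
                                        unsigned long counter_idx_mask,
                                        unsigned long supported)
{
    /* Nothing at or above this index fits in the 'supported' bitmap. */
    if (counter_idx_base >= DEMO_BITS_PER_LONG)
        return -1;

    for (unsigned long bit = 0; bit < DEMO_BITS_PER_LONG; bit++) {
        unsigned long idx = counter_idx_base + bit;

        assert(idx >= counter_idx_base); /* guard above rules out wraparound */
        if (idx >= DEMO_BITS_PER_LONG)
            break;
        if (((counter_idx_mask >> bit) & 1UL) && ((supported >> idx) & 1UL))
            return (long)idx;
    }
    return -1;
}
```

Under this assumption, base 3 with mask 0x3 describes counters 3 and 4.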

4. SBI_PMU_COUNTER_SET_PHYS_ADDR
This call takes two parameters:
1) counter_idx
2) 8byte aligned physical address
It will set the physical address of memory location where the SBI
implementation will write the 64bit SOFTWARE counter. This SBI call
is only for counters not mapped to any CSR (i.e. only for counters
with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
This call takes two parameters:
1) counter_idx
2) initial_value
It will inform SBI implementation to start/enable specified counter
with specified initial value. This SBI call will fail for counters
which are not present.

6. SBI_PMU_COUNTER_STOP
This call takes one parameter:
1) counter_idx
It will inform SBI implementation to stop/disable specified counter
on the calling HART. This SBI call will fail for counters which are
not present.

The OpenSBI (M-mode runtime firmware) Development Notes:

1. The OpenSBI firmware will translate event_idx and event_info into
platform dependent MHPMEVENT CSR value before starting/enabling a
HARDWARE counter.

2. The OpenSBI firmware will need to know following platform dependent
information:
A) Possible event_idx values allowed (or supported) by a HARDWARE
counter (i.e. MHPMCOUNTER)
B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
value. This is optional and by default OpenSBI will write a value
<xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx
and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
C) Any additional platform-specific programming required for selecting
an event_idx + event_info combination. This is also optional for a
platform.

3. All platform dependent information mentioned above, can be obtained
by OpenSBI firmware from platform specific code. The DT/ACPI can
also be used to describe 2.A and 2.B mentioned above but 2.C will
always require platform specific code.
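
The default MHPMEVENT value described in 2.B can be sketched as
follows. This assumes XLEN == 64 (so uint64_t stands in for an
XLEN-wide register) and uses a hypothetical function name; a platform
that provides its own mapping would not use this default.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Default MHPMEVENT value from note 2.B, assuming XLEN == 64:
 *   bits [19:0]  = event_idx
 *   bits [63:20] = lower 44 bits of event_info
 */
static inline uint64_t sbi_pmu_default_mhpmevent(uint32_t event_idx,
                                                 uint64_t event_info)
{
    assert(event_idx <= 0xfffffu); /* event_idx is 20 bits wide */
    /* The shift discards the upper 20 bits of event_info. */
    return (event_info << 20) | (uint64_t)event_idx;
}
```

For example, event_idx 0x00003 with event_info 0x5 yields 0x500003.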

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
The Linux RISC-V driver can be platform driver with "riscv,pmu"
as DT compatible string and optional "interrupts" DT property. The
"interrupts" DT property if available should specify overflow
interrupt for each HART. When "interrupts" DT property is present,
we might also need another DT property for mapping HARTID to entries
in "interrupts" DT property. The platform driver probe will:
A) Need to ensure that underlying SBI implementation provides
SBI PMU extension using sbi_probe_extension() API of arch/riscv.
B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
call. If the counter is a SOFTWARE counter then use the
SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
of counter. The driver can skip this in driver probe and
instead do this lazily in add() callback mentioned below.

2. event_init() callback
The event_init() callback will primarily translate user-space
perf_event_attr to SBI PMU event_idx and event_info. It can do
this in following way:
A) perf_event_attr.type == PERF_TYPE_HARDWARE
event_idx.type = 0x0
event_idx.code = Value from enum sbi_pmu_hw_id based on
perf_event_attr.config
event_info = 0
B) perf_event_attr.type == PERF_TYPE_HW_CACHE
event_idx.type = 0x1
event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
based on perf_event_attr.config
event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
based on perf_event_attr.config
event_idx.code.result_id = Value from enum sbi_pmu_hw_cache_op_result_id
based on perf_event_attr.config
event_info = 0
C) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 0
event_idx.type = 0x2
event_idx.code = 0x0
event_info = perf_event_attr.config[62:0]
D) perf_event_attr.type == PERF_TYPE_RAW and
perf_event_attr.config[63:63] == 1
event_idx.type = 0xf
event_idx.code = Value from enum sbi_pmu_sw_id based on
perf_event_attr.config
event_info = 0
(Note: event_init() will fail if it is not able to figure out
event_idx and event_info value corresponding to perf_event_attr)
(Note: event_init() will not assign counter to perf_event because
it will be done by add() callback)
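
The translation in cases A) to D) could look roughly like the sketch
below. It is not driver code: the DEMO_PERF_TYPE_* constants are local
stand-ins mirroring the values in Linux's perf_event.h, the struct and
function names are hypothetical, and for case D) it ASSUMES that
config[62:0] directly holds the sbi_pmu_sw_id value, which the text
above does not specify. For case B) it relies on the perf convention
that the cache config is cache_id | (op_id << 8) | (result_id << 16).

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the perf_type_id values in perf_event.h. */
enum {
    DEMO_PERF_TYPE_HARDWARE = 0,
    DEMO_PERF_TYPE_HW_CACHE = 3,
    DEMO_PERF_TYPE_RAW      = 4,
};

/* Hypothetical result of translating a perf_event_attr. */
struct sbi_pmu_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns 0 on success, -1 when no SBI PMU event corresponds to the input. */
static inline int sbi_pmu_map_event(uint32_t type, uint64_t config,
                                    struct sbi_pmu_event *out)
{
    assert(out);

    switch (type) {
    case DEMO_PERF_TYPE_HARDWARE: /* case A */
        if (config > 9) /* beyond SBI_PMU_HW_REF_CPU_CYCLES */
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        out->event_info = 0;
        return 0;
    case DEMO_PERF_TYPE_HW_CACHE: { /* case B */
        uint32_t cache_id = (uint32_t)(config & 0xff);
        uint32_t op_id = (uint32_t)((config >> 8) & 0xff);
        uint32_t result_id = (uint32_t)((config >> 16) & 0xff);

        if (cache_id > 6 || op_id > 2 || result_id > 1 || (config >> 24))
            return -1;
        out->event_idx = (0x1u << 16) | (cache_id << 3) | (op_id << 1) |
                         result_id;
        out->event_info = 0;
        return 0;
    }
    case DEMO_PERF_TYPE_RAW:
        if (config >> 63) { /* case D: SOFTWARE event */
            uint64_t sw_id = config & ~(1ULL << 63); /* ASSUMED encoding */

            if (sw_id > 11) /* beyond SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID */
                return -1;
            out->event_idx = (0xfu << 16) | (uint32_t)sw_id;
            out->event_info = 0;
        } else { /* case C: HARDWARE RAW event */
            out->event_idx = 0x2u << 16;
            out->event_info = config; /* bit 63 is clear: config[62:0] */
        }
        return 0;
    default:
        return -1; /* event_init() fails for anything else */
    }
}
```

A real event_init() would read type and config from perf_event_attr
and return an errno value instead of -1.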

3. add() callback
The add() callback of Linux RISC-V PMU driver will find a free
counter on current CPU/HART such that the event_idx and event_info
combination is supported by the counter. To find-and-configure
a counter to monitor event_idx and event_info combination from
a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
call.

4. del() callback
The del() callback of Linux RISC-V PMU driver will release or
free the counter.

5. start() callback
The start() callback of Linux RISC-V PMU driver will start the
counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
The stop() callback of Linux RISC-V PMU driver will stop the
counter using the SBI_PMU_COUNTER_STOP call.

Regards,
Anup


Re: Proposal v4: SBI PMU Extension

Greg Favor
 

Anup,

What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?

Greg

On Thu, Aug 6, 2020 at 6:52 AM Anup Patel <Anup.Patel@...> wrote:
Hi All,

We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various
HARDWARE events using limited number of HARDWARE performance counters.

In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.

We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.

Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.

Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.

The event_idx is a 20bits wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
    SBI_PMU_HW_CPU_CYCLES              = 0,
    SBI_PMU_HW_INSTRUCTIONS            = 1,
    SBI_PMU_HW_CACHE_REFERENCES        = 2,
    SBI_PMU_HW_CACHE_MISSES            = 3,
    SBI_PMU_HW_BRANCH_INSTRUCTIONS     = 4,
    SBI_PMU_HW_BRANCH_MISSES           = 5,
    SBI_PMU_HW_BUS_CYCLES              = 6,
    SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
    SBI_PMU_HW_STALLED_CYCLES_BACKEND  = 8,
    SBI_PMU_HW_REF_CPU_CYCLES          = 9,
    SBI_PMU_HW_MAX,                    /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    SBI_PMU_HW_CACHE_L1D  = 0,
    SBI_PMU_HW_CACHE_L1I  = 1,
    SBI_PMU_HW_CACHE_LL   = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU  = 5,
    SBI_PMU_HW_CACHE_NODE = 6,
    SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    SBI_PMU_HW_CACHE_OP_READ     = 0,
    SBI_PMU_HW_CACHE_OP_WRITE    = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
    SBI_PMU_HW_CACHE_OP_MAX,     /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS   = 1,
    SBI_PMU_HW_CACHE_RESULT_MAX,   /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
    SBI_PMU_SW_MISALIGNED_LOAD        = 0,
    SBI_PMU_SW_MISALIGNED_STORE       = 1,
    SBI_PMU_SW_ILLEGAL_INSN           = 2,
    SBI_PMU_SW_LOCAL_SET_TIMER        = 3,
    SBI_PMU_SW_LOCAL_IPI              = 4,
    SBI_PMU_SW_LOCAL_FENCE_I          = 5,
    SBI_PMU_SW_LOCAL_SFENCE_VMA       = 6,
    SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID  = 7,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA      = 8,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA      = 10,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
    SBI_PMU_SW_MAX,                   /* non-ABI */
};

In future, more events can be defined without breaking SBI call
backward-compatibility.

Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   Return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
      1) counter_idx
   Provide the CSR_Number and CSR_Width of underlying counter.
   The value returned by SBI call is encoded as follows:
      return_value[11:0] = CSR_Number
      return_value[19:12] = CSR_Width (Number of bits implemented in HW)
      return_value[XLEN-1:20] = Reserved
   If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
   HARDWARE counter. This SBI call will fail for counters which are not
   present.

3. SBI_PMU_COUNTER_CONFIG_MATCHING
   This call takes four parameters:
      1) counter_idx_base
      2) counter_idx_mask
      3) event_idx
      4) event_info
   Find and configure a counter from a set of counters which can monitor
   specified event. The counter_idx_base and counter_idx_mask parameters
   represent the set of counters whereas the event_idx and event_info
   represent the event to monitor. Upon success the SBI call will return
   the counter_idx of the counter which has been configured to monitor
   specified event.  This SBI call will fail if it is unable to find a
   counter which can monitor specified event or the set of counters
   specified via counter_idx_base and counter_idx_mask has an invalid
   counter.

4. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters:
      1) counter_idx
      2) 8byte aligned physical address
   It will set the physical address of memory location where the SBI
   implementation will write the 64bit SOFTWARE counter. This SBI call
   is only for counters not mapped to any CSR (i.e. only for counters
   with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
   This call takes two parameters:
      1) counter_idx
      2) initial_value
   It will inform SBI implementation to start/enable specified counter
   with specified initial value. This SBI call will fail for counters
   which are not present.

6. SBI_PMU_COUNTER_STOP
   This call takes one parameter:
      1) counter_idx
   It will inform SBI implementation to stop/disable specified counters
   on the calling HART. This SBI call will fail for counters which are
   not present.

The OpenSBI (M-mode runtime firmware) Development Notes:

1. The OpenSBI firmware will translate event_idx and event_info into
   platform dependent MHPMEVENT CSR value before starting/enabling a
   HARDWARE counter.

2. The OpenSBI firmware will need to know following platform dependent
   information:
   A) Possible event_idx values allowed (or supported) by a HARDWARE
      counter (i.e. MHPMCOUNTER)
   B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
      value. This is optional and by default OpenSBI will write a value
      <xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx
      and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
   C) Additional platform-specific programming required for selecting
      an event_idx + event_info combination. This is also optional for
      a platform.

3. All platform dependent information mentioned above, can be obtained
   by OpenSBI firmware from platform specific code. The DT/ACPI can
   also be used to describe 2.A and 2.B mentioned above but 2.C will
   always require platform specific code.

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
   The Linux RISC-V driver can be platform driver with "riscv,pmu"
   as DT compatible string and optional "interrupts" DT property. The
   "interrupts" DT property if available should specify overflow
   interrupt for each HART. When "interrupts" DT property is present,
   we might also need another DT property for mapping HARTID to entries
   in "interrupts" DT property. The platform driver probe will:
   A) Need to ensure that underlying SBI implementation provides
      SBI PMU extension using sbi_probe_extension() API of arch/riscv.
   B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
   C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
      call. If the counter is a SOFTWARE counter then use the
      SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
      of counter. The driver can skip this in driver probe and
      instead do this lazily in add() callback mentioned below.

2. event_init() callback
   The event_init() callback will primarily translate user-space
   perf_event_attr to SBI PMU event_idx and event_info. It can do
   this in following way:
   A) perf_event_attr.type == PERF_TYPE_HARDWARE
      event_idx.type = 0x0
      event_idx.code = Value from enum sbi_pmu_hw_id based on
                           perf_event_attr.config
      event_info = 0
   B) perf_event_attr.type == PERF_TYPE_HW_CACHE
      event_idx.type = 0x1
      event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
                                    based on perf_event_attr.config
      event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
                                 based on perf_event_attr.config
      event_idx.code.result_id = Value from enum
                                     sbi_pmu_hw_cache_op_result_id
                                     based on perf_event_attr.config
      event_info = 0
   C) perf_event_attr.type == PERF_TYPE_RAW and
      perf_event_attr.config[63:63] == 0
      event_idx.type = 0x2
      event_idx.code = 0x0
      event_info = perf_event_attr.config[62:0]
   D) perf_event_attr.type == PERF_TYPE_RAW and
      perf_event_attr.config[63:63] == 1
      event_idx.type = 0xf
      event_idx.code = Value from enum sbi_pmu_sw_id based on
                           perf_event_attr.config
      event_info = 0
   (Note: event_init() will fail if it is not able to figure out
    event_idx and event_info value corresponding to perf_event_attr)
   (Note: event_init() will not assign counter to perf_event because
    it will be done by add() callback)

3. add() callback
   The add() callback of Linux RISC-V PMU driver will find a free
   counter on current CPU/HART such that the event_idx and event_info
   combination is supported by the counter. To find-and-configure
   a counter to monitor event_idx and event_info combination from
   a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
   call.

4. del() callback
   The del() callback of Linux RISC-V PMU driver will release or
   free the counter.

5. start() callback
   The start() callback of Linux RISC-V PMU driver will start the
   counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
   The stop() callback of Linux RISC-V PMU driver will stop the
   counter using the SBI_PMU_COUNTER_STOP call.

Regards,
Anup


Re: Proposal v4: SBI PMU Extension

Jonathan Behrens <behrensj@...>
 

I like this proposal! A couple comments...

In a couple places you say "the event_info is optional and can be zero". Does this mean that SBI providers must ignore the field, or that non-zero values are reserved (meaning software must set it to zero), or that other values have SBI implementation specific semantics? Or something else?

The OpenSBI firmware will need to know following platform dependent information: [...]

    B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR value. This is optional and by default OpenSBI will write a value <xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info

This seems to contradict the previous point. By saying how OpenSBI is going to use the event_info field, you are effectively requiring that the OS properly set it. I'd rather the SBI provider just return an error if it can't figure out the proper mappings to MHPMEVENT CSR values. That way there is no risk that the S-mode software accidentally ends up tracking the wrong performance counter because it didn't know what to set event_info to. Put another way, the SBI provider is supposed to be the one that deals with platform specific issues, so the operating system doesn't have to.

It would also be nice if possible to pin down how S-mode software can learn the meanings of raw counters. Is it sufficient to look at the tuple of (mvendorid, marchid, mimpid)? Even just some commentary text with guidance could be helpful here.

Finally, I wanted to ask about the SBI_PMU_COUNTER_SET_PHYS_ADDR function. Apologies if this has been answered already, but I think this might not work well with the enhanced PMP proposal that is designed to allow most of DRAM to be marked as S/U-mode only. The proposal allows regions to be shared between M-mode and S/U-mode but presumably an implementation would prefer to require only a single shared region with all counters instead of needing to use NUM_COUNTERS number of PMP entries. This could be enabled by making the interface be SBI_PMU_COUNTER_GET_PHYS_ADDR so the firmware gets to pick the locations. On this front, another thing to watch is the memory attributes proposals coming out of the virtual memory task group: shared mappings might have performance costs (to avoid issues with mismatches between M-mode and S-mode attributes).

Jonathan

On Thu, Aug 6, 2020 at 9:52 AM Anup Patel via lists.riscv.org <anup.patel=wdc.com@...> wrote:
Hi All,

We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various
HARDWARE events using limited number of HARDWARE performance counters.

In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.

We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.

Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.

Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.

The event_idx is a 20bits wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
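
As an illustration, the 20-bit event_idx layout above could be packed and unpacked as follows. This is only a minimal sketch: the helper and macro names are hypothetical and not part of the proposal; only the bit positions are taken from the text.

```c
#include <stdint.h>

/* Hypothetical names; only the bit layout comes from the proposal. */
#define PMU_EVENT_IDX_CODE_BITS 16u
#define PMU_EVENT_IDX_CODE_MASK 0xffffu
#define PMU_EVENT_IDX_TYPE_MASK 0xfu

/* Pack event_idx[19:16] = type and event_idx[15:0] = code. */
static inline uint32_t pmu_event_idx(uint32_t type, uint32_t code)
{
    return ((type & PMU_EVENT_IDX_TYPE_MASK) << PMU_EVENT_IDX_CODE_BITS) |
           (code & PMU_EVENT_IDX_CODE_MASK);
}

/* Extract event_idx[19:16]. */
static inline uint32_t pmu_event_idx_type(uint32_t event_idx)
{
    return (event_idx >> PMU_EVENT_IDX_CODE_BITS) & PMU_EVENT_IDX_TYPE_MASK;
}

/* Extract event_idx[15:0]. */
static inline uint32_t pmu_event_idx_code(uint32_t event_idx)
{
    return event_idx & PMU_EVENT_IDX_CODE_MASK;
}
```

For example, a SOFTWARE event (type 0xf) with code 4 would be encoded as 0xf0004 under this layout.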

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
    SBI_PMU_HW_CPU_CYCLES              = 0,
    SBI_PMU_HW_INSTRUCTIONS            = 1,
    SBI_PMU_HW_CACHE_REFERENCES        = 2,
    SBI_PMU_HW_CACHE_MISSES            = 3,
    SBI_PMU_HW_BRANCH_INSTRUCTIONS     = 4,
    SBI_PMU_HW_BRANCH_MISSES           = 5,
    SBI_PMU_HW_BUS_CYCLES              = 6,
    SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
    SBI_PMU_HW_STALLED_CYCLES_BACKEND  = 8,
    SBI_PMU_HW_REF_CPU_CYCLES          = 9,
    SBI_PMU_HW_MAX,                    /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    SBI_PMU_HW_CACHE_L1D  = 0,
    SBI_PMU_HW_CACHE_L1I  = 1,
    SBI_PMU_HW_CACHE_LL   = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU  = 5,
    SBI_PMU_HW_CACHE_NODE = 6,
    SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    SBI_PMU_HW_CACHE_OP_READ     = 0,
    SBI_PMU_HW_CACHE_OP_WRITE    = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
    SBI_PMU_HW_CACHE_OP_MAX,     /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS   = 1,
    SBI_PMU_HW_CACHE_RESULT_MAX,   /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
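
To make the HARDWARE CACHE code layout concrete, a packing helper might look like the sketch below. The function name is hypothetical; the field positions (cache_id in bits 15:3, op_id in bits 2:1, result_id in bit 0) are the ones listed above, and the numeric IDs in the example are taken from the enums above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build event_idx.code for a HARDWARE CACHE event.
 *   code[15:3] = cache_id, code[2:1] = op_id, code[0:0] = result_id
 */
static inline uint32_t pmu_hw_cache_code(uint32_t cache_id, uint32_t op_id,
                                         uint32_t result_id)
{
    return ((cache_id & 0x1fffu) << 3) |
           ((op_id & 0x3u) << 1) |
           (result_id & 0x1u);
}

/* Full event_idx for a HARDWARE CACHE event (type 0x1 in bits 19:16). */
static inline uint32_t pmu_hw_cache_event_idx(uint32_t cache_id, uint32_t op_id,
                                              uint32_t result_id)
{
    return (0x1u << 16) | pmu_hw_cache_code(cache_id, op_id, result_id);
}
```

With these definitions, a DTLB (3) write (1) miss (1) event would have code 0x1b and event_idx 0x1001b.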

If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
    SBI_PMU_SW_MISALIGNED_LOAD        = 0,
    SBI_PMU_SW_MISALIGNED_STORE       = 1,
    SBI_PMU_SW_ILLEGAL_INSN           = 2,
    SBI_PMU_SW_LOCAL_SET_TIMER        = 3,
    SBI_PMU_SW_LOCAL_IPI              = 4,
    SBI_PMU_SW_LOCAL_FENCE_I          = 5,
    SBI_PMU_SW_LOCAL_SFENCE_VMA       = 6,
    SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID  = 7,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA      = 8,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA      = 10,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
    SBI_PMU_SW_MAX,                   /* non-ABI */
};

In future, more events can be defined without breaking SBI call
backward-compatibility.

Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   Return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
      1) counter_idx
   Provide the CSR_Number and CSR_Width of underlying counter.
   The value returned by SBI call is encoded as follows:
      return_value[11:0] = CSR_Number
      return_value[19:12] = CSR_Width (Number of bits implemented in HW)
      return_value[XLEN-1:20] = Reserved
   If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
   HARDWARE counter. This SBI call will fail for counters which are not
   present.

3. SBI_PMU_COUNTER_CONFIG_MATCHING
   This call takes four parameters:
      1) counter_idx_base
      2) counter_idx_mask
      3) event_idx
      4) event_info
   Find and configure a counter from a set of counters which can monitor
   specified event. The counter_idx_base and counter_idx_mask parameters
   represent the set of counters whereas the event_idx and event_info
   represent the event to monitor. Upon success the SBI call will return
   the counter_idx of the counter which has been configured to monitor
   specified event.  This SBI call will fail if it is unable to find a
   counter which can monitor specified event or the set of counters
   specified via counter_idx_base and counter_idx_mask has an invalid
   counter.

4. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters:
      1) counter_idx
      2) 8byte aligned physical address
   It will set the physical address of memory location where the SBI
   implementation will write the 64bit SOFTWARE counter. This SBI call
   is only for counters not mapped to any CSR (i.e. only for counters
   with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
   This call takes two parameters:
      1) counter_idx
      2) initial_value
   It will inform SBI implementation to start/enable specified counter
   with specified initial value. This SBI call will fail for counters
   which are not present.

6. SBI_PMU_COUNTER_STOP
   This call takes one parameter:
      1) counter_idx
   It will inform SBI implementation to stop/disable specified counters
   on the calling HART. This SBI call will fail for counters which are
   not present.
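
As a sketch of how S-mode software might interpret the value returned by SBI_PMU_COUNTER_GET_CSR (call 2 above), the following decodes CSR_Number and CSR_Width and applies the 0xfff SOFTWARE-counter convention. The struct and function names are hypothetical; how the ecall itself is issued and how errors are reported is not covered here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded return value. */
struct pmu_counter_csr {
    uint16_t csr_number; /* return_value[11:0]  */
    uint8_t  csr_width;  /* return_value[19:12], bits implemented in HW */
};

/* Decode the value returned by a successful SBI_PMU_COUNTER_GET_CSR call. */
static inline struct pmu_counter_csr pmu_decode_counter_csr(unsigned long ret)
{
    struct pmu_counter_csr info;

    info.csr_number = (uint16_t)(ret & 0xfffu);
    info.csr_width  = (uint8_t)((ret >> 12) & 0xffu);
    return info;
}

/* CSR_Number == 0xfff marks a SOFTWARE counter. */
static inline bool pmu_counter_is_software(struct pmu_counter_csr info)
{
    return info.csr_number == 0xfffu;
}
```

For instance, a return value of 0x40c00 would decode to CSR number 0xc00 (the CYCLE CSR) with a width of 64 bits.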

The OpenSBI (M-mode runtime firmware) Development Notes:

1. The OpenSBI firmware will translate event_idx and event_info into
   platform dependent MHPMEVENT CSR value before starting/enabling a
   HARDWARE counter.

2. The OpenSBI firmware will need to know following platform dependent
   information:
   A) Possible event_idx values allowed (or supported) by a HARDWARE
      counter (i.e. MHPMCOUNTER)
   B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
      value. This is optional and by default OpenSBI will write a value
      <xyz> to MHPMEVENT CSR where lower 20bits of <xyz> is event_idx
      and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
   C) Additional platform-specific programming required for selecting
      an event_idx + event_info combination. This is also optional for
      a platform.

3. All platform dependent information mentioned above, can be obtained
   by OpenSBI firmware from platform specific code. The DT/ACPI can
   also be used to describe 2.A and 2.B mentioned above but 2.C will
   always require platform specific code.
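
The default MHPMEVENT value described in 2.B can be sketched as below. This assumes XLEN = 64 (so event_info contributes its lower 44 bits); the function name is hypothetical and this is not taken from any OpenSBI source.

```c
#include <stdint.h>

/*
 * Hypothetical default mapping, assuming XLEN = 64:
 *   <xyz>[19:0]  = event_idx
 *   <xyz>[63:20] = event_info[43:0]
 * Bits of event_info above bit 43 are shifted out and dropped.
 */
static inline uint64_t pmu_default_mhpmevent(uint32_t event_idx,
                                             uint64_t event_info)
{
    return (event_info << 20) | ((uint64_t)event_idx & 0xfffffu);
}
```

A platform that provides its own mapping (2.B) or needs extra programming (2.C) would replace this default rather than build on it.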

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
   The Linux RISC-V driver can be platform driver with "riscv,pmu"
   as DT compatible string and optional "interrupts" DT property. The
   "interrupts" DT property if available should specify overflow
   interrupt for each HART. When "interrupts" DT property is present,
   we might also need another DT property for mapping HARTID to entries
   in "interrupts" DT property. The platform driver probe will:
   A) Need to ensure that underlying SBI implementation provides
      SBI PMU extension using sbi_probe_extension() API of arch/riscv.
   B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
   C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
      call. If the counter is a SOFTWARE counter then use the
      SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
      of counter. The driver can skip this in driver probe and
      instead do this lazily in add() callback mentioned below.

2. event_init() callback
   The event_init() callback will primarily translate user-space
   perf_event_attr to SBI PMU event_idx and event_info. It can do
   this in following way:
   A) perf_event_attr.type == PERF_TYPE_HARDWARE
      event_idx.type = 0x0
      event_idx.code = Value from enum sbi_pmu_hw_id based on
                           perf_event_attr.config
      event_info = 0
   B) perf_event_attr.type == PERF_TYPE_HW_CACHE
      event_idx.type = 0x1
      event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
                                    based on perf_event_attr.config
      event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
                                 based on perf_event_attr.config
      event_idx.code.result_id = Value from enum
                                     sbi_pmu_hw_cache_op_result_id
                                     based on perf_event_attr.config
      event_info = 0
   C) perf_event_attr.type == PERF_TYPE_RAW and
      perf_event_attr.config[63:63] == 0
      event_idx.type = 0x2
      event_idx.code = 0x0
      event_info = perf_event_attr.config[62:0]
   D) perf_event_attr.type == PERF_TYPE_RAW and
      perf_event_attr.config[63:63] == 1
      event_idx.type = 0xf
      event_idx.code = Value from enum sbi_pmu_sw_id based on
                           perf_event_attr.config
      event_info = 0
   (Note: event_init() will fail if it is not able to figure out
    event_idx and event_info value corresponding to perf_event_attr)
   (Note: event_init() will not assign counter to perf_event because
    it will be done by add() callback)

3. add() callback
   The add() callback of Linux RISC-V PMU driver will find a free
   counter on current CPU/HART such that the event_idx and event_info
   combination is supported by the counter. To find-and-configure
   a counter to monitor event_idx and event_info combination from
   a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
   call.

4. del() callback
   The del() callback of Linux RISC-V PMU driver will release or
   free the counter.

5. start() callback
   The start() callback of Linux RISC-V PMU driver will start the
   counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
   The stop() callback of Linux RISC-V PMU driver will stop the
   counter using the SBI_PMU_COUNTER_STOP call.
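
The event_init() translation in item 2 above could be sketched roughly as follows. This is a standalone illustration, not driver code: the DEMO_* constants mirror the Linux perf type numbers, the HW_CACHE config split (cache_id | op_id << 8 | result_id << 16) follows the perf_event.h convention, and treating the low bits of a RAW config with bit 63 set as the sbi_pmu_sw_id value is an assumption, since the proposal does not spell that mapping out.

```c
#include <stdint.h>

/* Stand-ins for PERF_TYPE_* so the sketch builds outside the kernel. */
enum {
    DEMO_PERF_TYPE_HARDWARE = 0,
    DEMO_PERF_TYPE_HW_CACHE = 3,
    DEMO_PERF_TYPE_RAW      = 4,
};

/* Hypothetical result of the translation. */
struct demo_sbi_event {
    uint32_t event_idx;
    uint64_t event_info;
};

/* Returns 0 on success, -1 if no event_idx/event_info can be derived. */
static int demo_event_init(uint32_t type, uint64_t config,
                           struct demo_sbi_event *out)
{
    switch (type) {
    case DEMO_PERF_TYPE_HARDWARE:
        if (config > 9) /* SBI_PMU_HW_REF_CPU_CYCLES is the last entry */
            return -1;
        out->event_idx = (0x0u << 16) | (uint32_t)config;
        out->event_info = 0;
        return 0;
    case DEMO_PERF_TYPE_HW_CACHE: {
        uint32_t cache_id  = (uint32_t)(config & 0xff);
        uint32_t op_id     = (uint32_t)((config >> 8) & 0xff);
        uint32_t result_id = (uint32_t)((config >> 16) & 0xff);

        if (cache_id > 6 || op_id > 2 || result_id > 1)
            return -1;
        out->event_idx = (0x1u << 16) | (cache_id << 3) |
                         (op_id << 1) | result_id;
        out->event_info = 0;
        return 0;
    }
    case DEMO_PERF_TYPE_RAW: {
        uint64_t low = config & ~((uint64_t)1 << 63);

        if ((config >> 63) == 0) {
            /* HARDWARE RAW: code is zero, raw value goes in event_info. */
            out->event_idx = 0x2u << 16;
            out->event_info = low;
            return 0;
        }
        /* SOFTWARE: assumed to carry the sbi_pmu_sw_id in the low bits. */
        if (low > 11) /* SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID is the last */
            return -1;
        out->event_idx = (0xfu << 16) | (uint32_t)low;
        out->event_info = 0;
        return 0;
    }
    default:
        return -1;
    }
}
```

Counter assignment is deliberately absent here, matching the note that add() is responsible for it via SBI_PMU_COUNTER_CONFIG_MATCHING.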

Regards,
Anup




Re: Proposal v4: SBI PMU Extension

Greg Favor
 

On Thu, Aug 6, 2020 at 11:36 AM Jonathan Behrens <behrensj@...> wrote:
It would also be nice if possible to pin down how S-mode software can learn the meanings of raw counters. Is it sufficient to look at the tuple of (mvendorid, marchid, mimpid)? Even just some commentary text with guidance could be helpful here.

But by definition RAW events are all implementation-specific (unless or until an arch extension standardizes a set of RAW events and their encodings).  At best, the software discovery method that the tech-config TG has started developing, could maybe be used to provide this implementation-specific information.  (Or unstandardized code could do its own discovery based on looking at things like mvendorid/marchid/mimpid.)
 
Finally, I wanted to ask about the SBI_PMU_COUNTER_SET_PHYS_ADDR function. Apologies if this has been answered already, but I think this might not work well with the enhanced PMP proposal that is designed to allow most of DRAM to be marked as S/U-mode only.

Interesting point.  Requiring M mode to have access to most S/U mode memory would defeat a lot of the purpose and security benefits of Enhanced PMP (aka PMPv2).
 
The proposal allows regions to be shared between M-mode and S/U-mode but presumably an implementation would prefer to require only a single shared region with all counters instead of needing to use NUM_COUNTERS number of PMP entries. This could be enabled by making the interface be SBI_PMU_COUNTER_GET_PHYS_ADDR so the firmware gets to pick the locations.

A key question is, in any case, who allocates the memory where software counters are placed?  How is that memory allocated in coordination with the OS or hypervisor?

Shouldn't the OS or hypervisor do the allocation and then tell M-mode the address of that block of memory?  Then M-mode can allocate space for individual counters from that.  And, in the context of PMPv2, that block of memory would be allocated from an existing "shared" PMP region.
 
On this front, another thing to watch is the memory attributes proposals coming out of the virtual memory task group: shared mappings might have performance costs (to avoid issues with mismatches between M-mode and S-mode attributes).

All the more reason that the OS/hypervisor should be allocating the block of memory for software counters.  The  OS/hypervisor will be aware of the memory attributes set up in the page tables and can make sure to use appropriate attribute settings in the PTEs that map this memory.

Greg
 


Re: Proposal v4: SBI PMU Extension

Anup Patel
 

Hi Greg,

 

The SBI_PMU_NUM_COUNTERS call will return total number of counters (HARDWARE as well as SOFTWARE).

 

We can distinguish between HARDWARE and SOFTWARE counters using the CSR_Number returned by SBI_PMU_COUNTER_GET_CSR call.

(Note: CSR_Number = 0xfff means it is SOFTWARE counter)
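
A rough sketch of that classification loop is shown below. It assumes valid counter_idx values run from 0 to N-1 and that every SBI_PMU_COUNTER_GET_CSR call succeeds, neither of which the proposal text states explicitly. The callback type and the demo_get_csr() function are hypothetical stand-ins for the real ecall.

```c
#include <stddef.h>

/* Hypothetical stand-in for issuing SBI_PMU_COUNTER_GET_CSR. */
typedef unsigned long (*pmu_get_csr_fn)(unsigned long counter_idx);

struct pmu_counter_split {
    size_t hardware;
    size_t software;
};

/* Classify counters 0..num_counters-1 by their CSR_Number field. */
static struct pmu_counter_split
pmu_count_counters(unsigned long num_counters, pmu_get_csr_fn get_csr)
{
    struct pmu_counter_split split = { 0, 0 };

    for (unsigned long idx = 0; idx < num_counters; idx++) {
        unsigned long csr_number = get_csr(idx) & 0xffful;

        if (csr_number == 0xffful)
            split.software++;
        else
            split.hardware++;
    }
    return split;
}

/* Fake firmware for illustration: CYCLE, INSTRET, then SOFTWARE counters. */
static unsigned long demo_get_csr(unsigned long counter_idx)
{
    if (counter_idx == 0)
        return (64ul << 12) | 0xc00ul; /* CYCLE CSR, 64 bits wide */
    if (counter_idx == 1)
        return (64ul << 12) | 0xc02ul; /* INSTRET CSR, 64 bits wide */
    return (64ul << 12) | 0xffful;     /* SOFTWARE counter */
}
```

Calling pmu_count_counters(3, demo_get_csr) with this fake firmware yields two HARDWARE counters and one SOFTWARE counter.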

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 07 August 2020 00:01
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; Andrew Waterman <andrew@...>
Subject: Re: Proposal v4: SBI PMU Extension

 

Anup,

 

What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?

 

Greg

 


Re: Proposal v4: SBI PMU Extension

Greg Favor
 

Anup,

Wouldn't software want to more easily and directly know from SBI_PMU_NUM_COUNTERS how many hardware counters and software counters there are, instead of having to then call SBI_PMU_COUNTER_GET_CSR N times to figure that out?

Or are you expecting that software is going to have to call SBI_PMU_COUNTER_GET_CSR N times in any case (after first calling SBI_PMU_NUM_COUNTERS to get 'N')?

Greg

P.S. I'm guessing that the valid/supported counters are the first N values of counter_idx, from 0 to N-1.  Yes?

On Thu, Aug 6, 2020 at 10:33 PM Anup Patel <Anup.Patel@...> wrote:

Hi Greg,

 

The SBI_PMU_NUM_COUNTERS call will return total number of counters (HARDWARE as well as SOFTWARE).

 

We can distinguish between HARDWARE and SOFTWARE counters using the CSR_Number returned by SBI_PMU_COUNTER_GET_CSR call.

(Note: CSR_Number = 0xfff means it is SOFTWARE counter)

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 07 August 2020 00:01
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; Andrew Waterman <andrew@...>
Subject: Re: Proposal v4: SBI PMU Extension

 

Anup,

 

What does SBI_PMU_NUM_COUNTERS return insofar as distinguishing hardware versus software counters?

 

Greg

 

On Thu, Aug 6, 2020 at 6:52 AM Anup Patel <Anup.Patel@...> wrote:

Hi All,

We don't have a dedicated RISC-V PMU extension for all privilege modes
but we do have M-mode HARDWARE performance counters such as MCYCLE CSR,
MINSTRET CSR, and MHPMCOUNTER CSRs which are read-only for S-mode and
U-mode. A RISC-V implementation can support monitoring of various
HARDWARE events using limited number of HARDWARE performance counters.

In addition to HARDWARE performance counters, a SBI implementation
(e.g. OpenSBI, Xvisor, KVM, etc) can provide SOFTWARE counters for
events such as number of RFENCEs, number of IPIs, number of misaligned
load/store instructions, number of illegal instructions, etc.

We propose SBI PMU extension, which will help S-mode (or VS-mode)
software to discover and configure HARDWARE/SOFTWARE counters. The SBI
PMU extension will only manage per-HART (or per-CPU) HARDWARE/SOFTWARE
counters.

Using SBI PMU extension, a SBI implementation (OpenSBI, KVM, or Xvisor)
will provide a standardized view of HARDWARE/SOFTWARE counters and
events to S-mode (or VS-mode) software.

Before defining SBI PMU extension calls, we first define counter_idx,
event_idx, and event_info entities. The counter_idx is a logical number
assigned to each HARDWARE/SOFTWARE counter. The event_idx represents a
HARDWARE/SOFTWARE event whereas event_info represents additional
configuration/parameters for the event.

The event_idx is a 20-bit wide number encoded as follows:
event_idx[19:16] = type
event_idx[15:0] = code
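
As a rough illustration of the encoding above, the type and code fields
could be packed and unpacked as sketched below. The macro and function
names are hypothetical (they are not part of the proposal); only the bit
positions come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout from the proposal: type in bits [19:16], code in [15:0].
 * All names below are hypothetical and used for illustration only. */
#define EX_EVENT_IDX_CODE_MASK  0xffffU
#define EX_EVENT_IDX_TYPE_SHIFT 16
#define EX_EVENT_IDX_TYPE_MASK  0xfU

/* Pack a 4-bit type and a 16-bit code into a 20-bit event_idx. */
static inline uint32_t ex_event_idx(uint32_t type, uint32_t code)
{
    assert(type <= EX_EVENT_IDX_TYPE_MASK);
    assert(code <= EX_EVENT_IDX_CODE_MASK);
    return (type << EX_EVENT_IDX_TYPE_SHIFT) | code;
}

/* Extract event_idx.type (bits [19:16]). */
static inline uint32_t ex_event_idx_type(uint32_t event_idx)
{
    return (event_idx >> EX_EVENT_IDX_TYPE_SHIFT) & EX_EVENT_IDX_TYPE_MASK;
}

/* Extract event_idx.code (bits [15:0]). */
static inline uint32_t ex_event_idx_code(uint32_t event_idx)
{
    return event_idx & EX_EVENT_IDX_CODE_MASK;
}
```

For example, a SOFTWARE event (type 0xf) with code 2 would be encoded as
ex_event_idx(0xf, 2).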

If event_idx.type == 0x0 then it is HARDWARE event. For HARDWARE event,
the event_info is optional and can be zero whereas the event_idx.code
can be one of the following values:
enum sbi_pmu_hw_id {
    SBI_PMU_HW_CPU_CYCLES              = 0,
    SBI_PMU_HW_INSTRUCTIONS            = 1,
    SBI_PMU_HW_CACHE_REFERENCES        = 2,
    SBI_PMU_HW_CACHE_MISSES            = 3,
    SBI_PMU_HW_BRANCH_INSTRUCTIONS     = 4,
    SBI_PMU_HW_BRANCH_MISSES           = 5,
    SBI_PMU_HW_BUS_CYCLES              = 6,
    SBI_PMU_HW_STALLED_CYCLES_FRONTEND = 7,
    SBI_PMU_HW_STALLED_CYCLES_BACKEND  = 8,
    SBI_PMU_HW_REF_CPU_CYCLES          = 9,
    SBI_PMU_HW_MAX,                    /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)

If event_idx.type == 0x1 then it is HARDWARE CACHE event. For HARDWARE
CACHE event, the event_info is optional and can be zero whereas the
event_idx.code is encoded as follows:
event_idx.code[15:3] = cache_id
event_idx.code[2:1] = op_id
event_idx.code[0:0] = result_id
enum sbi_pmu_hw_cache_id {
    SBI_PMU_HW_CACHE_L1D  = 0,
    SBI_PMU_HW_CACHE_L1I  = 1,
    SBI_PMU_HW_CACHE_LL   = 2,
    SBI_PMU_HW_CACHE_DTLB = 3,
    SBI_PMU_HW_CACHE_ITLB = 4,
    SBI_PMU_HW_CACHE_BPU  = 5,
    SBI_PMU_HW_CACHE_NODE = 6,
    SBI_PMU_HW_CACHE_MAX, /* non-ABI */
};
enum sbi_pmu_hw_cache_op_id {
    SBI_PMU_HW_CACHE_OP_READ     = 0,
    SBI_PMU_HW_CACHE_OP_WRITE    = 1,
    SBI_PMU_HW_CACHE_OP_PREFETCH = 2,
    SBI_PMU_HW_CACHE_OP_MAX,     /* non-ABI */
};
enum sbi_pmu_hw_cache_op_result_id {
    SBI_PMU_HW_CACHE_RESULT_ACCESS = 0,
    SBI_PMU_HW_CACHE_RESULT_MISS   = 1,
    SBI_PMU_HW_CACHE_RESULT_MAX,   /* non-ABI */
};
(NOTE: Same as <linux_source>/include/uapi/linux/perf_event.h)
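
A minimal sketch of the HARDWARE CACHE encoding described above, assuming
the three IDs are simply shifted into the bit positions given for
event_idx.code. The helper names are hypothetical; the field positions
and the type value 0x1 are from the proposal text.

```c
#include <assert.h>
#include <stdint.h>

/* Build event_idx.code for a HARDWARE CACHE event (hypothetical helper).
 * Layout from the proposal: cache_id [15:3], op_id [2:1], result_id [0:0]. */
static inline uint32_t ex_cache_event_code(uint32_t cache_id,
                                           uint32_t op_id,
                                           uint32_t result_id)
{
    assert(cache_id <= 0x1fffU); /* 13-bit field */
    assert(op_id <= 0x3U);       /* 2-bit field */
    assert(result_id <= 0x1U);   /* 1-bit field */
    return (cache_id << 3) | (op_id << 1) | result_id;
}

/* Full 20-bit event_idx for a cache event: type 0x1 in bits [19:16]. */
static inline uint32_t ex_cache_event_idx(uint32_t cache_id,
                                          uint32_t op_id,
                                          uint32_t result_id)
{
    return (0x1U << 16) | ex_cache_event_code(cache_id, op_id, result_id);
}
```

With the enum values listed above, a DTLB (3) write (1) miss (1) event
would use ex_cache_event_idx(3, 1, 1).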

If event_idx.type == 0x2 then it is HARDWARE RAW event. For HARDWARE
RAW event, the event_idx.code should be zero and the event_info
parameter passed to SBI_PMU_COUNTER_CONFIG_MATCHING call (described
below) will have the RAW event value to be programmed in MHPMEVENT
CSR (i.e. the SBI implementation will not derive MHPMEVENT CSR value
from event_idx and event_info).

If event_idx.type == 0xf then it is SOFTWARE event. For SOFTWARE
event, the event_info is optional and can be zero whereas the
event_idx.code can be one of the following:
enum sbi_pmu_sw_id {
    SBI_PMU_SW_MISALIGNED_LOAD        = 0,
    SBI_PMU_SW_MISALIGNED_STORE       = 1,
    SBI_PMU_SW_ILLEGAL_INSN           = 2,
    SBI_PMU_SW_LOCAL_SET_TIMER        = 3,
    SBI_PMU_SW_LOCAL_IPI              = 4,
    SBI_PMU_SW_LOCAL_FENCE_I          = 5,
    SBI_PMU_SW_LOCAL_SFENCE_VMA       = 6,
    SBI_PMU_SW_LOCAL_SFENCE_VMA_ASID  = 7,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA      = 8,
    SBI_PMU_SW_LOCAL_HFENCE_GVMA_VMID = 9,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA      = 10,
    SBI_PMU_SW_LOCAL_HFENCE_VVMA_ASID = 11,
    SBI_PMU_SW_MAX,                   /* non-ABI */
};

In future, more events can be defined without breaking SBI call
backward-compatibility.

Using above definitions of counter_idx, event_idx, and event_info
we can potentially have following SBI calls:

1. SBI_PMU_NUM_COUNTERS
   Return the number of COUNTERs

2. SBI_PMU_COUNTER_GET_CSR
   This call takes one parameter:
      1) counter_idx
   Provide the CSR_Number and CSR_Width of underlying counter.
   The value returned by SBI call is encoded as follows:
      return_value[11:0] = CSR_Number
      return_value[19:12] = CSR_Width (Number of bits implemented in HW)
      return_value[XLEN-1:20] = Reserved
   If CSR_Number == 0xfff then it is SOFTWARE counter otherwise it is
   HARDWARE counter. This SBI call will fail for counters which are not
   present.

3. SBI_PMU_COUNTER_CONFIG_MATCHING
   This call takes four parameters:
      1) counter_idx_base
      2) counter_idx_mask
      3) event_idx
      4) event_info
   Find and configure a counter from a set of counters which can monitor
   specified event. The counter_idx_base and counter_idx_mask parameters
   represent the set of counters whereas the event_idx and event_info
   represent the event to monitor. Upon success the SBI call will return
   the counter_idx of the counter which has been configured to monitor
   specified event.  This SBI call will fail if it is unable to find a
   counter which can monitor specified event or the set of counters
   specified via counter_idx_base and counter_idx_mask has an invalid
   counter.

4. SBI_PMU_COUNTER_SET_PHYS_ADDR
   This call takes two parameters:
      1) counter_idx
      2) 8byte aligned physical address
   It will set the physical address of memory location where the SBI
   implementation will write the 64bit SOFTWARE counter. This SBI call
   is only for counters not mapped to any CSR (i.e. only for counters
   with CSR_Number == 0xfff).

5. SBI_PMU_COUNTER_START
   This call takes two parameters:
      1) counter_idx
      2) initial_value
   It will inform SBI implementation to start/enable specified counter
   with specified initial value. This SBI call will fail for counters
   which are not present.

6. SBI_PMU_COUNTER_STOP
   This call takes one parameter:
      1) counter_idx
   It will inform SBI implementation to stop/disable specified counter
   on the calling HART. This SBI call will fail for counters which are
   not present.
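
To make the SBI_PMU_COUNTER_GET_CSR return encoding (call 2 above) more
concrete, S-mode software might decode the returned value roughly as
follows. This is only a sketch: the struct and function names are
hypothetical, and it assumes the raw return value is already available
as a 64-bit integer. How the SBI call itself is issued and how errors
are reported is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical container for the decoded fields. */
struct ex_counter_csr {
    uint32_t csr_number;  /* return_value[11:0] */
    uint32_t csr_width;   /* return_value[19:12], bits implemented in HW */
    bool     is_software; /* true when CSR_Number == 0xfff */
};

/* Decode the value returned by SBI_PMU_COUNTER_GET_CSR.
 * Bits [XLEN-1:20] are reserved and ignored here. */
static inline void ex_decode_get_csr(uint64_t ret, struct ex_counter_csr *out)
{
    assert(out != NULL);
    out->csr_number  = (uint32_t)(ret & 0xfffU);
    out->csr_width   = (uint32_t)((ret >> 12) & 0xffU);
    out->is_software = (out->csr_number == 0xfffU);
}
```

A driver could run this once per counter_idx at probe time and keep the
decoded result, so that HARDWARE counters are read directly from the CSR
and SOFTWARE counters are routed to SBI_PMU_COUNTER_SET_PHYS_ADDR.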

The OpenSBI (M-mode runtime firmware) Development Notes:

1. The OpenSBI firmware will translate event_idx and event_info into
   platform dependent MHPMEVENT CSR value before starting/enabling a
   HARDWARE counter.

2. The OpenSBI firmware will need to know following platform dependent
   information:
   A) Possible event_idx values allowed (or supported) by a HARDWARE
      counter (i.e. MHPMCOUNTER)
   B) Mapping of event_idx for HARDWARE/CACHE event to MHPMEVENT CSR
      value. This is optional and by default OpenSBI will write a value
      <xyz> to MHPMEVENT CSR where lower 20 bits of <xyz> is event_idx
      and upper XLEN-20 bits of <xyz> are lower XLEN-20 bits of event_info
   C) Additional platform-specific programming required for selecting
      an event_idx + event_info combination. This is also optional for
      a platform.

3. All platform dependent information mentioned above can be obtained
   by OpenSBI firmware from platform specific code. The DT/ACPI can
   also be used to describe 2.A and 2.B mentioned above but 2.C will
   always require platform specific code.
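
The default MHPMEVENT value mentioned in 2.B above could be computed as
sketched below. This assumes XLEN = 64 and applies only to the default
mapping for HARDWARE/CACHE events (HARDWARE RAW events pass event_info
through unchanged, as described earlier). The function name is
hypothetical and this is not taken from the OpenSBI sources.

```c
#include <assert.h>
#include <stdint.h>

/* Default MHPMEVENT value for XLEN = 64 (hypothetical helper):
 *   bits [19:0]  = event_idx
 *   bits [63:20] = lower 44 bits of event_info
 * Upper bits of event_info that do not fit are shifted out. */
static inline uint64_t ex_default_mhpmevent(uint32_t event_idx,
                                            uint64_t event_info)
{
    assert(event_idx <= 0xfffffU); /* event_idx is 20 bits wide */
    return (event_info << 20) | (uint64_t)event_idx;
}
```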

Linux RISC-V PMU Driver Development Notes:

1. Driver probe
   The Linux RISC-V driver can be platform driver with "riscv,pmu"
   as DT compatible string and optional "interrupts" DT property. The
   "interrupts" DT property if available should specify overflow
   interrupt for each HART. When "interrupts" DT property is present,
   we might also need another DT property for mapping HARTID to entries
   in "interrupts" DT property. The platform driver probe will:
   A) Need to ensure that underlying SBI implementation provides
      SBI PMU extension using sbi_probe_extension() API of arch/riscv.
   B) Detect number of counters using SBI_PMU_NUM_COUNTERS call
   C) Get CSR details of each counter using SBI_PMU_COUNTER_GET_CSR
      call. If the counter is a SOFTWARE counter then use the
      SBI_PMU_COUNTER_SET_PHYS_ADDR call to set memory location
      of counter. The driver can skip this in driver probe and
      instead do this lazily in add() callback mentioned below.

2. event_init() callback
   The event_init() callback will primarily translate user-space
   perf_event_attr to SBI PMU event_idx and event_info. It can do
   this in the following way:
   A) perf_event_attr.type == PERF_TYPE_HARDWARE
      event_idx.type = 0x0
      event_idx.code = Value from enum sbi_pmu_hw_id based on
                       perf_event_attr.config
      event_info = 0
   B) perf_event_attr.type == PERF_TYPE_HW_CACHE
      event_idx.type = 0x1
      event_idx.code.cache_id = Value from enum sbi_pmu_hw_cache_id
                                based on perf_event_attr.config
      event_idx.code.op_id = Value from enum sbi_pmu_hw_cache_op_id
                             based on perf_event_attr.config
      event_idx.code.result_id = Value from enum
                                 sbi_pmu_hw_cache_op_result_id
                                 based on perf_event_attr.config
      event_info = 0
   C) perf_event_attr.type == PERF_TYPE_RAW and
      perf_event_attr.config[63:63] == 0
      event_idx.type = 0x2
      event_idx.code = 0x0
      event_info = perf_event_attr.config[62:0]
   D) perf_event_attr.type == PERF_TYPE_RAW and
      perf_event_attr.config[63:63] == 1
      event_idx.type = 0xf
      event_idx.code = Value from enum sbi_pmu_sw_id based on
                       perf_event_attr.config
      event_info = 0
   (Note: event_init() will fail if it is not able to figure out
    event_idx and event_info value corresponding to perf_event_attr)
   (Note: event_init() will not assign counter to perf_event because
    it will be done by add() callback)

3. add() callback
   The add() callback of Linux RISC-V PMU driver will find a free
   counter on current CPU/HART such that the event_idx and event_info
   combination is supported by the counter. To find-and-configure
   a counter to monitor event_idx and event_info combination from
   a set of counters, we will use the SBI_PMU_COUNTER_CONFIG_MATCHING
   call.

4. del() callback
   The del() callback of Linux RISC-V PMU driver will release or
   free the counter.

5. start() callback
   The start() callback of Linux RISC-V PMU driver will start the
   counter using the SBI_PMU_COUNTER_START call.

6. stop() callback
   The stop() callback of Linux RISC-V PMU driver will stop the
   counter using the SBI_PMU_COUNTER_STOP call.
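
The event_init() translation in step 2 above might look roughly like the
sketch below. It is written as a standalone function rather than as real
driver code, and it makes two assumptions that the proposal does not
spell out: (a) for PERF_TYPE_HW_CACHE, perf_event_attr.config uses the
usual perf layout cache_id | (op_id << 8) | (result_id << 16), and (b)
for SOFTWARE events, the sbi_pmu_sw_id value sits in the low bits of
config below bit 63. All names prefixed with ex_/EX_ are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Mirror of the PERF_TYPE_* values in include/uapi/linux/perf_event.h. */
enum { EX_PERF_TYPE_HARDWARE = 0, EX_PERF_TYPE_HW_CACHE = 3, EX_PERF_TYPE_RAW = 4 };

/* Translate (type, config) into event_idx and event_info.
 * Returns 0 on success, -1 if no matching SBI PMU event exists. */
static int ex_event_init_translate(uint32_t type, uint64_t config,
                                   uint32_t *event_idx, uint64_t *event_info)
{
    assert(event_idx != NULL && event_info != NULL);
    *event_info = 0;

    switch (type) {
    case EX_PERF_TYPE_HARDWARE:
        if (config >= 10)                 /* SBI_PMU_HW_MAX */
            return -1;
        *event_idx = (0x0U << 16) | (uint32_t)config;
        return 0;
    case EX_PERF_TYPE_HW_CACHE: {
        uint32_t cache_id  = (uint32_t)(config & 0xff);
        uint32_t op_id     = (uint32_t)((config >> 8) & 0xff);
        uint32_t result_id = (uint32_t)((config >> 16) & 0xff);

        /* Bounds are the *_MAX values of the three cache enums. */
        if (cache_id >= 7 || op_id >= 3 || result_id >= 2)
            return -1;
        *event_idx = (0x1U << 16) | (cache_id << 3) | (op_id << 1) | result_id;
        return 0;
    }
    case EX_PERF_TYPE_RAW:
        if (config >> 63) {
            /* SOFTWARE event: assumed to be encoded in the low bits. */
            uint64_t sw_id = config & ~(1ULL << 63);

            if (sw_id >= 12)              /* SBI_PMU_SW_MAX */
                return -1;
            *event_idx = (0xfU << 16) | (uint32_t)sw_id;
        } else {
            /* HARDWARE RAW event: bit 63 is zero, so config == config[62:0]. */
            *event_idx = 0x2U << 16;
            *event_info = config;
        }
        return 0;
    default:
        return -1;
    }
}
```

Counter assignment is deliberately left out, since the notes above defer
it to the add() callback.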

Regards,
Anup


Re: Proposal v4: SBI PMU Extension

Anup Patel
 

Hi Greg,

 

We want to allow RISC-V implementation specific CSRs (apart from various HPMCOUNTER CSRs) as HARDWARE counters. Due to this reason, we have decoupled counter_idx from CSR number and we treat counter_idx as logical number assigned to HARDWARE/SOFTWARE counter.

 

The S-mode software needs to call SBI_PMU_COUNTER_GET_CSR only once for each counter which can be done at boot-time OR lazily once before using the counter.

 

Regards,

Anup

 

From: Greg Favor <gfavor@...>
Sent: 07 August 2020 11:13
To: Anup Patel <Anup.Patel@...>
Cc: tech-unixplatformspec@...; Atish Patra <Atish.Patra@...>; Andrew Waterman <andrew@...>
Subject: Re: Proposal v4: SBI PMU Extension

 

Anup,

 

Wouldn't software want to more easily and directly know from SBI_PMU_NUM_COUNTERS how many hardware counters and software counters there are, instead of having to then call SBI_PMU_COUNTER_GET_CSR N times to figure that out?

 

Or are you expecting that software is going to have to call SBI_PMU_COUNTER_GET_CSR N times in any case (after first calling SBI_PMU_NUM_COUNTERS to get 'N')?

 

Greg

 

P.S. I'm guessing that the valid/supported counters are the first N value of counter_idx from 0 to N-1.  Yes?

 

