Re: Quality of Service (QoS)
Ved Shanbhogue
On Wed, Nov 10, 2021 at 07:06:41AM -0800, Allen Baum wrote:
There is already a process identifier defined in the architecture (ASID),
though it is local and not global across a system.
I vaguely remember that the IOMMU and/or IOPMP proposals make use of
something similar.
Leveraging off those proposals would seem to be desirable if they fit.
As you rightly pointed out, monitoring or allocation of resources requires a way to identify the originator of the request. Traditionally, as the request proceeds downstream through the network of resources, there is no way to associate the request with a specific application or group of applications. In some usages, in addition to providing differentiated service among applications, the ability to differentiate between resource usage for code execution and for data accesses of the same application may be required.
Presently the ASID is defined to be private to a hart. This was clarified in version 1.11 of the privileged specification, though commentary was added about the possibility of a future global ASID. For QoS purposes, however, the ASID may not lend itself well as an identifier. The system may want to group multiple applications/virtual machines/containers into a resource control group. Further, the ASID does not help differentiate between code execution and data access. One way to address that would be to carry a code/data indicator along with the request, but that may create some inefficiencies: the resource controllers would then need two sets of controls/counters per ID (one for code and the other for data), and when differentiated service for code vs. data is not required, the per-ID code controls/counters would go unused. To support grouping, a lookup table may be employed in hardware to group multiple ASIDs together, but having a lookup table accessed on each request increases hardware complexity, especially for high-speed implementations. So we may want to keep the hardware simpler and let the grouping be done by software.
So to support QoS we may want to provide a mechanism by which an application can be associated with a resource control ID (RCID) and a monitoring counter ID (MCID) that accompany each request made by the application. We would also want a mechanism to associate these IDs with requests made by a device on behalf of the application. Here the term application is used generically to refer to a process, a VM, a container, or other abstractions employed by the system for resource control.
An application would be associated with one RCID and one MCID that accompany its requests for data accesses and a potentially different RCID and MCID that accompany its requests for code accesses. Data accesses include requests generated by load and store instructions as well as the implicit loads and stores to the first-stage and second-stage page tables. Where differentiated QoS for code vs. data is not required, the code and data RCID and MCID may be programmed to be the same.
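To make the code vs. data split concrete, here is a minimal sketch of how a request might pick up its IDs. The class and field names (`Access`, `QosIds`, `AppQos`) are made up for illustration and are not part of any proposal; the only rule taken from the text above is that instruction fetches use the code IDs while loads, stores, and implicit page-table accesses use the data IDs.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Access(Enum):
    FETCH = auto()       # instruction fetch
    LOAD = auto()        # explicit load
    STORE = auto()       # explicit store
    PAGE_WALK = auto()   # implicit first/second-stage page-table access


@dataclass(frozen=True)
class QosIds:
    rcid: int  # resource control ID
    mcid: int  # monitoring counter ID


@dataclass(frozen=True)
class AppQos:
    code: QosIds
    data: QosIds

    def ids_for(self, access: Access) -> QosIds:
        # Only instruction fetches are tagged with the code IDs; everything
        # else, including implicit page-table accesses, counts as data.
        return self.code if access is Access.FETCH else self.data


# Hypothetical application with differentiated code and data IDs.
app = AppQos(code=QosIds(rcid=3, mcid=10), data=QosIds(rcid=4, mcid=11))

# Where no differentiation is wanted, both sides carry the same IDs.
uniform = AppQos(code=QosIds(rcid=5, mcid=12), data=QosIds(rcid=5, mcid=12))
```

With `uniform`, every access type yields the same pair, which corresponds to programming the code and data IDs to the same values.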
A group of applications may be associated with the same RCID, and one or more of these applications may be associated with a unique MCID for code and/or data. This allows measuring the resource consumption of a subset of the applications that share an RCID, to determine whether the resource partitioning is optimal and to make adjustments as needed.
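As a rough illustration of the grouping being done in software, the sketch below keeps a table from application to (RCID, MCID) and uses per-MCID usage numbers to see how a shared allocation is split among its members. The application names, the ID values, and the `usage_by_mcid` readings are all invented for the example, and it assumes each member of a group has its own MCID.

```python
from collections import defaultdict

# Hypothetical software-side assignment: application -> (RCID, MCID).
assignments = {
    "web":   (1, 1),
    "db":    (1, 2),   # shares RCID 1 with "web" but is monitored separately
    "batch": (2, 3),
}


def by_rcid(assignments):
    """Group application names by the RCID they share."""
    groups = defaultdict(list)
    for app, (rcid, _mcid) in assignments.items():
        groups[rcid].append(app)
    return dict(groups)


def usage_share(assignments, usage_by_mcid, rcid):
    """Fraction of a group's measured usage attributable to each member.

    Assumes every member of the group has a unique MCID.
    """
    members = {app: mcid for app, (r, mcid) in assignments.items() if r == rcid}
    total = sum(usage_by_mcid.get(mcid, 0) for mcid in members.values())
    if total == 0:
        return {app: 0.0 for app in members}
    return {app: usage_by_mcid.get(mcid, 0) / total
            for app, mcid in members.items()}


# Made-up counter readings, e.g. cache lines occupied per MCID.
readings = {1: 300, 2: 100, 3: 50}
shares = usage_share(assignments, readings, rcid=1)
```

If one member turns out to dominate the shared allocation, software could move it to its own RCID without any change to the hardware.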
The RCID and MCID would need a global scope across all caches, interconnects, and memory controllers that a request may access. To support maximum flexibility, the RCID and MCID may be defined to be up to 16 bits wide, but an implementation could limit them to a more reasonable number, e.g. 64 or 128 resource control IDs.
These IDs may thus be programmed into a set of CSRs (one each for M/S/VS mode), where each CSR is 64 bits wide and holds the RCID and MCID for code accesses and for data accesses. For device-initiated accesses, these IDs could be programmed into the IOMMU, which would then associate them with the requests made by the device. Other implementations may support configuring these IDs directly into the devices themselves.
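A small sketch of what such a 64-bit CSR value could look like, with four 16-bit fields. The field order and the names `data_rcid`, `data_mcid`, `code_rcid`, and `code_mcid` are purely my assumption for illustration; nothing above fixes a layout.

```python
ID_BITS = 16
ID_MASK = (1 << ID_BITS) - 1

# Hypothetical bit offsets of each field within the 64-bit CSR value.
FIELDS = {"data_rcid": 0, "data_mcid": 16, "code_rcid": 32, "code_mcid": 48}


def pack_qos_csr(**ids):
    """Pack the four IDs into one 64-bit integer, rejecting oversized IDs."""
    value = 0
    for name, shift in FIELDS.items():
        field = ids[name]
        if not 0 <= field <= ID_MASK:
            raise ValueError(f"{name}={field} does not fit in {ID_BITS} bits")
        value |= field << shift
    return value


def unpack_qos_csr(value):
    """Split a 64-bit CSR value back into its four ID fields."""
    return {name: (value >> shift) & ID_MASK for name, shift in FIELDS.items()}


csr = pack_qos_csr(data_rcid=1, data_mcid=2, code_rcid=3, code_mcid=4)
```

An implementation that supports only, say, 64 RCIDs would use fewer bits per field than this sketch allows.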
regards
ved
On Wed, Nov 10, 2021 at 6:40 AM mark <markhimelstein@...> wrote:
Vedvyas,
Thank you.
We have a RAS committee in the org that has been approved by the BOD but has not
yet been formed, and QoS is one part of what it was intended to look at (as part
of availability).
I wonder if we can't use this as an opportunity to initiate this
committee. Once it has strategy, gaps, and priorities (through itself for a
SIG), the idea is that the committee works with Priv to create a TG.
We would need an acting committee chair to drive this. The policy is here:
<https://docs.google.com/document/d/14ZpciYwIzmuiB92_hKfwTAttTnc3rsLbWI-CpC7MdC8/edit?usp=sharing>
Mark
On Wed, Nov 10, 2021 at 6:11 AM Vedvyas Shanbhogue <ved@...> wrote:
Greetings all,
I would like to start a discussion on supporting QoS capabilities in
RISC-V architecture. I hope I am posting on the right list/TG/HC.
First, a short background:
Quality of Service (QoS) is the minimal end-to-end performance that is
guaranteed in advance by a service level agreement (SLA) to an application.
The performance may be measured in the form of metrics such as instructions
per cycle (IPC), latency of servicing work, etc.
Various factors such as the available cache capacity, memory bandwidth,
interconnect bandwidth, CPU cycles, system memory, etc. affect the
performance in a computing system that runs multiple applications
concurrently. Further, when arbitration is required for shared
resources, the prioritization of the application's requests against other
competing requests may also affect the performance of the application.
When multiple applications are running concurrently on modern processors
with large core counts, multiple cache hierarchies, and multiple memory
controllers, the performance of an application becomes less deterministic
or even non-deterministic as the performance depends on the behavior of all
the other applications in the machine that contend for the shared resources
leading to interference. In many deployment scenarios such as public cloud
servers the application owner may not be in control of the type and
placement of other applications in the platform.
A typical use model involves profiling the resource usage of the
application to meet desired performance goals and to establish resource
allocations/limits for the application to achieve those goals.
System software can control some of the resources available to the
application, such as the number of hardware threads made available for
execution, the amount of system memory allocated to the application, the
number of CPU cycles provided for execution, etc., but presently lacks the
capabilities to control interference to an application and thereby reduce
the variability in performance experienced by an application due to other
applications' use of cache capacity, memory bandwidth, interconnect bandwidth, etc.
Some thoughts on supporting such capability:
1. To provide differentiated services in the platform, a CSR may be
provided to associate an identifier with an application (e.g. process, VM,
container, etc.). This identifier is then associated with requests to access
the shared resources such as caches, interconnects, memory, etc.
2. Configuration registers and counters are needed in resource
controllers (e.g. memory, cache, interconnect, etc.) to set up resource
allocations and monitor resource usage. The controllers may use the
identifiers associated with requests to enforce the configured resource
allocations and/or monitor the resource consumption.
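To illustrate point 2, here is a toy model of a resource controller that uses the identifier carried by each request both to count usage and to enforce a configured limit. It is only a sketch under my own assumptions: the class name, the notion of a per-identifier unit limit, and the policy of giving unconfigured identifiers no allocation are all invented for the example.

```python
class ResourceControllerModel:
    """Toy controller: one limit and one counter per request identifier."""

    def __init__(self, limits):
        self.limits = dict(limits)  # identifier -> max units it may hold
        self.held = {}              # identifier -> units currently held
        self.counters = {}          # identifier -> requests observed

    def request(self, ident):
        """Count the request; return True if a unit could be allocated."""
        self.counters[ident] = self.counters.get(ident, 0) + 1
        used = self.held.get(ident, 0)
        # Identifiers with no configured limit get no allocation here.
        if used >= self.limits.get(ident, 0):
            return False
        self.held[ident] = used + 1
        return True

    def release(self, ident):
        """Give back one unit previously allocated to this identifier."""
        if self.held.get(ident, 0) > 0:
            self.held[ident] -= 1


ctl = ResourceControllerModel({7: 2})
results = [ctl.request(7) for _ in range(3)]
```

In this run the third request from identifier 7 is counted but not granted, since its limit of two units is already used up.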
Please share your comments and feedback. If there is WIP already please
point me to that.
regards
ved