
Kumar Sankaran
Signed-off-by: Kumar Sankaran <ksankaran@...>
---
riscv-platform-spec.adoc | 42 ++++++++++++++++++++++++++--------------
1 file changed, 27 insertions(+), 15 deletions(-)
diff --git a/riscv-platform-spec.adoc b/riscv-platform-spec.adoc
index 4c356b8..d779452 100644
--- a/riscv-platform-spec.adoc
+++ b/riscv-platform-spec.adoc
@@ -19,18 +19,6 @@
// table of contents
toc::[]
-// document copyright and licensing information
-include::licensing.adoc[]
-
-// changelog for the document
-include::changelog.adoc[]
-
-// Introduction: describe the intent and purpose of the document
-include::introduction.adoc[]
-
-// Profiles: (NB: content from very first version)
-include::profiles.adoc[]
-
== Introduction
The platform specification defines a set of platforms that specify requirements
for interoperability between software and hardware. The platform policy
@@ -68,11 +56,13 @@ The M platform has the following extensions:
|SBI | Supervisor Binary Interface
|UEFI | Unified Extensible Firmware Interface
|ACPI | Advanced Configuration and Power Interface
+|APEI | ACPI Platform Error Interfaces
|SMBIOS | System Management Basic I/O System
|DTS | Devicetree source file
|DTB | Devicetree binary
|RVA22 | RISC-V Application 2022
|EE | Execution Environment
+|OSPM | Operating System Power Management
|RV32GC | RISC-V 32-bit general purpose ISA described as RV32IMAFDC.
|RV64GC | RISC-V 64-bit general purpose ISA described as RV64IMAFDC.
|===
@@ -87,6 +77,7 @@ The M platform has the following extensions:
|link:[RVA22 Specification] | TBD
|link:https://arm-software.github.io/ebbr/[EBBR Specification] | v2.0.0-pre1
|link:https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf[ACPI Specification] | v6.4
+|link:https://uefi.org/specs/ACPI/6.4/18_ACPI_Platform_Error_Interfaces/ACPI_PLatform_Error_Interfaces.html[APEI Specification] | v6.4
|link:https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf[SMBIOS Specification] | v3.4.0
|link:[Platform Policy] | TBD
|===
@@ -504,6 +495,30 @@ delegate the virtual supervisor timer interrupt
to 'VS' mode.
* IOMMU
==== RAS
+All the below mentioned RAS features are required for the OS-A platform server
+extension
+
+* Main memory must be protected with SECDED-ECC
+
+* All cache structures must be protected
+
+** single-bit errors must be detected and corrected
+
+** multi-bit errors can be detected and reported
+
+* There must be memory-mapped RAS registers associated with these protected
+structures to log detected errors with information about the type and location
+of the error
+
+* The platform must support the APEI specification to convey all error
+information to OSPM
+
+* Correctable errors must be reported by hardware and either be corrected or
+recovered by hardware, transparent to system operation and to software
+
+* Hardware must provide status of these correctable errors via RAS registers
+
+* Uncorrectable errors must be reported by the hardware via RAS error
+registers for system software to take the needed corrective action
+
+* Attempted use of corrupted (uncorrectable) data must result in a precise
+exception on that instruction with a distinguishing custom exception cause
+code
+
+* Errors logged in RAS registers must be able to generate an interrupt request
+to the system interrupt controller that may be directed to either M-mode or
+S/HS-mode for firmware-first versus OS-first error reporting
+
+* PCIe AER capability is required
+
// M Platform
== M Platform
@@ -593,6 +608,3 @@ also implement PMP support.
When PMP is supported it is recommended to include at least 4 regions, although
if possible more should be supported to allow more flexibility. Hardware
implementations should aim for supporting at least 16 PMP regions.
-
-// acknowledge all of the contributors
-include::contributors.adoc[]
--
2.21.0

Abner Chang
Signed-off-by: Kumar Sankaran <ksankaran@...>
---
riscv-platform-spec.adoc | 42 ++++++++++++++++++++++++++--------------
1 file changed, 27 insertions(+), 15 deletions(-)
diff --git a/riscv-platform-spec.adoc b/riscv-platform-spec.adoc
index 4c356b8..d779452 100644
--- a/riscv-platform-spec.adoc
+++ b/riscv-platform-spec.adoc
@@ -19,18 +19,6 @@
// table of contents
toc::[]
-// document copyright and licensing information
-include::licensing.adoc[]
-
-// changelog for the document
-include::changelog.adoc[]
-
-// Introduction: describe the intent and purpose of the document
-include::introduction.adoc[]
-
-// Profiles: (NB: content from very first version)
-include::profiles.adoc[]
-
== Introduction
The platform specification defines a set of platforms that specify requirements
for interoperability between software and hardware. The platform policy
@@ -68,11 +56,13 @@ The M platform has the following extensions:
|SBI | Supervisor Binary Interface
|UEFI | Unified Extensible Firmware Interface
|ACPI | Advanced Configuration and Power Interface
+|APEI | ACPI Platform Error Interfaces
|SMBIOS | System Management Basic I/O System
|DTS | Devicetree source file
|DTB | Devicetree binary
|RVA22 | RISC-V Application 2022
|EE | Execution Environment
+|OSPM | Operating System Power Management
|RV32GC | RISC-V 32-bit general purpose ISA described as RV32IMAFDC.
|RV64GC | RISC-V 64-bit general purpose ISA described as RV64IMAFDC.
|===
@@ -87,6 +77,7 @@ The M platform has the following extensions:
|link:[RVA22 Specification]
| TBD
|link:https://arm-software.github.io/ebbr/[EBBR Specification]
| v2.0.0-pre1
|link:https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf[ACPI
Specification] | v6.4
+|link:https://uefi.org/specs/ACPI/6.4/18_ACPI_Platform_Error_Interfaces/ACPI_PLatform_Error_Interfaces.html[APEI
Specification] | v6.4
|link:https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf[SMBIOS
Specification] | v3.4.0
|link:[Platform Policy]
| TBD
|===
@@ -504,6 +495,30 @@ delegate the virtual supervisor timer interrupt
to 'VS' mode.
* IOMMU
==== RAS
+All the below mentioned RAS features are required for the OS-A platform server
+extension
+
+* Main memory must be protected with SECDED-ECC
+
+* All cache structures must be protected
+
+** single-bit errors must be detected and corrected
+
+** multi-bit errors can be detected and reported
+
+* There must be memory-mapped RAS registers associated with these protected
+structures to log detected errors with information about the type and location
+of the error
+
+* The platform must support the APEI specification to convey all error
+information to OSPM
+
+* Correctable errors must be reported by hardware and either be corrected or
+recovered by hardware, transparent to system operation and to software
+
+* Hardware must provide status of these correctable errors via RAS registers
+
+* Uncorrectable errors must be reported by the hardware via RAS error
+registers for system software to take the needed corrective action
+
+* Attempted use of corrupted (uncorrectable) data must result in a precise
+exception on that instruction with a distinguishing custom exception cause
+code
+
+* Errors logged in RAS registers must be able to generate an interrupt request
+to the system interrupt controller that may be directed to either M-mode or
+S/HS-mode for firmware-first versus OS-first error reporting
+
+* PCIe AER capability is required
Hi Kumar, I would like to add something. In order to support the OEM RAS policy:

- The platform should provide the capability to configure each RAS error to trigger a firmware-first or OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or hide the error from S/HS software. This requires mechanisms provided by the platform, and those mechanisms should be protected by M-mode.
- Each RAS error should be maskable through RAS configuration registers.
- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.

For PCIe RAS:

- The baseline PCIe error or AER interrupt should be able to be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to the OEM RAS policy.

Besides memory and PCIe RAS, do we have RAS errors for the processor/hart, such as IPI errors or CE/UC/UCR errors local to a hart?

Regards,
Abner
// M Platform
== M Platform
@@ -593,6 +608,3 @@ also implement PMP support.
When PMP is supported it is recommended to include at least 4 regions, although
if possible more should be supported to allow more flexibility. Hardware
implementations should aim for supporting at least 16 PMP regions.
-
-// acknowledge all of the contributors
-include::contributors.adoc[]
--
2.21.0
- The platform should provide the capability to configure each RAS error to trigger a firmware-first or OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or hide the error from S/HS software. This requires mechanisms provided by the platform, and those mechanisms should be protected by M-mode.

I would have thought that this is just a software issue. What kind of hardware mechanism do you picture being needed?

- Each RAS error should be maskable through RAS configuration registers.

By "mask" do you mean masking the generation of an error interrupt?

- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.

Wouldn't the TEE be running in M-mode? Or where is it expected to be running?

For PCIe RAS:

- The baseline PCIe error or AER interrupt should be able to be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to the OEM RAS policy.

In x86 and Arm platforms, doesn't the OS pretty much always handle PCIe AER errors (i.e. OS-first for this class of errors)? (I was reading an Intel overview doc recently that essentially said that, irrespective of whether other classes of errors are OS-first or firmware-first.)

Besides memory and PCIe RAS, do we have RAS errors for the processor/hart, such as IPI errors or CE/UC/UCR errors local to a hart?

Definitely there will be processor/hart errors. Presumably each hart would output one or more RAS interrupt request signals.

Greg
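To make the "memory-mapped RAS registers" requirement from the patch concrete, here is a minimal Python model of one hypothetical error record. The field layout below is purely illustrative (the spec under discussion defines no layout): a status word with valid, corrected, error-type, and location fields that hardware would write on a detected error and an M-mode or S/HS-mode handler would read back.

```python
# Hypothetical RAS error-record layout (illustrative only; not from any spec):
#   bit 63        V   - record valid
#   bit 62        CE  - 1 = corrected error, 0 = uncorrectable error
#   bits 59:56    error type (e.g. 1 = cache data, 2 = main memory)
#   bits 47:0     physical address / location of the error
V_BIT, CE_BIT = 63, 62
TYPE_SHIFT, TYPE_MASK = 56, 0xF
ADDR_MASK = (1 << 48) - 1

def log_error(corrected, err_type, addr):
    """What hardware would write into the status register on a detected error."""
    return ((1 << V_BIT) | (corrected << CE_BIT)
            | ((err_type & TYPE_MASK) << TYPE_SHIFT) | (addr & ADDR_MASK))

def decode_error(status):
    """What an error handler would read back out; None if no valid record."""
    if not (status >> V_BIT) & 1:
        return None
    return {
        "corrected": bool((status >> CE_BIT) & 1),
        "type": (status >> TYPE_SHIFT) & TYPE_MASK,
        "addr": status & ADDR_MASK,
    }
```

For example, a corrected main-memory error at 0x8000_1000 would be logged as `log_error(1, 2, 0x80001000)` and decoded back to its type and location by the handler.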

Allen Baum
Is it acceptable to everyone that all single-bit errors on all caches must be correctable? That really affects designs in fundamental ways for L1 caches (as opposed to simply detecting); it is not as big a concern for L2 and above. Speaking from my Intel experience, the rule was expressed as failures per year, and if an L1 cache was small enough to stay under that number, then it didn't need correction. So it might be useful to have a measurement baseline like that, rather than an absolute requirement.
The argument is: why are you requiring ECC correction on this, and not on the register file or CSRs? The reason is that they're small enough that failures are unlikely, and that's how your rationale should be stated. There will be platforms that are much more demanding (safety critical) where duplication or majority voting is required; I didn't think that we were talking about those application areas.
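For readers less familiar with the mechanism being debated, the following is a minimal Hamming SEC-DED sketch over one byte. It is purely illustrative (real memory controllers use wider codes such as (72,64)), but it shows why single-bit errors are correctable while double-bit errors are only detectable.

```python
# Minimal Hamming SEC-DED over an 8-bit word (illustrative only).
# Codeword layout: index 0 holds the overall parity bit; indices 1..12 hold a
# (12,8) Hamming code with parity bits at positions 1, 2, 4, 8.
DATA_POS = [3, 5, 6, 7, 9, 10, 11, 12]

def secded_encode(data8):
    code = [0] * 13
    for i, p in enumerate(DATA_POS):          # place the 8 data bits
        code[p] = (data8 >> i) & 1
    for p in (1, 2, 4, 8):                    # compute Hamming parity bits
        for j in range(1, 13):
            if j != p and (j & p):
                code[p] ^= code[j]
    code[0] = sum(code[1:]) & 1               # overall parity, enables DED
    return code

def secded_decode(code):
    code = code[:]
    syndrome = 0
    for p in (1, 2, 4, 8):                    # recompute each parity check
        parity = 0
        for j in range(1, 13):
            if j & p:
                parity ^= code[j]
        if parity:
            syndrome |= p
    overall = sum(code) & 1                   # parity over all 13 bits
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:                        # single-bit error: correct it
        code[syndrome if syndrome else 0] ^= 1
        status = "corrected"
    else:                                     # syndrome != 0, overall == 0:
        status = "uncorrectable"              # two bits flipped, detect only
    data = 0
    for i, p in enumerate(DATA_POS):
        data |= code[p] << i
    return data, status
```

Flipping any one codeword bit yields a nonzero syndrome (or overall-parity mismatch) that pinpoints and repairs the bit; flipping two bits leaves overall parity unchanged while the syndrome is nonzero, which is exactly the "detect but don't correct" case the requirement distinguishes.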
Is it acceptable to everyone that all single bit errors on all caches must be correctable?

Nowadays single-bit errors are far from rare. There will always be people that run Linux and are willing to accept occasional silent corruptions and whatever mysterious application/data corruptions occur as a result. But for a standardized server-class platform spec, this is a rather low "table stakes" bar to set. Virtually no customer of a "server-class" platform will be comfortable without that (especially since the x86 and Arm alternatives provide at least that).

That really affects designs in fundamental ways for L1 caches (as opposed to simply detecting).

Parity (and invalidate on error detection) suffices for I and WT D caches, and ECC is used on WB D caches, even L1 D caches (which is one argument for doing a WT L1 D cache with parity, but the majority of people still do WB L1 D caches with ECC). Understandably some people don't want to deal with ECC on a WB L1 D cache, and parity or nothing may be fine for less-than-server-class systems.

Not as big a concern for L2 and above. Speaking from my Intel experience, the rule was expressed as failures per year - and if an L1 cache was small enough to exceed that number, then it didn't need correction.

Somewhat analogous, TSMC imposes similarly expressed requirements wrt having redundancy in all the RAMs. Even just one non-redundant 64 KiB cache can pretty much use up what is allowed to not have redundancy. In any case, the Base platform spec should allow people to make whatever choice they want (and live with the consequences). But to be competitive and to meet customer expectations (especially in a multi-core world), the Server spec needs to require a higher-than-nothing bar.

So, it might be useful to have a measurement baseline like that, rather than an absolute requirement.

A functional requirement is simple to specify and aligns with standard industry practices. The alternatives get more involved and in practice won't provide much of any value over the functional requirement (for server-class systems).

The argument is why are you requiring ecc correction on this - and not the register file, or CSRs?

This is a baseline requirement, aligned with common/dominant industry practice. Conversely, it is not a dominant industry practice to protect flop-based register files (or flop-based storage structures in general). (Latch-based register files, depending on whether the bitcell is more SRAM-like or flop-like, fall in one category or the other.)

The reason is they're small enough that failures are unlikely - and that's what your rationale should be stated.

Nowadays even the aggregate error rate or MTBF due to flop soft errors is not small. But thankfully for most designs that MTBF component is acceptable within typical MTBF budgets. As far as instead specifying an MTBF requirement, one then gets into system-wide issues and overall MTBF budgets, where it gets spent, what about the technology dependence of all this, and so on. Plus, that would effectively provide little guidance to CPU designers as to what their individual MTBF budget is. Or, conversely, one can probably have long discussions/arguments about what the right MTBF number to require at the level of a single CPU core is.

But at the end of the day, very few or virtually no customers of a server-class system are going to accept a product that doesn't even have single-bit error protection on the cache hierarchy.

Greg
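Greg's point about aggregate error rates in a multi-core world can be illustrated with back-of-the-envelope FIT arithmetic. All numbers below are assumptions for illustration, not vendor data:

```python
# Back-of-the-envelope soft-error budget (all numbers are assumptions).
# FIT = failures in 1e9 device-hours.
FIT_PER_MBIT = 600.0          # assumed raw SRAM soft-error rate
L1_KIB, L2_KIB = 64, 1024     # assumed per-core cache capacities
CORES = 64                    # assumed core count of a server SoC

total_mbit = (L1_KIB + L2_KIB) * 8 / 1024 * CORES   # total cache size in Mbit
raw_fit = FIT_PER_MBIT * total_mbit                 # errors per 1e9 hours
mtbf_hours = 1e9 / raw_fit
mtbf_years = mtbf_hours / (24 * 365)

print(f"{total_mbit:.0f} Mbit of cache -> {raw_fit:.0f} FIT "
      f"-> MTBF ~ {mtbf_years:.2f} years without correction")
```

At these assumed rates the machine would see an uncorrected bit flip every few months, which is why a per-cache failures-per-year threshold ends up excusing only very small structures while large caches still need correction.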

Kumar Sankaran
To add to what Greg mentioned below, the RAS features mentioned in the patch are required only for the OS-A platform server extension. We are not mandating any RAS requirements for OS-A base platform compatibility.
Regards,
Kumar
From: Greg Favor <gfavor@...> Sent: Thursday, June 17, 2021 11:54 AM To: Allen Baum <allen.baum@...> Cc: Abner Chang <renba.chang@...>; Kumar Sankaran <ksankaran@...>; tech-unixplatformspec@... Subject: Re: [RISC-V] [tech-unixplatformspec] [PATCH 1/1] RAS features for OS-A platform server extension

Allen Baum
Good answers all around; I didn't pick up on the difference between the OS-A base and OS-A server platforms.
It makes sense in hindsight for the manufacturers to set the MTBF goal and design to meet it. I was concerned about whether that could be met without the complexity of single-bit correction on L1 D caches, but if those are typical of base platforms rather than server platforms anyway, it's not a significant concern in any case.
On Thu, Jun 17, 2021 at 12:01 PM Kumar Sankaran <ksankaran@...> wrote:

Abner Chang
- The platform should provide the capability to configure each RAS error to trigger a firmware-first or OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or hide the error from S/HS software. This requires mechanisms provided by the platform, and those mechanisms should be protected by M-mode.

I would have thought that this is just a software issue. What kind of hardware mechanism do you picture being needed?

That could be:
- If a RAS error triggers M-mode (firmware-first) and firmware decides to expose the error to the OS (which could be configured through a CSR or RAS registers), then the RAS OS interrupt can be triggered when the system exits M-mode.
- Or, if a RAS error triggers management mode in the TEE, then the RAS OS interrupt can be triggered when the system exits the TEE.

The knob for exposing RAS errors to the OS could go with each RAS error configuration register, or with one centralized RAS register or CSR for all RAS errors. Suppose the event that brings the system to the TEE has the highest priority, even when the system is executing in M-mode; this makes sure firmware can address the RAS error immediately when it happens at any privilege level.

- Each RAS error should be maskable through RAS configuration registers.

By "mask" do you mean masking the generation of an error interrupt?

Yes, to mask the RAS error interrupt, or even not to create the log (in RAS status registers or a CSR) when the OEM doesn't consider it a useful or important error for the product.

- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.

Wouldn't the TEE be running in M-mode? Or where is it expected to be running?

Yes, the TEE is running in M-mode, if memory serves me right from the spec. My expectation of the TEE is that there would be an event, triggered by either hardware or software, that brings the system to the TEE no matter which mode the hart is currently running in; I am not sure if this is how the TEE would be implemented.

For PCIe RAS:

- The baseline PCIe error or AER interrupt should be able to be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to the OEM RAS policy.

In x86 and Arm platforms, doesn't the OS pretty much always handle PCIe AER errors (i.e. OS-first for this class of errors)?

Besides correcting the error in firmware, firmware also logs the necessary PCIe error events to the BMC before the OS handles them. The firmware RAS logs can be retrieved out-of-band even when the system is shut down or the OS crashes. This increases diagnosability and decreases the cost of customer service in the field.

Abner
- The platform should provide the capability to configure each RAS error to trigger firmware-first
or
OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processes or just hide the error from S/HS software. This requires some mechanisms provided by the platform and the mechanism should be protected by M-mode.
I would have thought that this is just a software issue. What kind of hardware mechanism do you picture being needed?
That could be:
- If the RAS error triggers M-mode (firmware-first) and firmware decides to expose the error to the OS (which could be configured through a CSR or RAS registers), then the RAS OS interrupt can be triggered when the system exits M-mode.
- Or, if the RAS error triggers management mode in the TEE, then the RAS OS interrupt can be triggered when the system exits the TEE.
The knob for exposing RAS errors to the OS could go with each RAS error configuration register, or be one centralized RAS register or CSR for all RAS errors. Suppose the event that brings the system into the TEE has the highest priority, even when the system is executing in M-mode. This makes sure firmware can address the RAS error immediately when it happens at any privilege level.
Thanks. This does seem to be all a matter of software configuring and handling things appropriately.
- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.
Wouldn't the TEE be running in M-mode? Or where is it expected to be running?
Yes, the TEE would be running in M-mode, if memory serves me right from the spec. My expectation of the TEE is that there would be an event, triggered by either hardware or software, that brings the system into the TEE no matter which mode the hart is currently running in. I am not sure if this is how the TEE would be implemented.
Then this just becomes a matter of software configuring the interrupt controller to direct a given interrupt source to a given privilege mode.
For PCIe RAS,
- The baseline PCIe error or AER interrupt can be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct it, or hide it from S/HS software according to the OEM RAS policy.
In x86 and ARM platforms, doesn't the OS pretty much always handle PCIe AER errors (i.e., OS-first for this class of errors)? (I was reading an Intel overview doc recently that essentially said that, irrespective of whether other classes of errors are OS-first or firmware-first.)
Besides correcting the error in firmware, firmware also logs the necessary PCIe error events to the BMC before the OS handles them. The firmware RAS logs can be retrieved out-of-band even when the system is shut down or the OS crashes. This increases diagnosability and decreases the cost of customer service in the field.
Just fyi, this paper discusses use of both models in the x86 world: a-tour-beyond-bios-implementing-the-acpi-platform-error-interface-with-the-uefi. As a number of us will remember from the ARMv8 days, there were big (as in religious) arguments over which model was the right one to adopt. Ultimately it was accepted that both need to be supported by the architecture. The point being that the OS-A platform spec should support both and not presume one as the one and only answer. Greg
Kumar Sankaran
Greg - Do you have any further comments/responses to Abner's comments below? Abner - my comments inline below. On Fri, Jun 18, 2021 at 9:01 AM Abner Chang <renba.chang@...> wrote:
Greg Favor <gfavor@...> wrote on Friday, June 18, 2021 at 2:03 AM:
On Thu, Jun 17, 2021 at 8:56 AM Abner Chang <renba.chang@...> wrote:
- The platform should provide the capability to configure each RAS error to trigger a firmware-first or OS-first error interrupt. - If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or just hide the error from S/HS software. This requires mechanisms provided by the platform, and those mechanisms should be protected by M-mode.
I would have thought that this is just a software issue. What kind of hardware mechanism do you picture being needed? That could be: if the RAS error triggers M-mode (firmware-first) and firmware decides to expose the error to the OS (which could be configured through a CSR or RAS registers), then the RAS OS interrupt can be triggered when the system exits M-mode. Or, if the RAS error triggers management mode in the TEE, then the RAS OS interrupt can be triggered when the system exits the TEE. The knob for exposing RAS errors to the OS could go with each RAS error configuration register, or be one centralized RAS register or CSR for all RAS errors. Suppose the event that brings the system into the TEE has the highest priority, even when the system is executing in M-mode. This makes sure firmware can address the RAS error immediately when it happens at any privilege level.
I think the primary requirements here are the following: - The platform should provide the capability to configure each RAS error to trigger a firmware-first or OS-first error interrupt. - If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or just hide the error from S/HS software. Is there a need to provide all the other details?
- Each RAS error should be maskable through RAS configuration registers.
By "mask" do you mean masking of generation of an error interrupt? Yes, to mask the RAS error interrupt, or even not create the log (in RAS status registers or a CSR) for an error the OEM doesn't consider useful or important for the product.
This is fine. - We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.
Wouldn't the TEE be running in M-mode? Or where is it expected to be running? Yes, the TEE would be running in M-mode, if memory serves me right from the spec. My expectation of the TEE is that there would be an event, triggered by either hardware or software, that brings the system into the TEE no matter which mode the hart is currently running in. I am not sure if this is how the TEE would be implemented.
Can we summarize the requirement to - RAS errors should be capable of interrupting the TEE. For PCIe RAS: - The baseline PCIe error or AER interrupt can be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct it, or hide it from S/HS software according to the OEM RAS policy.
In x86 and ARM platforms, doesn't the OS pretty much always handle PCIe AER errors (i.e., OS-first for this class of errors)? (I was reading an Intel overview doc recently that essentially said that, irrespective of whether other classes of errors are OS-first or firmware-first.) Besides correcting the error in firmware, firmware also logs the necessary PCIe error events to the BMC before the OS handles them. The firmware RAS logs can be retrieved out-of-band even when the system is shut down or the OS crashes. This increases diagnosability and decreases the cost of customer service in the field.
Abner
The PCIe AER errors have been handled OS-first on x86 systems. If I recall correctly, ARM64 initially made PCIe AER errors firmware-first and then later changed to OS-first to be compliant with what's already out there. The exact manner of handling these PCIe AER errors is also OEM-dependent. Some OEMs will handle it OS-first while making a call to the firmware to take additional corrective action such as notifying the BMC. Some ARM64 implementations handle this firmware-first and notify the BMC and then notify the OS. From a RISC-V platforms requirements perspective, my suggestion is we simply mention the capability of all errors to have support for firmware-first and OS-first and leave it at that.
Besides memory and PCIe RAS, do we have RAS errors for the processor/hart, such as IPI errors or some CE/UC/UCR local to the hart? Definitely there will be processor/hart errors. Presumably each hart would output one or more RAS interrupt request signals.
Greg
Yes, there will be more RAS errors. For the initial spec, we are only making the bare minimum set of RAS features mandatory for the server extension for 2022. We can add more RAS features as things solidify. -- Regards Kumar
On Tue, Jun 22, 2021 at 5:34 PM Kumar Sankaran < ksankaran@...> wrote: I think the primary requirements here are the following:
- The platform should provide the capability to configure each RAS
error to trigger firmware-first or OS-first error interrupt.
Yes. Which is just a software matter of configuring the interrupt controller accordingly.
- If the RAS error is handled by firmware, the firmware should be able
to choose to expose the error to S/HS mode for further processes or
just hide the error from S/HS software.
Is there a need to provide all the other details?
Agreed. The details and mechanics don't need to be discussed (unless they are mandating specific mechanics - which I don't believe is the case).
> Yes, to mask the RAS error interrupt, or even not create the log (in RAS status registers or a CSR) for an error the OEM doesn't consider useful or important for the product.
This is fine
Maybe just say that "Logging and/or reporting of errors can be masked". Can we summarize the requirement to
- RAS errors should be capable of interrupting TEE.
This implies a requirement to have a TEE - and defining what constitutes a compliant TEE in the platform spec. Btw, what distinguishes the TEE from "firmware"? The PCIe AER errors have been handled OS first on X86 systems. If I
recall correctly, ARM64 initially made PCIe AER errors firmware first
and then later changed to OS first to be compliant with what's already
out there.
The exact manner of handling these PCIe AER errors is also OEM
dependent. Some OEMs will handle it OS first while making a call to
the firmware to take additional corrective action of notifying the BMC
and such. Some ARM64 implementations handle this firmware first and
notify the BMC and then notify the OS.
From a RISC-V platforms requirements perspective, my suggestion is we
simply mention the capability of all errors to have support for
firmware first and OS first and leave it at that.
Agreed all around.
Greg
Abner Chang
On Tue, Jun 22, 2021 at 5:34 PM Kumar Sankaran < ksankaran@...> wrote: I think the primary requirements here are the following:
- The platform should provide the capability to configure each RAS
error to trigger firmware-first or OS-first error interrupt.
Agreed.
Yes. Which is just a software matter of configuring the interrupt controller accordingly.
Does this mean the interrupt controller would integrate all RAS events (hart, PCI, I/O, memory, etc.)? Or would there be a separate hardware box that manages all RAS error events, with error signals output from that box connected to the interrupt controller, where the interrupt controller just provides the mechanism to morph those error signals into a firmware-first or OS-first interrupt?
- If the RAS error is handled by firmware, the firmware should be able
to choose to expose the error to S/HS mode for further processes or
just hide the error from S/HS software.
Is there a need to provide all the other details?
Agreed. The details and mechanics don't need to be discussed (unless they are mandating specific mechanics - which I don't believe is the case).
Agreed.
> Yes, to mask the RAS error interrupt, or even not create the log (in RAS status registers or a CSR) for an error the OEM doesn't consider useful or important for the product.
This is fine
Maybe just say that "Logging and/or reporting of errors can be masked".
Agreed.
Can we summarize the requirement to
- RAS errors should be capable of interrupting TEE.
This is ok for now because there is no hardware signal defined for triggering the TEE, right? I have more comments on this below.
This implies a requirement to have a TEE - and defining what constitutes a compliant TEE in the platform spec. Btw, what distinguishes the TEE from "firmware"?
Please correct me on the ARM part if I am wrong. The equivalent mechanism to a TEE is SMM on x86 and TZ on ARM. I don't quite understand how ARM TZ works; however, on x86 systems, all cores are brought into the SMM environment when an SMI is triggered. ARM has the equivalent event, which is SMC, right? The above is called management mode (MM), which is defined in the UEFI PI spec. MM has higher privilege than ring 0 on x86 and EL3 on ARM. MM is OS-agnostic, and the MM event halts any processes and gets the core into management mode to run the firmware code. The environment of MM (data and code) can only be accessed when the core is in MM. Firmware always uses this for the secure stuff, power management, and of course RAS.
I would like to add one more thing to the RAS requirement, but I don't know how to describe it properly because it seems we don't have an MM event on RISC-V, such as SMI or SMC, which can bring the system into MM. So there are two scenarios for RAS in the firmware-first model:
- If the platform doesn't have a TEE and a hardware event to trigger the TEE: if the RAS event is configured to firmware-first mode, the platform should be able to trigger an M-mode exception on all harts in the physical processor. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCI, memory, etc.).
- If the platform has a TEE and a hardware event to trigger the TEE: if the RAS event is configured to firmware-first mode, the platform should be able to trigger the TEE event on all harts in the physical processor and bring all harts into the TEE. This prevents subsequent RAS errors from being propagated by other cores that access the problematic hardware (PCI, memory, etc.).
The PCIe AER errors have been handled OS first on X86 systems. If I
recall correctly, ARM64 initially made PCIe AER errors firmware first
and then later changed to OS first to be compliant with what's already
out there.
The exact manner of handling these PCIe AER errors is also OEM
dependent. Some OEMs will handle it OS first while making a call to
the firmware to take additional corrective action of notifying the BMC
and such. Some ARM64 implementations handle this firmware first and
notify the BMC and then notify the OS.
From a RISC-V platforms requirements perspective, my suggestion is we
simply mention the capability of all errors to have support for
firmware first and OS first and leave it at that.
Agreed all around.
Agreed.
Abner
Yes. Which is just a software matter of configuring the interrupt controller accordingly.
Does this mean the interrupt controller would integrate all RAS events (hart, PCI, I/O, memory, etc.)? Or would there be a separate hardware box that manages all RAS error events, with error signals output from that box connected to the interrupt controller, where the interrupt controller just provides the mechanism to morph those error signals into a firmware-first or OS-first interrupt?
To the extent that "RAS interrupts" are literally that, i.e. interrupt request signals, then they go to the system interrupt controller just like all other interrupt request signals. (Some system designs might also have a "platform microcontroller" that has its own local interrupt controller and may receive some of these interrupt request signals.)
Maybe part of what you're trying to get at is that RAS error events in many architectures get logged in and reported from hardware RAS registers. RAS registers "report" errors by outputting RAS interrupt request signals. Software then comes back around and reads the RAS registers to gather info about logged errors. Can we summarize the requirement to
- RAS errors should be capable of interrupting TEE.
This is ok for now because there is no hardware signal defined for triggering the TEE, right? I have more comments on this below.
I expect RV will have similarities to ARM in this matter - and ARM doesn't have a hardware signal defined for triggering TEE either (and hasn't felt the need to define such).
This implies a requirement to have a TEE - and defining what constitutes a compliant TEE in the platform spec. Btw, what distinguishes the TEE from "firmware"?
Please correct me on the ARM part if I am wrong. The equivalent mechanism to a TEE is SMM on x86 and TZ on ARM. I don't quite understand how ARM TZ works; however, on x86 systems, all cores are brought into the SMM environment when an SMI is triggered. ARM has the equivalent event, which is SMC, right?
Neither ARM nor RISC-V has a direct equivalent of SMM. So I'll pick on what ARM has - which is rather like RV. At a hardware level ARM has EL3 and Secure ELx, and RV has M-mode and secure partitions of S/U-mode (using PMP). At a software level one has a Secure monitor running in EL3/M-mode and tbd whether other parts run in SELx/partitions. TZ as a TEE is a combination of these hardware features and the secure software that runs on it. ARM TZ doesn't specify the actual software TEE, it just provides the hardware architectural features and framework for creating and running a TEE. There is no one standard ARM TEE (although ARM has developed their ATF as a reference secure boot flow; although maybe it has expanded in scope in recent years?).
In short, RV first needs to define, develop, and specify a software TEE. The hardware components are falling into place (e.g. PMP, ePMP, Zkr), and OpenSBI is working towards supporting secure partitions. So, until there is a concrete RISC-V TEE standard (or even a standard framework), we shouldn't be stating requirements tied with having a TEE. Also keep in mind that things like secure boot will be required in the Server extension - which is part of the overall topic of TEE. The above is called management mode (MM), which is defined in the UEFI PI spec. MM has higher privilege than ring 0 on x86 and EL3 on ARM. MM is OS-agnostic, and the MM event halts any processes and gets the core into management mode to run the firmware code. The environment of MM (data and code) can only be accessed when the core is in MM. Firmware always uses this for the secure stuff, power management, and of course RAS.
What you describe, for RV, is M-mode - a pretty direct analog of ARM EL3.
I would like to add one more thing to the RAS requirement, but I don't know how to describe it properly because it seems we don't have an MM event on RISC-V, such as SMI or SMC, which can bring the system into MM.
RV has ECALL, just like ARM has SMC. So there are two scenarios for RAS in the firmware-first model. - If the platform doesn't have a TEE and a hardware event to trigger the TEE: if the RAS event is configured to firmware-first mode, the platform should be able to trigger an M-mode exception on all harts in the physical processor. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCI, memory, etc.).
- If the platform has a TEE and a hardware event to trigger the TEE: if the RAS event is configured to firmware-first mode, the platform should be able to trigger the TEE event on all harts in the physical processor and bring all harts into the TEE. This prevents subsequent RAS errors from being propagated by other cores that access the problematic hardware (PCI, memory, etc.).
I think part of what complicates this discussion is the nebulous nature of what exactly is the "TEE" in any given architecture. At a hardware level x86/ARM/RV have SMM/EL3/M-mode and they have ways to "call" into that secure environment. The software TEE architecture is what is rather nebulous. There isn't a standard software TEE architecture for x86; RV doesn't have something (yet), and ARM has just ATF (which one may or may not fully equate to being a "TEE").
Greg
Kumar Sankaran
On Wed, Jun 23, 2021 at 9:00 AM Greg Favor <gfavor@...> wrote: On Wed, Jun 23, 2021 at 7:59 AM Abner Chang <renba.chang@...> wrote:
Given where we are currently with the lack of a proper definition for TEE, I suggest we simply remove the requirement for TEE for now and add it later when the TEE spec is finalized. Suggest we remove the line "RAS errors should be capable of interrupting TEE" and leave it at that. -- Regards Kumar
Abner Chang
Greg Favor <gfavor@...> wrote on Thursday, June 24, 2021 at 12:00 AM: Yes. Which is just a software matter of configuring the interrupt controller accordingly.
Does this mean the interrupt controller would integrate all RAS events (hart, PCI, I/O, memory, etc.)? Or would there be a separate hardware box that manages all RAS error events, with error signals output from that box connected to the interrupt controller, where the interrupt controller just provides the mechanism to morph those error signals into a firmware-first or OS-first interrupt?
To the extent that "RAS interrupts" are literally that, i.e. interrupt request signals, then they go to the system interrupt controller just like all other interrupt request signals. (Some system designs might also have a "platform microcontroller" that has its own local interrupt controller and may receive some of these interrupt request signals.)
Maybe part of what you're trying to get at is that RAS error events in many architectures get logged in and reported from hardware RAS registers. RAS registers "report" errors by outputting RAS interrupt request signals. Software then comes back around and reads the RAS registers to gather info about logged errors.
Yes, something like that.
Do we need to define what RAS error signals are output to the interrupt controller? (The signals could be classified by error severity, such as CE, UC_FATAL, UC_NONFATAL, or by RAS error category, such as RAS_MEM_ERROR, RAS_IO_ERROR, etc.) I think we can just leave it to the RAS TG because we just define what the server platform needs for RAS, right? Can we summarize the requirement to
- RAS errors should be capable of interrupting TEE.
This is ok for now because there is no hardware signal defined for triggering TEE right? I have more comments on this below.
I expect RV will have similarities to ARM in this matter - and ARM doesn't have a hardware signal defined for triggering TEE either (and hasn't felt the need to define such).
Ok, I thought there was a similar hardware signal.
Without a hardware signal to trigger the TEE, the alternative would be triggering an M-mode exception and jumping to the TEE in the M-mode exception handler? So the scenario of triggering the TEE would be:
For the software management mode interface: S-mode -> sbi ecall to M-mode -> TEE jump vector -> TEE.
For the hardware management mode interface: hardware interrupt -> M-mode handler -> TEE jump vector -> TEE.
What firmware or software resides in the TEE is implementation-specific. For example, on edk2, we will load the management mode core into the TEE. I am just trying to get more understanding of the future design of the TEE on RV.
This implies a requirement to have a TEE - and defining what constitutes a compliant TEE in the platform spec. Btw, what distinguishes the TEE from "firmware"?
Please correct me on the ARM part if I am wrong. The equivalent mechanism to a TEE is SMM on x86 and TZ on ARM. I don't quite understand how ARM TZ works; however, on x86 systems, all cores are brought into the SMM environment when an SMI is triggered. ARM has the equivalent event, which is SMC, right?
Neither ARM nor RISC-V has a direct equivalent of SMM. So I'll pick on what ARM has - which is rather like RV. At a hardware level ARM has EL3 and Secure ELx, and RV has M-mode and secure partitions of S/U-mode (using PMP). At a software level one has a Secure monitor running in EL3/M-mode and tbd whether other parts run in SELx/partitions. TZ as a TEE is a combination of these hardware features and the secure software that runs on it. ARM TZ doesn't specify the actual software TEE, it just provides the hardware architectural features and framework for creating and running a TEE. There is no one standard ARM TEE (although ARM has developed their ATF as a reference secure boot flow; although maybe it has expanded in scope in recent years?).
In short, RV first needs to define, develop, and specify a software TEE. The hardware components are falling into place (e.g. PMP, ePMP, Zkr), and OpenSBI is working towards supporting secure partitions. So, until there is a concrete RISC-V TEE standard (or even a standard framework), we shouldn't be stating requirements tied with having a TEE. Also keep in mind that things like secure boot will be required in the Server extension - which is part of the overall topic of TEE.
Thanks for the above explanation. The above is called management mode (MM), which is defined in the UEFI PI spec. MM has higher privilege than ring 0 on x86 and EL3 on ARM. MM is OS-agnostic, and the MM event halts any processes and gets the core into management mode to run the firmware code. The environment of MM (data and code) can only be accessed when the core is in MM. Firmware always uses this for the secure stuff, power management, and of course RAS.
What you describe, for RV, is M-mode - a pretty direct analog of ARM EL3.
I would like to add one more thing to the RAS requirement, but I don't know how to describe it properly because it seems we don't have an MM event on RISC-V, such as SMI or SMC, which can bring the system into MM.
RV has ECALL, just like ARM has SMC.
Thanks for the correction. I thought SMC was a hardware signal. So there are two scenarios for RAS in the firmware-first model. - If the platform doesn't have a TEE and a hardware event to trigger the TEE: if the RAS event is configured to firmware-first mode, the platform should be able to trigger an M-mode exception on all harts in the physical processor. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCI, memory, etc.).
- If the platform has a TEE and a hardware event to trigger the TEE: if the RAS event is configured to firmware-first mode, the platform should be able to trigger the TEE event on all harts in the physical processor and bring all harts into the TEE. This prevents subsequent RAS errors from being propagated by other cores that access the problematic hardware (PCI, memory, etc.).
I think part of what complicates this discussion is the nebulous nature of what exactly is the "TEE" in any given architecture. At a hardware level x86/ARM/RV have SMM/EL3/M-mode and they have ways to "call" into that secure environment. The software TEE architecture is what is rather nebulous. There isn't a standard software TEE architecture for x86; RV doesn't have something (yet), and ARM has just ATF (which one may or may not fully equate to being a "TEE").
Agreed.
|
|

Abner Chang
On Wed, Jun 23, 2021 at 9:00 AM Greg Favor <gfavor@...> wrote:
>
> On Wed, Jun 23, 2021 at 7:59 AM Abner Chang <renba.chang@...> wrote:
>>>
>>> Yes. Which is just a software matter of configuring the interrupt controller accordingly.
>>
>> Does this mean the interrupt controller would integrate all RAS events (HART, PCI, I/O, memory and etc.)?
>> Or there would be a separate hardware box that manages all RAS error events, and maybe some error signals output from that box and connected to the interrupt controller? The interrupt controller just provides the mechanism to morph those error signals to FFM or OSF interrupt?
>
>
> To the extent that "RAS interrupts" are literally that, i.e. interrupt request signals, then they go to the system interrupt controller just like all other interrupt request signals. (Some system designs might also have a "platform microcontroller" that has its own local interrupt controller and may receive some of these interrupt request signals.)
>
> Maybe part of what you're trying to get at is that RAS error events in many architectures get logged in and reported from hardware RAS registers. RAS registers "report" errors by outputting RAS interrupt request signals. Software then comes back around and reads the RAS registers to gather info about logged errors.
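The log-then-report flow Greg describes (hardware latches error details into RAS registers, the register raises an interrupt request, and software later reads and clears it) can be sketched in C. Everything here is hypothetical: the register layout, field names, and widths are illustrative only, not from any RISC-V RAS specification, and the "register" is an ordinary variable so the sketch runs anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RAS status register layout; field names and widths
 * are illustrative, not from any specification. */
#define RAS_VALID        (1u << 31)   /* a valid error is logged         */
#define RAS_UNCORRECTED  (1u << 30)   /* error was not corrected by HW   */
#define RAS_TYPE_MASK    0x00FF0000u  /* error type (ECC, parity, ...)   */
#define RAS_ADDR_MASK    0x0000FFFFu  /* low bits of the failing address */

/* In real hardware this would be a memory-mapped register. */
static volatile uint32_t ras_status;

/* Hardware side: log a detected error. In a real design, setting the
 * valid bit is what raises the RAS interrupt request toward the
 * system interrupt controller. */
void ras_log_error(uint32_t type, uint32_t addr, int corrected) {
    ras_status = RAS_VALID
               | (corrected ? 0 : RAS_UNCORRECTED)
               | ((type << 16) & RAS_TYPE_MASK)
               | (addr & RAS_ADDR_MASK);
}

/* Software side: the interrupt handler "comes back around" to read
 * the logged error, then clears the register to re-arm it.
 * Returns the raw status, or 0 if nothing was logged. */
uint32_t ras_handle_interrupt(uint32_t *type, uint32_t *addr) {
    uint32_t status = ras_status;
    if (!(status & RAS_VALID))
        return 0;                      /* spurious interrupt */
    *type = (status & RAS_TYPE_MASK) >> 16;
    *addr = status & RAS_ADDR_MASK;
    ras_status = 0;                    /* clear to re-arm */
    return status;
}
```

The key point the sketch captures is the decoupling: the interrupt only says "an error was logged somewhere"; all the detail (type, location, severity) lives in the register bank and is gathered by software after the fact.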
>
>>>>
>>>> Can we summarize the requirement to
>>>>
>>>> - RAS errors should be capable of interrupting TEE.
>>
>> This is OK for now because there is no hardware signal defined for triggering the TEE, right? I have more comments on this below.
>
>
> I expect RV will have similarities to ARM in this matter - and ARM doesn't have a hardware signal defined for triggering TEE either (and hasn't felt the need to define such).
>
>>>
>>>
>>> This implies a requirement to have a TEE - and defining what constitutes a compliant TEE in the platform spec. Btw, what distinguishes the TEE from "firmware"?
>>
>> Please correct me on the ARM part if I am wrong.
>> The equivalent mechanism to a TEE is SMM on x86 and TZ on ARM. I don't quite understand how ARM TZ works; however, on an x86 system, all cores are brought into the SMM environment when an SMI is triggered. ARM has the equivalent event, which is SMC, right?
>
>
> Neither ARM nor RISC-V has a direct equivalent of SMM. So I'll pick on what ARM has, which is rather like RV. At a hardware level, ARM has EL3 and Secure ELx, and RV has M-mode and secure partitions of S/U-mode (using PMP). At a software level one has a Secure Monitor running in EL3/M-mode, and it is TBD whether other parts run in SELx/partitions. TZ as a TEE is a combination of these hardware features and the secure software that runs on them. ARM TZ doesn't specify the actual software TEE; it just provides the hardware architectural features and framework for creating and running a TEE. There is no one standard ARM TEE (although ARM has developed ATF as a reference secure boot flow, and maybe it has expanded in scope in recent years).
>
> In short, RV first needs to define, develop, and specify a software TEE. The hardware components are falling into place (e.g. PMP, ePMP, Zkr), and OpenSBI is working towards supporting secure partitions. So, until there is a concrete RISC-V TEE standard (or even a standard framework), we shouldn't be stating requirements tied with having a TEE. Also keep in mind that things like secure boot will be required in the Server extension - which is part of the overall topic of TEE.
>
>>
>> The above is called Management Mode (MM), which is defined in the UEFI PI spec. MM has higher privilege than CR0 on x86 and EL3 on ARM. MM is OS-agnostic, and the MM event halts any running process and brings the core into management mode to run the firmware code. The MM environment (data and code) can only be accessed when the core is in MM. Firmware always uses this for secure operations, power management, and of course RAS.
>
>
> What you describe, for RV, is M-mode - a pretty direct analog of ARM EL3.
>
>>
>>
>> I would like to add one more thing to the RAS requirement, but I don't know how to describe it properly because it seems we don't have an MM event on RISC-V, such as SMI or SMC, which can bring the system into MM.
>
>
> RV has ECALL, just like ARM has SMC.
>
>>
>> So there are two scenarios for RAS in the firmware-first model.
>> - If the platform has neither a TEE nor a hardware event to trigger one:
>> If a RAS event is configured for firmware-first mode, the platform should be able to trigger an M-mode exception on all harts in the physical processor. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCI, memory, etc.).
>>
>> - If the platform has a TEE and a hardware event to trigger it:
>> If a RAS event is configured for firmware-first mode, the platform should be able to trigger the TEE event on all harts in the physical processor and bring all harts into the TEE. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCI, memory, etc.).
>
>
> I think part of what complicates this discussion is the nebulous nature of what exactly is the "TEE" in any given architecture. At a hardware level x86/ARM/RV have SMM/EL3/M-mode and they have ways to "call" into that secure environment. The software TEE architecture is what is rather nebulous. There isn't a standard software TEE architecture for x86; RV doesn't have something (yet), and ARM has just ATF (which one may or may not fully equate to being a "TEE").
>
> Greg
>
Given where we are currently with the lack of a proper definition for
TEE, I suggest we simply remove the requirement for TEE for now and
add it later when the TEE spec is finalized.
I suggest we remove the line "RAS errors should be capable of interrupting TEE" and leave it at that.
I agree with this, Kumar.
Please review the below sentence. If a RAS event is configured for the firmware-first model, the platform should be able to trigger the highest-priority M-mode interrupt to all harts in the physical RV processor. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCIe, memory, I/O, etc.).
One question: besides those RAS events that come through the interrupt controller, what about hart or memory RAS events? Are those RAS events in the scope of exceptions, or would they also be routed to the interrupt controller? Or do we not have to worry about this because the RAS TG will have a solution?
Abner
--
Regards
Kumar
Do we need to define what the RAS error signals output to the interrupt controller are? (The signals could be classified by error severity, such as CE, UC_FATAL, and UC_NONFATAL, or by RAS error category, such as RAS_MEM_ERROR, RAS_IO_ERROR, etc.)
This just starts down the path of defining a small bit of a RAS architecture, which we shouldn't do before a full RAS architecture is developed (next year).
I think we can just leave it to the RAS TG, because we only define what the server platform needs from RAS, right?
Exactly. Without a hardware signal to trigger the TEE, the alternative would be triggering an M-mode exception and jumping to the TEE in the M-mode exception handler?
So the scenario of triggering the TEE would be, for the software management mode interface: S-mode -> SBI ecall to M-mode -> TEE jump vector -> TEE.
Effectively the same as with ARM.
For the hardware management mode interface: hardware interrupt -> M-mode handler -> TEE jump vector -> TEE.
What firmware or software resides in the TEE is implementation-specific. For example, on edk2 we will load the management mode core into the TEE. I am just trying to get a better understanding of the future design of the TEE on RV.
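The two entry chains described above (software: S-mode ecall into M-mode; hardware: an interrupt delivered to M-mode; both then entering the TEE through a jump vector) can be modeled in miniature. All names here are hypothetical; as this thread notes, RISC-V has no standard TEE entry mechanism yet, so this is only a sketch of the control flow, not of any real API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: both the software path (ecall) and the hardware
 * path (RAS interrupt) funnel through the M-mode trap handler, which
 * enters the TEE via a jump vector registered at boot. */

typedef void (*tee_entry_t)(uint32_t reason);

#define TEE_REASON_ECALL 1  /* entered via SBI ecall from S-mode   */
#define TEE_REASON_RAS   2  /* entered via a RAS interrupt request */

static tee_entry_t tee_jump_vector; /* set once at boot by firmware */
static uint32_t last_tee_reason;    /* observable side effect for the demo */

static void example_tee(uint32_t reason) { last_tee_reason = reason; }

/* Firmware boot code registers the TEE entry point. */
void firmware_register_tee(tee_entry_t entry) { tee_jump_vector = entry; }

/* M-mode trap handler: dispatches to the TEE if one is registered. */
void m_mode_trap(int is_interrupt) {
    if (tee_jump_vector)
        tee_jump_vector(is_interrupt ? TEE_REASON_RAS : TEE_REASON_ECALL);
}

/* Software management mode interface: S-mode executes an ecall,
 * which traps synchronously into M-mode. */
void s_mode_sbi_ecall(void) { m_mode_trap(0); }

/* Hardware management mode interface: a RAS interrupt request
 * arrives at M-mode asynchronously. */
void ras_interrupt(void) { m_mode_trap(1); }
```

The design point the sketch makes is the one Greg raises: the hardware only defines how control reaches M-mode; what the "TEE" behind the jump vector actually is remains a software decision.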
I think the tech-tee TG has done some work around TEE, but I'm not sure what (and certainly there isn't anything heading to ratification this year).
Greg
Please review the below sentence.

> If a RAS event is configured for the firmware-first model, the platform should be able to trigger the highest-priority M-mode interrupt to all harts in the physical RV processor. This prevents subsequent RAS errors from being propagated by other harts that access the problematic hardware (PCIe, memory, I/O, etc.).

Note that the priority of any RAS interrupts would be software-configurable in the interrupt controller. Also note that there are other common techniques for preventing the propagation of errors and for isolating the impact of errors (e.g. precise hart exceptions on attempted use of corrupted data, data poisoning, I/O flow termination, ...).

> One question: besides those RAS events that come through the interrupt controller,

In a typical enterprise-class RAS architecture, "error events" are logged in RAS registers, which then optionally generate RAS interrupt requests. These then go to the system interrupt controller, which prioritizes and routes requests to the appropriate harts.

> what about hart or memory RAS events?

One would typically have RAS registers (for logging and reporting errors) spread around the system, ideally at all points in the system where errors can be detected and at all points where corrupted data can be consumed.

> Are those RAS events in the scope of exceptions, or would they also be routed to the interrupt controller?

RAS errors generally result in RAS interrupts, but when a hart tries to consume corrupted data, the ideal RAS behavior is for the hart to take a precise exception on the load instruction that is trying to consume the corrupted data.

> Or do we not have to worry about this because the RAS TG will have a solution?

All this would be covered by a proper RAS architecture (to hopefully be developed by a TG next year).
Greg
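Greg's combination of data poisoning with a precise exception on consumption can also be modeled in miniature: detection of an uncorrectable error marks (poisons) the data without trapping anything, and only the load that actually consumes the value faults. The names and the fault convention below are hypothetical, chosen just to make the idea concrete.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of data poisoning. A real hart would take a
 * precise exception on the consuming load instruction; here the
 * load helper returns -1 to stand in for that fault. */

#define MEM_WORDS 16
static uint64_t mem[MEM_WORDS];
static uint8_t  poisoned[MEM_WORDS]; /* 1 = uncorrectable error latched */

/* Hardware detects an uncorrectable error in a location:
 * poison it, but do not trap yet (nothing has consumed it). */
void poison(unsigned idx) { poisoned[idx] = 1; }

/* Model of a load: 0 on success; -1 stands in for the precise
 * exception, taken on exactly the instruction that consumes the
 * corrupted data, so software knows both the cause and the culprit. */
int load(unsigned idx, uint64_t *out) {
    if (poisoned[idx])
        return -1;
    *out = mem[idx];
    return 0;
}
```

The benefit over trapping at detection time is containment without overreaction: corrupted data that is never consumed costs nothing, while any consumption is caught at a well-defined, attributable point.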

Kumar Sankaran
Patch merged with all the changes requested.

Regards,
Kumar
From: Greg Favor <gfavor@...> Sent: Wednesday, June 23, 2021 11:02 PM To: Abner Chang <renba.chang@...> Cc: Kumar Sankaran <ksankaran@...>; tech-unixplatformspec@... Subject: Re: [RISC-V] [tech-unixplatformspec] [PATCH 1/1] RAS features for OS-A platform server extension