
Re: [PATCH 1/1] RAS features for OS-A platform server extension

Kumar Sankaran
 

To add to what Greg mentioned below, the RAS features mentioned in the patch are required only for the OS-A platform server extension. We are not mandating any RAS requirements for OS-A base platform compatibility.

 

Regards

Kumar

From: Greg Favor <gfavor@...>
Sent: Thursday, June 17, 2021 11:54 AM
To: Allen Baum <allen.baum@...>
Cc: Abner Chang <renba.chang@...>; Kumar Sankaran <ksankaran@...>; tech-unixplatformspec@...
Subject: Re: [RISC-V] [tech-unixplatformspec] [PATCH 1/1] RAS features for OS-A platform server extension

 

On Thu, Jun 17, 2021 at 11:13 AM Allen Baum <allen.baum@...> wrote:

Is it acceptable to everyone that all single bit errors on all caches must be correctable?

 

Nowadays single-bit errors are far from rare.  There will always be people that run Linux and are willing to accept occasional silent corruptions and whatever mysterious application/data corruptions occur as a result.  But for a standardized server-class platform spec, this is a rather low "table stakes" bar to set.  Virtually no customer of a "server-class" platform will be comfortable without that (especially since the x86 and ARM alternatives provide at least that).

 

That really affects designs in fundamental ways for L1 caches (as opposed to simply detecting).

 

Parity (with invalidate on error detection) suffices for I and WT D caches, and ECC is used on WB D caches - even WB L1 D caches (which is one argument for doing a WT L1 D cache with parity, though the majority of people still do WB L1 D caches with ECC).
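For scale, the ECC cost itself is modest: a Hamming-style SECDED code needs the smallest r satisfying 2^r >= k + r + 1 check bits for k data bits, plus one extra bit for double-error detection - so 8 check bits protect a 64-bit word (about 12.5% overhead) and 7 check bits protect a 32-bit word.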

 

Understandably some people don't want to deal with ECC on a WB DL1, and parity or nothing may be fine for less-than server-class systems.

 

Not as big a concern for L2 and above.

Speaking from my Intel experience, the rule was expressed as failures per year - and if an L1 cache was small enough that it didn't exceed that number, then it didn't need correction.

 

Somewhat analogously, TSMC imposes similarly expressed requirements wrt having redundancy in all the RAMs.  Even just one non-redundant 64 KiB cache can pretty much use up what is allowed to not have redundancy.
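As a rough illustration (the per-bit rate here is an assumed ballpark, not a TSMC or Intel number): at ~1000 FIT per Mbit of SRAM, one unprotected 64 KiB (0.5 Mbit) cache contributes about 500 FIT, i.e. 500 x 10^-9 failures per hour, or roughly 0.004 failures per year - an MTBF contribution of a couple hundred years for that one array.  Multiply by the number of harts and RAM structures in a many-core server and the aggregate budget erodes quickly.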

 

In any case, the Base platform spec should allow people to make whatever choice they want (and live with the consequences).  But to be competitive and to meet customer expectations (especially in a multi-core world), the Server spec needs to require a higher-than-nothing bar.

 

So, it might be useful to have a measurement baseline like that, rather than an absolute requirement.

 

A functional requirement is simple to specify and aligns with standard industry practices.  The alternatives get more involved and in practice won't provide much of any value over the functional requirement (for server-class systems).

 

The argument is why are you requiring ecc correction on this - and not the register file, or CSRs?

 

This is a baseline requirement - aligned with common/dominant industry practice.  Conversely it is not a dominant industry practice to protect flop-based register files (or flop-based storage structures in general).  (Latch-based register files, depending on whether the bitcell is more SRAM-like or flop-like, fall in one category or the other.)

 

The reason is they're small enough that failures are unlikely - and that's how your rationale should be stated.

 

Nowadays even the aggregate error rate or MTBF due to flop soft errors is not small.  But thankfully for most designs that MTBF component is acceptable within typical MTBF budgets.

 

As far as instead specifying an MTBF requirement, one then gets into system-wide issues and overall MTBF budgets, where it gets spent, what about the technology dependence of all this, and ....  Plus that effectively would provide little guidance to CPU designers as to what is their individual MTBF budget.  Or, conversely, one can probably have long discussions/arguments about what is the right MTBF number to require at the level of a single CPU core.

 

But at the end of the day few if any customers of a server-class system are going to accept a product that doesn't even have single-bit error protection on the cache hierarchy.

 

Greg

 


Re: [PATCH 1/1] RAS features for OS-A platform server extension

Greg Favor
 

On Thu, Jun 17, 2021 at 11:13 AM Allen Baum <allen.baum@...> wrote:
Is it acceptable to everyone that all single bit errors on all caches must be correctable?

Nowadays single-bit errors are far from rare.  There will always be people that run Linux and are willing to accept occasional silent corruptions and whatever mysterious application/data corruptions occur as a result.  But for a standardized server-class platform spec, this is a rather low "table stakes" bar to set.  Virtually no customer of a "server-class" platform will be comfortable without that (especially since the x86 and ARM alternatives provide at least that).
 
That really affects designs in fundamental ways for L1 caches (as opposed to simply detecting).

Parity (with invalidate on error detection) suffices for I and WT D caches, and ECC is used on WB D caches - even WB L1 D caches (which is one argument for doing a WT L1 D cache with parity, though the majority of people still do WB L1 D caches with ECC).

Understandably some people don't want to deal with ECC on a WB DL1, and parity or nothing may be fine for less-than server-class systems.
 
Not as big a concern for L2 and above.
Speaking from my Intel experience, the rule was expressed as failures per year - and if an L1 cache was small enough that it didn't exceed that number, then it didn't need correction.

Somewhat analogously, TSMC imposes similarly expressed requirements wrt having redundancy in all the RAMs.  Even just one non-redundant 64 KiB cache can pretty much use up what is allowed to not have redundancy.

In any case, the Base platform spec should allow people to make whatever choice they want (and live with the consequences).  But to be competitive and to meet customer expectations (especially in a multi-core world), the Server spec needs to require a higher-than-nothing bar.
 
So, it might be useful to have a measurement baseline like that, rather than an absolute requirement.

A functional requirement is simple to specify and aligns with standard industry practices.  The alternatives get more involved and in practice won't provide much of any value over the functional requirement (for server-class systems).

The argument is why are you requiring ecc correction on this - and not the register file, or CSRs?

This is a baseline requirement - aligned with common/dominant industry practice.  Conversely it is not a dominant industry practice to protect flop-based register files (or flop-based storage structures in general).  (Latch-based register files, depending on whether the bitcell is more SRAM-like or flop-like, fall in one category or the other.)

The reason is they're small enough that failures are unlikely - and that's how your rationale should be stated.

Nowadays even the aggregate error rate or MTBF due to flop soft errors is not small.  But thankfully for most designs that MTBF component is acceptable within typical MTBF budgets.

As far as instead specifying an MTBF requirement, one then gets into system-wide issues and overall MTBF budgets, where it gets spent, what about the technology dependence of all this, and ....  Plus that effectively would provide little guidance to CPU designers as to what is their individual MTBF budget.  Or, conversely, one can probably have long discussions/arguments about what is the right MTBF number to require at the level of a single CPU core.

But at the end of the day few if any customers of a server-class system are going to accept a product that doesn't even have single-bit error protection on the cache hierarchy.

Greg


Re: [PATCH 1/1] RAS features for OS-A platform server extension

Allen Baum
 

Is it acceptable to everyone that all single bit errors on all caches must be correctable?
That really affects designs in fundamental ways for L1 caches (as opposed to simply detecting).
Not as big a concern for L2 and above.
Speaking from my Intel experience, the rule was expressed as failures per year - and if an L1 cache was small enough that it didn't exceed that number, then it didn't need correction.
So, it might be useful to have a measurement baseline like that, rather than an absolute requirement.

The argument is why are you requiring ecc correction on this - and not the register file, or CSRs?
The reason is they're small enough that failures are unlikely - and that's how your rationale should be stated.
There will be platforms that are much more demanding (safety critical) where duplication or majority voting is required.
I didn't think that we were talking about those application areas.



On Thu, Jun 17, 2021 at 8:56 AM Abner Chang <renba.chang@...> wrote:


On Wed, Jun 16, 2021 at 8:17 AM, Kumar Sankaran <ksankaran@...> wrote:
Signed-off-by: Kumar Sankaran <ksankaran@...>
---
 riscv-platform-spec.adoc | 42 ++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/riscv-platform-spec.adoc b/riscv-platform-spec.adoc
index 4c356b8..d779452 100644
--- a/riscv-platform-spec.adoc
+++ b/riscv-platform-spec.adoc
@@ -19,18 +19,6 @@
 // table of contents
 toc::[]

-// document copyright and licensing information
-include::licensing.adoc[]
-
-// changelog for the document
-include::changelog.adoc[]
-
-// Introduction: describe the intent and purpose of the document
-include::introduction.adoc[]
-
-// Profiles: (NB: content from very first version)
-include::profiles.adoc[]
-
 == Introduction
 The platform specification defines a set of platforms that specify requirements
 for interoperability between software and hardware. The platform policy
@@ -68,11 +56,13 @@ The M platform has the following extensions:
 |SBI       | Supervisor Binary Interface
 |UEFI      | Unified Extensible Firmware Interface
 |ACPI      | Advanced Configuration and Power Interface
+|APEI      | ACPI Platform Error Interfaces
 |SMBIOS    | System Management Basic I/O System
 |DTS       | Devicetree source file
 |DTB       | Devicetree binary
 |RVA22     | RISC-V Application 2022
 |EE        | Execution Environment
+|OSPM      | Operating System Power Management
 |RV32GC    | RISC-V 32-bit general purpose ISA described as RV32IMAFDC.
 |RV64GC    | RISC-V 64-bit general purpose ISA described as RV64IMAFDC.
 |===
@@ -87,6 +77,7 @@ The M platform has the following extensions:
 |link:[RVA22 Specification]
                                        | TBD
 |link:https://arm-software.github.io/ebbr/[EBBR Specification]
                                        | v2.0.0-pre1
 |link:https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf[ACPI
Specification]              | v6.4
+|link:https://uefi.org/specs/ACPI/6.4/18_ACPI_Platform_Error_Interfaces/ACPI_PLatform_Error_Interfaces.html[APEI
Specification]              | v6.4
 |link:https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf[SMBIOS
Specification]    | v3.4.0
 |link:[Platform Policy]
                                        | TBD
 |===
@@ -504,6 +495,30 @@ delegate the virtual supervisor timer interrupt
to 'VS' mode.
 * IOMMU

 ==== RAS
+All the below mentioned RAS features are required for the OS-A platform server
+extension
+
+*  Main memory must be protected with SECDED-ECC +
+*  All cache structures must be protected +
+** single-bit errors must be detected and corrected +
+** multi-bit errors can be detected and reported +
+* There must be memory-mapped RAS registers associated with these protected
+structures to log detected errors with information about the type and location
+of the error +
+* The platform must support the APEI specification to convey all error
+information to OSPM +
+* Correctable errors must be reported by hardware and either be corrected or
+recovered by hardware, transparent to system operation and to software +
+* Hardware must provide status of these correctable errors via RAS registers +
+* Uncorrectable errors must be reported by the hardware via RAS error
+registers for system software to take the needed corrective action +
+* Attempted use of corrupted (uncorrectable) data must result in a precise
+exception on that instruction with a distinguishing custom exception cause
+code +
+* Errors logged in RAS registers must be able to generate an interrupt request
+to the system interrupt controller that may be directed to either M-mode or
+S/HS-mode for firmware-first versus OS-first error reporting +
+* PCIe AER capability is required +

Hi Kumar,
I would like to add something.
In order to support the OEM RAS policy,
- The platform should provide the capability to configure each RAS error to trigger firmware-first or OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or to hide the error from S/HS software. This requires some mechanism provided by the platform, and that mechanism should be protected by M-mode.
- Each RAS error should be able to be masked through RAS configuration registers.
- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.

For PCIe RAS,
- The baseline PCIe error or AER interrupt can be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to the OEM RAS policy.
 
Besides memory and PCIe RAS, do we have RAS errors for the processor/HART, such as an IPI error or some CE/UC/UCR local to the HART?

Regards,
Abner

 // M Platform
 == M Platform
@@ -593,6 +608,3 @@ also implement PMP support.
 When PMP is supported it is recommended to include at least 4 regions, although
 if possible more should be supported to allow more flexibility. Hardware
 implementations should aim for supporting at least 16 PMP regions.
-
-// acknowledge all of the contributors
-include::contributors.adoc[]
--
2.21.0






Re: [PATCH 1/1] RAS features for OS-A platform server extension

Greg Favor
 

On Thu, Jun 17, 2021 at 8:56 AM Abner Chang <renba.chang@...> wrote:
- The platform should provide the capability to configure each RAS error to trigger firmware-first or OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or to hide the error from S/HS software. This requires some mechanism provided by the platform, and that mechanism should be protected by M-mode.

I would have thought that this is just a software issue.  What kind of hardware mechanism do you picture being needed?
 
- Each RAS error should be able to be masked through RAS configuration registers.

By "mask" do you mean masking of generation of an error interrupt?
 
- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.

Wouldn't the TEE be running in M-mode?  Or where is it expected to be running?
 
For PCIe RAS,
- The baseline PCIe error or AER interrupt can be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to the OEM RAS policy.

In x86 and ARM platforms, doesn't the OS pretty much always handle PCIe AER errors (i.e. OS-first for this class of errors)?  (I was reading an Intel overview doc recently that essentially said that - irrespective of whether other classes of errors are OS-first or firmware-first.)

Besides memory and PCIe RAS, do we have RAS errors for the processor/HART, such as an IPI error or some CE/UC/UCR local to the HART?

Definitely there will be processor/hart errors.  Presumably each hart would output one or more RAS interrupt request signals.
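To make the "RAS registers" part concrete, here is a minimal C sketch of what a per-hart memory-mapped error record could look like.  The layout, field names, and base address below are purely hypothetical illustrations of the requirements in the patch (log type and location, expose status, allow clearing); none of this is defined by any RISC-V spec.

#include <stdint.h>

/* Hypothetical per-hart RAS error record; layout is illustrative only. */
struct ras_error_record {
    volatile uint64_t status;    /* valid, corrected vs. uncorrected, error type */
    volatile uint64_t addr;      /* physical address of the failing location     */
    volatile uint64_t syndrome;  /* ECC syndrome or parity information           */
    volatile uint64_t control;   /* interrupt enable, write-to-clear behaviour   */
};

#define RAS_STATUS_VALID        (1ull << 0)
#define RAS_STATUS_UNCORRECTED  (1ull << 1)

/* Assumed (made-up) base address of hart 0's error record bank. */
#define RAS_BANK_BASE  0x10010000ul

static void ras_poll_hart0(void)
{
    struct ras_error_record *rec = (struct ras_error_record *)RAS_BANK_BASE;

    if (rec->status & RAS_STATUS_VALID) {
        uint64_t addr = rec->addr;       /* location of the error          */
        uint64_t synd = rec->syndrome;   /* type/severity details          */
        (void)addr; (void)synd;          /* hand off to APEI/GHES logging  */
        rec->status = 0;                 /* clear so a new error can latch */
    }
}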

Greg


Re: [PATCH 1/1] RAS features for OS-A platform server extension

Abner Chang
 



On Wed, Jun 16, 2021 at 8:17 AM, Kumar Sankaran <ksankaran@...> wrote:
Signed-off-by: Kumar Sankaran <ksankaran@...>
---
 riscv-platform-spec.adoc | 42 ++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/riscv-platform-spec.adoc b/riscv-platform-spec.adoc
index 4c356b8..d779452 100644
--- a/riscv-platform-spec.adoc
+++ b/riscv-platform-spec.adoc
@@ -19,18 +19,6 @@
 // table of contents
 toc::[]

-// document copyright and licensing information
-include::licensing.adoc[]
-
-// changelog for the document
-include::changelog.adoc[]
-
-// Introduction: describe the intent and purpose of the document
-include::introduction.adoc[]
-
-// Profiles: (NB: content from very first version)
-include::profiles.adoc[]
-
 == Introduction
 The platform specification defines a set of platforms that specify requirements
 for interoperability between software and hardware. The platform policy
@@ -68,11 +56,13 @@ The M platform has the following extensions:
 |SBI       | Supervisor Binary Interface
 |UEFI      | Unified Extensible Firmware Interface
 |ACPI      | Advanced Configuration and Power Interface
+|APEI      | ACPI Platform Error Interfaces
 |SMBIOS    | System Management Basic I/O System
 |DTS       | Devicetree source file
 |DTB       | Devicetree binary
 |RVA22     | RISC-V Application 2022
 |EE        | Execution Environment
+|OSPM      | Operating System Power Management
 |RV32GC    | RISC-V 32-bit general purpose ISA described as RV32IMAFDC.
 |RV64GC    | RISC-V 64-bit general purpose ISA described as RV64IMAFDC.
 |===
@@ -87,6 +77,7 @@ The M platform has the following extensions:
 |link:[RVA22 Specification]
                                        | TBD
 |link:https://arm-software.github.io/ebbr/[EBBR Specification]
                                        | v2.0.0-pre1
 |link:https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf[ACPI
Specification]              | v6.4
+|link:https://uefi.org/specs/ACPI/6.4/18_ACPI_Platform_Error_Interfaces/ACPI_PLatform_Error_Interfaces.html[APEI
Specification]              | v6.4
 |link:https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf[SMBIOS
Specification]    | v3.4.0
 |link:[Platform Policy]
                                        | TBD
 |===
@@ -504,6 +495,30 @@ delegate the virtual supervisor timer interrupt
to 'VS' mode.
 * IOMMU

 ==== RAS
+All the below mentioned RAS features are required for the OS-A platform server
+extension
+
+*  Main memory must be protected with SECDED-ECC +
+*  All cache structures must be protected +
+** single-bit errors must be detected and corrected +
+** multi-bit errors can be detected and reported +
+* There must be memory-mapped RAS registers associated with these protected
+structures to log detected errors with information about the type and location
+of the error +
+* The platform must support the APEI specification to convey all error
+information to OSPM +
+* Correctable errors must be reported by hardware and either be corrected or
+recovered by hardware, transparent to system operation and to software +
+* Hardware must provide status of these correctable errors via RAS registers +
+* Uncorrectable errors must be reported by the hardware via RAS error
+registers for system software to take the needed corrective action +
+* Attempted use of corrupted (uncorrectable) data must result in a precise
+exception on that instruction with a distinguishing custom exception cause
+code +
+* Errors logged in RAS registers must be able to generate an interrupt request
+to the system interrupt controller that may be directed to either M-mode or
+S/HS-mode for firmware-first versus OS-first error reporting +
+* PCIe AER capability is required +

Hi Kumar,
I would like to add something.
In order to support the OEM RAS policy,
- The platform should provide the capability to configure each RAS error to trigger firmware-first or OS-first error interrupt.
- If the RAS error is handled by firmware, the firmware should be able to choose to expose the error to S/HS mode for further processing or to hide the error from S/HS software. This requires some mechanism provided by the platform, and that mechanism should be protected by M-mode.
- Each RAS error should be able to be masked through RAS configuration registers.
- We should also consider triggering the RAS error interrupt to the TEE, which is where the firmware management mode resides.

For PCIe RAS,
- The baseline PCIe error or AER interrupt can be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to the OEM RAS policy.
 
Besides memory and PCIe RAS, do we have RAS errors for the processor/HART, such as an IPI error or some CE/UC/UCR local to the HART?
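A register-level sketch of the per-error-source routing and masking described above might look like the following; the register layout, bit names, and base address are hypothetical illustrations only, not from any spec.

#include <stdint.h>

/* Hypothetical per-error-source RAS control register (illustrative only). */
#define RAS_CTL_MASKED        (1u << 0)  /* 1: suppress the error interrupt      */
#define RAS_CTL_FIRMWARE_1ST  (1u << 1)  /* 1: route to M-mode, 0: route to S/HS */

/* Assumed (made-up) base address of an array of control registers. */
#define RAS_CTL_BASE   0x10020000ul
#define RAS_CTL(n)     ((volatile uint32_t *)RAS_CTL_BASE + (n))

/* M-mode firmware applies the OEM policy for one error source. */
static void ras_set_policy(unsigned src, int masked, int firmware_first)
{
    uint32_t v = 0;

    if (masked)
        v |= RAS_CTL_MASKED;
    if (firmware_first)
        v |= RAS_CTL_FIRMWARE_1ST;
    *RAS_CTL(src) = v;
}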

Regards,
Abner

 // M Platform
 == M Platform
@@ -593,6 +608,3 @@ also implement PMP support.
 When PMP is supported it is recommended to include at least 4 regions, although
 if possible more should be supported to allow more flexibility. Hardware
 implementations should aim for supporting at least 16 PMP regions.
-
-// acknowledge all of the contributors
-include::contributors.adoc[]
--
2.21.0






Re: [PATCH] Add direct memory access synchronize extension

Allen Baum
 

Arch-test should be involved also.
It is (more than) a bit complicated because CMOs are instructions that affect non-architectural bits of an implementation
- so it's unclear what it even means to have an architectural test, much less how to write one.
The framework and tests currently only handle deterministic architectural state.
The definition of done has an architectural test component and a proof-of-concept component;
the CMOs can only do the proof-of-concept part because of the above.

On Tue, Jun 15, 2021 at 4:33 PM David Kruckemyer <dkruckemyer@...> wrote:
Hi all,

My apologies as I just got wind of this discussion (I was unable to attend the last few CMO TG meetings due to travel). I think we should sync up on the CMO TG and SBI/platform efforts since there seems to be a bit of disconnect.

Regarding the CMO TG goals, we have intended to get a basic subset of operations into the profile/platform specifications for this year. The "phase 1" status is listed here:


Though honestly, a bit of this is out of date already, so expect some clarification in the coming days (just need to do some terminology cleanup).

Please do not hesitate to reach out to me with any questions (or to post questions to the CMO TG mailing list: tech-cmo@... )

Cheers,
David


On Mon, Jun 7, 2021 at 2:35 AM Nick Kossifidis <mick@...> wrote:
On 2021-06-07 07:03, Anup Patel wrote:
>
> Let's have a simple SBI DMA sync extension in SBI v0.4 spec.
>
> The shared code pages between M-mode and S-mode will have it's own
> Challenges and we will have to define more stuff in SBI spec to support
> this (see above).
>

Totally agree with you, I just thought it'd be a good opportunity to
bring this up so that we can discuss it at some point, let's have
something that works and we can optimize it later on.

> It seems CMO extension might freeze sooner than we think (others can
> comment on this). If CMO extension is frozen by year end then we can
> trap-n-emulate CMO instructions instead of SBI DMA sync extension. If
> it does not freeze by year end then we will have to go ahead with
> SBI DMA sync extension as stop-gap solution.
>

The CMOs TG has a meeting today, I'll try and join and ask for updates
on this.






Re: [PATCH] Add direct memory access synchronize extension

David Kruckemyer
 

FWIW, our (the CMO TG's) priorities are in order as follows:

- Zicbom (maintenance)
- Zicboz (zero)
- Zicbop (prefetch)

We happen to have provisional opcodes for both Zicbom and Zicboz (mostly since they occupy the same real estate).

The primary goal now is to take our overly general spec and distill it down into the three extensions and limit it to the Phase 1 material. Volunteers to help out with that would be greatly appreciated.... :)

Cheers,
David


On Tue, Jun 15, 2021 at 7:32 PM Anup Patel <Anup.Patel@...> wrote:

Hi Paul,

 

Everyone over here is well aware of the importance of fast-tracking basic CMO instructions and getting it frozen soon. The CMO group is also aware of their priorities so we should let them tackle this instead of proposing how they should work.

 

As mentioned quite a few times in this email thread, the SBI DMA sync is only a stop-gap solution (or backup plan) to tackle the Linux RISC-V patch acceptance policy if we don't get basic CMO instructions soon. We would certainly like to avoid the SBI DMA sync extension if possible. In fact, we have not included the SBI DMA sync extension in the recently frozen SBI v0.3-rc1 spec, which will be released next month.

 

It is certainly possible to have basic CMO instructions frozen by 2021 year end. If this happens then we will discard SBI DMA sync proposal and emulate basic CMO instructions in OpenSBI for BeagleV and Allwinner D1 boards. In fact, Atish is still figuring out ways to avoid both SBI DMA sync and CMO instructions for at least BeagleV if that is possible.

 

Regards,

Anup

 

From: Paul Walmsley <paul.walmsley@...>
Sent: 16 June 2021 05:29
To: David Kruckemyer <dkruckemyer@...>; Nick Kossifidis <mick@...>
Cc: Anup Patel <Anup.Patel@...>; Atish Patra <Atish.Patra@...>; tech-unixplatformspec@...; Palmer Dabbelt <palmerdabbelt@...>; Palmer Dabbelt <palmer@...>; tech-cmo@...; John Ingalls <john.ingalls@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] [PATCH] Add direct memory access synchronize extension

 

It would be ideal if the CMO group could focus on fast-tracking the Cache Block Maintenance Operations for Phase 1 and get opcodes assigned, and this part of the specification frozen.  The maintenance operations are mandatory for non-CPU-cache-coherent peripheral DMA to work correctly; that's why these should be completed first.   As far as I can tell, prefetch and zeroing are strictly optimizations, so it would be best if these could be delayed to a Phase 2 -- which could be developed in parallel while Phase 1 goes through the opcode committee, etc. 

 

Then the SBI sync extension should be superfluous. It would be ideal if we could avoid having multiple mechanisms for the same operations.

 

For this to work, though, the CMO group needs to move on the block maintenance instructions quickly. 

 

 

- Paul

 

 

On 6/15/21 4:33 PM, David Kruckemyer wrote:

Hi all,

 

My apologies as I just got wind of this discussion (I was unable to attend the last few CMO TG meetings due to travel). I think we should sync up on the CMO TG and SBI/platform efforts since there seems to be a bit of disconnect.

 

Regarding the CMO TG goals, we have intended to get a basic subset of operations into the profile/platform specifications for this year. The "phase 1" status is listed here:

 

 

Though honestly, a bit of this is out of date already, so expect some clarification in the coming days (just need to do some terminology cleanup).

 

Please do not hesitate to reach out to me with any questions (or to post questions to the CMO TG mailing list: tech-cmo@... )

 

Cheers,

David

 

 

On Mon, Jun 7, 2021 at 2:35 AM Nick Kossifidis <mick@...> wrote:

On 2021-06-07 07:03, Anup Patel wrote:
>
> Let's have a simple SBI DMA sync extension in SBI v0.4 spec.
>
> The shared code pages between M-mode and S-mode will have it's own
> Challenges and we will have to define more stuff in SBI spec to support
> this (see above).
>

Totally agree with you, I just thought it'd be a good opportunity to
bring this up so that we can discuss it at some point, let's have
something that works and we can optimize it later on.

> It seems CMO extension might freeze sooner than we think (others can
> comment on this). If CMO extension is frozen by year end then we can
> trap-n-emulate CMO instructions instead of SBI DMA sync extension. If
> it does not freeze by year end then we will have to go ahead with
> SBI DMA sync extension as stop-gap solution.
>

The CMOs TG has a meeting today, I'll try and join and ask for updates
on this.





Re: [PATCH] Add direct memory access synchronize extension

Anup Patel
 

Hi Paul,

 

Everyone over here is well aware of the importance of fast-tracking basic CMO instructions and getting it frozen soon. The CMO group is also aware of their priorities so we should let them tackle this instead of proposing how they should work.

 

As mentioned quite a few times in this email thread, the SBI DMA sync is only a stop-gap solution (or backup plan) to tackle the Linux RISC-V patch acceptance policy if we don't get basic CMO instructions soon. We would certainly like to avoid the SBI DMA sync extension if possible. In fact, we have not included the SBI DMA sync extension in the recently frozen SBI v0.3-rc1 spec, which will be released next month.

 

It is certainly possible to have basic CMO instructions frozen by 2021 year end. If this happens then we will discard SBI DMA sync proposal and emulate basic CMO instructions in OpenSBI for BeagleV and Allwinner D1 boards. In fact, Atish is still figuring out ways to avoid both SBI DMA sync and CMO instructions for at least BeagleV if that is possible.
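For reference, a stop-gap SBI DMA sync call would use the standard SBI calling convention (extension ID in a7, function ID in a6, arguments in a0/a1, error and value returned in a0/a1).  The extension and function IDs below are placeholders - the extension is not part of any frozen SBI spec - but the ecall plumbing is the usual pattern:

/* Placeholder IDs for the proposed (never frozen) SBI DMA sync extension. */
#define SBI_EXT_DMA_SYNC         0x444D4153UL  /* hypothetical EID  */
#define SBI_DMA_SYNC_FOR_DEVICE  0             /* hypothetical FIDs */
#define SBI_DMA_SYNC_FOR_CPU     1

struct sbiret { long error; long value; };

/* Standard SBI calling convention: a7 = EID, a6 = FID, args in a0..a1. */
static struct sbiret sbi_ecall(unsigned long eid, unsigned long fid,
                               unsigned long arg0, unsigned long arg1)
{
    register unsigned long a0 asm("a0") = arg0;
    register unsigned long a1 asm("a1") = arg1;
    register unsigned long a6 asm("a6") = fid;
    register unsigned long a7 asm("a7") = eid;

    asm volatile("ecall"
                 : "+r"(a0), "+r"(a1)
                 : "r"(a6), "r"(a7)
                 : "memory");
    return (struct sbiret){ (long)a0, (long)a1 };
}

/* S-mode driver asks firmware to make [addr, addr+size) visible to a device. */
static long dma_sync_for_device(unsigned long addr, unsigned long size)
{
    return sbi_ecall(SBI_EXT_DMA_SYNC, SBI_DMA_SYNC_FOR_DEVICE, addr, size).error;
}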

 

Regards,

Anup

 

From: Paul Walmsley <paul.walmsley@...>
Sent: 16 June 2021 05:29
To: David Kruckemyer <dkruckemyer@...>; Nick Kossifidis <mick@...>
Cc: Anup Patel <Anup.Patel@...>; Atish Patra <Atish.Patra@...>; tech-unixplatformspec@...; Palmer Dabbelt <palmerdabbelt@...>; Palmer Dabbelt <palmer@...>; tech-cmo@...; John Ingalls <john.ingalls@...>
Subject: Re: [RISC-V] [tech-unixplatformspec] [PATCH] Add direct memory access synchronize extension

 

It would be ideal if the CMO group could focus on fast-tracking the Cache Block Maintenance Operations for Phase 1 and get opcodes assigned, and this part of the specification frozen.  The maintenance operations are mandatory for non-CPU-cache-coherent peripheral DMA to work correctly; that's why these should be completed first.   As far as I can tell, prefetch and zeroing are strictly optimizations, so it would be best if these could be delayed to a Phase 2 -- which could be developed in parallel while Phase 1 goes through the opcode committee, etc. 

 

Then the SBI sync extension should be superfluous. It would be ideal if we could avoid having multiple mechanisms for the same operations.

 

For this to work, though, the CMO group needs to move on the block maintenance instructions quickly. 

 

 

- Paul

 

 

On 6/15/21 4:33 PM, David Kruckemyer wrote:

Hi all,

 

My apologies as I just got wind of this discussion (I was unable to attend the last few CMO TG meetings due to travel). I think we should sync up on the CMO TG and SBI/platform efforts since there seems to be a bit of disconnect.

 

Regarding the CMO TG goals, we have intended to get a basic subset of operations into the profile/platform specifications for this year. The "phase 1" status is listed here:

 

 

Though honestly, a bit of this is out of date already, so expect some clarification in the coming days (just need to do some terminology cleanup).

 

Please do not hesitate to reach out to me with any questions (or to post questions to the CMO TG mailing list: tech-cmo@... )

 

Cheers,

David

 

 

On Mon, Jun 7, 2021 at 2:35 AM Nick Kossifidis <mick@...> wrote:

On 2021-06-07 07:03, Anup Patel wrote:
>
> Let's have a simple SBI DMA sync extension in SBI v0.4 spec.
>
> The shared code pages between M-mode and S-mode will have it's own
> Challenges and we will have to define more stuff in SBI spec to support
> this (see above).
>

Totally agree with you, I just thought it'd be a good opportunity to
bring this up so that we can discuss it at some point, let's have
something that works and we can optimize it later on.

> It seems CMO extension might freeze sooner than we think (others can
> comment on this). If CMO extension is frozen by year end then we can
> trap-n-emulate CMO instructions instead of SBI DMA sync extension. If
> it does not freeze by year end then we will have to go ahead with
> SBI DMA sync extension as stop-gap solution.
>

The CMOs TG has a meeting today, I'll try and join and ask for updates
on this.





Re: PCIe requirements: Memory vs I/O

Greg Favor
 

Thanks.


On Tue, Jun 15, 2021 at 6:00 PM Josh Scheid <jscheid@...> wrote:
On Tue, Jun 15, 2021 at 5:43 PM Josh Scheid via lists.riscv.org <jscheid=ventanamicro.com@...> wrote:

I can and will do that.  The point of raising this here is to explicitly confirm that the platform intent is to enable Memory PMA within, say, PCIe-managed regions.  With that confirmation now effectively clear we can push on the priv spec.



-Josh


Re: PCIe requirements: Memory vs I/O

Josh Scheid
 

On Tue, Jun 15, 2021 at 5:43 PM Josh Scheid via lists.riscv.org <jscheid=ventanamicro.com@...> wrote:

I can and will do that.  The point of raising this here is to explicitly confirm that the platform intent is to enable Memory PMA within, say, PCIe-managed regions.  With that confirmation now effectively clear we can push on the priv spec.



-Josh


Re: PCIe requirements: Memory vs I/O

Josh Scheid
 

On Tue, Jun 15, 2021 at 9:44 AM Greg Favor <gfavor@...> wrote:
This sentence is fraught with use of a few ill-defined terms, e.g. "regular main memory" and "device scratchpad RAMs" - which for now maybe isn't worth trying to "fix".  I would suggest making a PR on the Priv spec with a proposed rewording.  For example (with a goal of not totally replacing the sentence):

Memory regions that do not fit into regular main memory, for example, device-related RAMs, may be categorized as main memory regions or I/O regions based on the desired attributes.


I can and will do that.  The point of raising this here is to explicitly confirm that the platform intent is to enable Memory PMA within, say, PCIe-managed regions.  With that confirmation now effectively clear we can push on the priv spec.

-Josh


[PATCH 1/1] RAS features for OS-A platform server extension

Kumar Sankaran
 

Signed-off-by: Kumar Sankaran <ksankaran@...>
---
riscv-platform-spec.adoc | 42 ++++++++++++++++++++++++++--------------
1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/riscv-platform-spec.adoc b/riscv-platform-spec.adoc
index 4c356b8..d779452 100644
--- a/riscv-platform-spec.adoc
+++ b/riscv-platform-spec.adoc
@@ -19,18 +19,6 @@
// table of contents
toc::[]

-// document copyright and licensing information
-include::licensing.adoc[]
-
-// changelog for the document
-include::changelog.adoc[]
-
-// Introduction: describe the intent and purpose of the document
-include::introduction.adoc[]
-
-// Profiles: (NB: content from very first version)
-include::profiles.adoc[]
-
== Introduction
The platform specification defines a set of platforms that specify requirements
for interoperability between software and hardware. The platform policy
@@ -68,11 +56,13 @@ The M platform has the following extensions:
|SBI | Supervisor Binary Interface
|UEFI | Unified Extensible Firmware Interface
|ACPI | Advanced Configuration and Power Interface
+|APEI | ACPI Platform Error Interfaces
|SMBIOS | System Management Basic I/O System
|DTS | Devicetree source file
|DTB | Devicetree binary
|RVA22 | RISC-V Application 2022
|EE | Execution Environment
+|OSPM | Operating System Power Management
|RV32GC | RISC-V 32-bit general purpose ISA described as RV32IMAFDC.
|RV64GC | RISC-V 64-bit general purpose ISA described as RV64IMAFDC.
|===
@@ -87,6 +77,7 @@ The M platform has the following extensions:
|link:[RVA22 Specification]
| TBD
|link:https://arm-software.github.io/ebbr/[EBBR Specification]
| v2.0.0-pre1
|link:https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf[ACPI
Specification] | v6.4
+|link:https://uefi.org/specs/ACPI/6.4/18_ACPI_Platform_Error_Interfaces/ACPI_PLatform_Error_Interfaces.html[APEI
Specification] | v6.4
|link:https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf[SMBIOS
Specification] | v3.4.0
|link:[Platform Policy]
| TBD
|===
@@ -504,6 +495,30 @@ delegate the virtual supervisor timer interrupt
to 'VS' mode.
* IOMMU

==== RAS
+All the below mentioned RAS features are required for the OS-A platform server
+extension
+
+* Main memory must be protected with SECDED-ECC +
+* All cache structures must be protected +
+** single-bit errors must be detected and corrected +
+** multi-bit errors can be detected and reported +
+* There must be memory-mapped RAS registers associated with these protected
+structures to log detected errors with information about the type and location
+of the error +
+* The platform must support the APEI specification to convey all error
+information to OSPM +
+* Correctable errors must be reported by hardware and either be corrected or
+recovered by hardware, transparent to system operation and to software +
+* Hardware must provide status of these correctable errors via RAS registers +
+* Uncorrectable errors must be reported by the hardware via RAS error
+registers for system software to take the needed corrective action +
+* Attempted use of corrupted (uncorrectable) data must result in a precise
+exception on that instruction with a distinguishing custom exception cause
+code +
+* Errors logged in RAS registers must be able to generate an interrupt request
+to the system interrupt controller that may be directed to either M-mode or
+S/HS-mode for firmware-first versus OS-first error reporting +
+* PCIe AER capability is required +

// M Platform
== M Platform
@@ -593,6 +608,3 @@ also implement PMP support.
When PMP is supported it is recommended to include at least 4 regions, although
if possible more should be supported to allow more flexibility. Hardware
implementations should aim for supporting at least 16 PMP regions.
-
-// acknowledge all of the contributors
-include::contributors.adoc[]
--
2.21.0


Re: [PATCH] Add direct memory access synchronize extension

Paul Walmsley
 

It would be ideal if the CMO group could focus on fast-tracking the Cache Block Maintenance Operations for Phase 1 and get opcodes assigned, and this part of the specification frozen.  The maintenance operations are mandatory for non-CPU-cache-coherent peripheral DMA to work correctly; that's why these should be completed first.   As far as I can tell, prefetch and zeroing are strictly optimizations, so it would be best if these could be delayed to a Phase 2 -- which could be developed in parallel while Phase 1 goes through the opcode committee, etc. 


Then the SBI sync extension should be superfluous. It would be ideal if we could avoid having multiple mechanisms for the same operations.


For this to work, though, the CMO group needs to move on the block maintenance instructions quickly. 
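To spell out why the maintenance operations are the critical piece: a non-coherent DMA path only needs clean (before the device reads a buffer) and invalidate (after the device writes one), applied per cache block.  The wrappers below are hypothetical helpers around whatever block clean/invalidate instructions the CMO TG ultimately freezes, and the block size is an assumption that would really be discovered from DT/ACPI:

#include <stddef.h>

/* Hypothetical wrappers around the eventual cache-block maintenance ops. */
extern void cbo_clean(const void *addr);  /* write back one cache block  */
extern void cbo_inval(const void *addr);  /* invalidate one cache block  */

#define CACHE_BLOCK 64  /* assumed block size */

/* Before a device reads the buffer: push dirty data out of the CPU caches. */
static void dma_sync_to_device(const void *buf, size_t len)
{
    for (size_t off = 0; off < len; off += CACHE_BLOCK)
        cbo_clean((const char *)buf + off);
}

/* After a device has written the buffer: discard stale cached copies. */
static void dma_sync_from_device(const void *buf, size_t len)
{
    for (size_t off = 0; off < len; off += CACHE_BLOCK)
        cbo_inval((const char *)buf + off);
}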



- Paul



On 6/15/21 4:33 PM, David Kruckemyer wrote:
Hi all,

My apologies as I just got wind of this discussion (I was unable to attend the last few CMO TG meetings due to travel). I think we should sync up on the CMO TG and SBI/platform efforts since there seems to be a bit of disconnect.

Regarding the CMO TG goals, we have intended to get a basic subset of operations into the profile/platform specifications for this year. The "phase 1" status is listed here:


Though honestly, a bit of this is out of date already, so expect some clarification in the coming days (just need to do some terminology cleanup).

Please do not hesitate to reach out to me with any questions (or to post questions to the CMO TG mailing list: tech-cmo@... )

Cheers,
David


On Mon, Jun 7, 2021 at 2:35 AM Nick Kossifidis <mick@...> wrote:
On 2021-06-07 07:03, Anup Patel wrote:
>
> Let's have a simple SBI DMA sync extension in SBI v0.4 spec.
>
> The shared code pages between M-mode and S-mode will have it's own
> Challenges and we will have to define more stuff in SBI spec to support
> this (see above).
>

Totally agree with you, I just thought it'd be a good opportunity to
bring this up so that we can discuss it at some point, let's have
something that works and we can optimize it later on.

> It seems CMO extension might freeze sooner than we think (others can
> comment on this). If CMO extension is frozen by year end then we can
> trap-n-emulate CMO instructions instead of SBI DMA sync extension. If
> it does not freeze by year end then we will have to go ahead with
> SBI DMA sync extension as stop-gap solution.
>

The CMOs TG has a meeting today, I'll try and join and ask for updates
on this.






Re: [PATCH] Add direct memory access synchronize extension

David Kruckemyer
 

Hi all,

My apologies as I just got wind of this discussion (I was unable to attend the last few CMO TG meetings due to travel). I think we should sync up on the CMO TG and SBI/platform efforts since there seems to be a bit of disconnect.

Regarding the CMO TG goals, we have intended to get a basic subset of operations into the profile/platform specifications for this year. The "phase 1" status is listed here:


Though honestly, a bit of this is out of date already, so expect some clarification in the coming days (just need to do some terminology cleanup).

Please do not hesitate to reach out to me with any questions (or to post questions to the CMO TG mailing list: tech-cmo@... )

Cheers,
David


On Mon, Jun 7, 2021 at 2:35 AM Nick Kossifidis <mick@...> wrote:
On 2021-06-07 07:03, Anup Patel wrote:
>
> Let's have a simple SBI DMA sync extension in SBI v0.4 spec.
>
> The shared code pages between M-mode and S-mode will have it's own
> Challenges and we will have to define more stuff in SBI spec to support
> this (see above).
>

Totally agree with you, I just thought it'd be a good opportunity to
bring this up so that we can discuss it at some point, let's have
something that works and we can optimize it later on.

> It seems CMO extension might freeze sooner than we think (others can
> comment on this). If CMO extension is frozen by year end then we can
> trap-n-emulate CMO instructions instead of SBI DMA sync extension. If
> it does not freeze by year end then we will have to go ahead with
> SBI DMA sync extension as stop-gap solution.
>

The CMOs TG has a meeting today, I'll try and join and ask for updates
on this.






Re: PCIe requirements: Memory vs I/O

Greg Favor
 

This sentence is fraught with use of a few ill-defined terms, e.g. "regular main memory" and "device scratchpad RAMs" - which for now maybe isn't worth trying to "fix".  I would suggest making a PR on the Priv spec with a proposed rewording.  For example (with a goal of not totally replacing the sentence):

Memory regions that do not fit into regular main memory, for example, device-related RAMs, may be categorized as main memory regions or I/O regions based on the desired attributes.

Greg






On Tue, Jun 15, 2021 at 9:11 AM Josh Scheid <jscheid@...> wrote:
On Mon, Jun 14, 2021 at 7:10 PM Greg Favor <gfavor@...> wrote:
On Mon, Jun 14, 2021 at 3:56 PM Josh Scheid <jscheid@...> wrote:
The proposal allows prefetchable BARs to be programmed as either I/O or Memory.  This seems to conflict with the priv spec, which states:

"""
Memory regions that do not fit into regular main memory, for example, device scratchpad RAMs,
are categorized as I/O regions.
"""

This is for outbound traffic and, if one sets aside the word "I/O" in the proposed text saying "two I/O regions for mapping ..." (e.g. replacing "I/O" with "address"), then is there a conflict? 

The prefetchable BAR can be "mapped" by either a PMA "main memory" region or by a PMA "I/O" region.


The conflict is that the statement in the priv spec suggests that things like "device scratchpad RAMs", such as those that might be in PCIe land, "are" I/O, in that they are not Memory.  Moving that priv spec statement to be illustrative and non-normative may be a solution.  Perhaps it's not really meant to be a restriction, but then a more obvious I/O example than a "device scratchpad RAM" would be better, as well as making it non-normative.

-Josh


Re: PCIe requirements: Memory vs I/O

Josh Scheid
 

On Mon, Jun 14, 2021 at 7:10 PM Greg Favor <gfavor@...> wrote:
On Mon, Jun 14, 2021 at 3:56 PM Josh Scheid <jscheid@...> wrote:
The proposal allows prefetchable BARs to be programmed as either I/O or Memory.  This seems to conflict with the priv spec, which states:

"""
Memory regions that do not fit into regular main memory, for example, device scratchpad RAMs,
are categorized as I/O regions.
"""

This is for outbound traffic and, if one sets aside the word "I/O" in the proposed text saying "two I/O regions for mapping ..." (e.g. replacing "I/O" with "address"), then is there a conflict? 

The prefetchable BAR can be "mapped" by either a PMA "main memory" region or by a PMA "I/O" region.


The conflict is that the statement in the priv spec suggests that things like "device scratchpad RAMs", such as those that might be in PCIe land, "are" I/O, in that they are not Memory.  Moving that priv spec statement to be illustrative and non-normative may be a solution.  Perhaps it's not really meant to be a restriction, but then a more obvious I/O example than a "device scratchpad RAM" would be better, as well as making it non-normative.

-Josh


Re: [RFC PATCH 1/1] server extension: PCIe requirements

Abner Chang
 

Hi Mayuresh,
As I mentioned in the platform meeting, we missed the requirement for firmware. I have added it in the section below; please rephrase it if you want.

Regards,
Abner

On Thu, Jun 10, 2021 at 2:27 AM, Mayuresh Chitale <mchitale@...> wrote:
This patch adds requirements for PCIe support for the server extension

Signed-off-by: Mayuresh Chitale <mchitale@...>

Signed-off-by: Mayuresh Chitale <mchitale@...>
---
 riscv-platform-spec.adoc | 133 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 132 insertions(+), 1 deletion(-)

diff --git a/riscv-platform-spec.adoc b/riscv-platform-spec.adoc
index 4418788..9de487e 100644
--- a/riscv-platform-spec.adoc
+++ b/riscv-platform-spec.adoc
@@ -363,7 +363,138 @@ https://lists.riscv.org/g/tech-privileged/message/404[Sstc] extension.
 ** Platforms are required to delegate the supervisor timer interrupt to 'S'
 mode. If the 'H' extension is implemented then the platforms are required to
 delegate the virtual supervisor timer interrupt to 'VS' mode.
-* PCI-E
+
+===== PCIe
+Platforms are required to support PCIe
+footnote:[https://pcisig.com/specifications]. Following are the requirements:
+
+====== PCIe Config Space
+* Platforms shall support access to the PCIe config space via ECAM as described
+in the PCI Express Base specification.
+* The entire config space for a single PCIe domain should be accessible via a
+single ECAM I/O region.
+* Platform firmware should implement the MCFG table to allow the operating
+systems to discover the supported PCIe domains and map the ECAM I/O region for
+each domain.
+* ECAM I/O regions shall be configured as channel 0 I/O regions.
+
+====== PCIe Memory Space
+* PCIe Outbound region +
+Platforms are required to provide at least two I/O regions for mapping the
+memory requested by PCIe endpoints and PCIe bridges/switches through BARs.
+The first I/O region is required to be located below 4G physical address to
+map the memory requested by non-prefetchable BARs. This region shall be
+configured as channel 0 I/O region. The second I/O region is required to be
+located above 4G physical address to map the memory requested by prefetchable
+BARs. This region may be configured as I/O region or as memory region.
+
+* PCIe Inbound region +
+For security reasons, platforms are required to provide a mechanism to
+restrict the inbound accesses over PCIe to certain specific regions in
+the address space such as the DRAM.
+
+====== PCIe Interrupts
+* Platforms shall support both INTx and MSI/MSI-x interrupts.
+* Integration with AIA +
+TBD
+
+====== PCIe I/O coherency
+Following are the requirements:
+
+* Platforms shall provide a mechanism to control the `NoSnoop` bit for any
+outbound TLP.
+* If the host bridge/root port receives a TLP which does not have `NoSnoop` bit
+set then hardware shall generate a snoop request.
+* If the host bridge/root port receives a TLP which has `NoSnoop` set then no
+hardware coherency is required. Software coherency may be required via CMOs.
+
+====== PCIe Topology
+Platforms are required to implement at least one of the following topologies and
+the components required in that topology.
+
+[ditaa]
+....
+
+            +----------+                             +----------+
+            |   CPU    |                             |   CPU    |
+            |          |                             |          |
+            +-----|----+                             +-----|----+
+                  |                                        |
+                  |                                        |
+    +-------------|------------+             +-------------|------------+
+    |        ROOT | COMPLEX    |             |        ROOT | COMPLEX    |
+    |                          |             |                          |
+    |      +------|-------+    |             |      +------|-------+    |
+    |      |  Host Bridge |    |             |      |  Host Bridge |    |
+    |      +------|-------+    |             |      +------|-------+    |
+    |             |            |             |             |            |
+    |             | BUS 0      |             |             | BUS 0      |
+    |     |-------|------|     |             |       +-----|-------+    |
+    |     |              |     |             |       | ROOT  PORT  |    |
+    |     |              |     |             |       +-----|-------+    |
+    | +---|---+      +---|---+ |             |             |            |
+    | | RCEIP |      | RCEC  | |             |             | PCIe Link  |
+    | +-------+      +-------+ |             |             |            |
+    |                          |             +-------------|------------+
+    +--------------------------+                           |
+                                                           |  BUS 1
+    RCEIP - Root complex integrated endpoint
+    RCEC - Root complex event collector
+....
+
+* Host Bridge +
+Following are the requirements for host bridges:
+
+** Any read or write access by a hart to an ECAM I/O region shall be converted
+by the host bridge into the corresponding PCIe config read or config write
+request.
+** Any read or write access by a hart to a PCIe outbound region shall be
+forwarded by the host bridge to a BAR or prefetch/non-prefetch memory window,
+if the address falls within the region claimed by the BAR or prefetch/
+non-prefetch memory window. Otherwise the host bridge shall return an error.
+
+** Host bridge shall return all 1s in the following cases:
+*** Config reads to non-existent functions and devices on the root bus.
+*** Config reads that receive Unsupported Request response from functions and
+devices on the root bus.
+* Root ports +
+Following are the requirements for root ports.
+** Root ports shall appear as PCI-PCI bridge to software.
+** Root ports shall implement all registers of the Type 1 header.
+** Root ports shall implement all capabilities specified in the PCI Express
+Base specification for a root port.
+** Root ports shall forward type 1 configuration access when the bus number in
+the TLP is greater than the root port's secondary bus number and less than or
+equal to the root port's subordinate bus number.
+** Root ports shall convert type 1 configuration access to a type 0
+configuration access when the bus number in the TLP is equal to the root port's
+secondary bus number.
+** Root ports shall respond to any type 0 configuration accesses it receives.
+** Root ports shall forward memory accesses targeting its prefetch/non-prefetch
+memory windows to downstream components. If address of the transaction does not
+fall within the regions claimed by prefetch/non-prefetch memory windows then
+the root port shall generate an Unsupported Request.
+** The root port requester ID or completer ID shall be formed using the BDF of the
+root port.
+** Root ports shall support CRS Software Visibility.
+** Root ports shall return all 1s in the following cases:
+*** Config reads to non-existent functions and devices on the secondary bus.
+*** Config reads that receive Unsupported Request from downstream components.
+*** Config read when root port's link is down.
+** The root port shall implement the AER capability.
+
+* RCEIP +
+All the requirements for RCEIP in the PCI Express Base specification shall be implemented.
+In addition the following requirements shall be met:
+** If RCEIP is implemented then RCEC shall be implemented as well. All
+requirements for RCEC specified in the PCI Express Base specification shall be
+implemented. RCEC is required to terminate the AER and PME messages from RCEIP.
+** If both the topologies mentioned above are supported then RCEIP and RCEC
+shall be implemented in a separate PCIe domain and shall be addressable via a
+separate ECAM I/O region.
+
+====== PCIe peer to peer transactions +
+TBD

====== PCIe Device Firmware Requirement 
A PCI expansion ROM code type 3 (UEFI) image must be provided by a PCIe device for the OS-A
server extension platform
if that PCIe device is utilized during the UEFI firmware boot process. The image stored in the PCI
expansion ROM is a UEFI driver that must be compliant with https://uefi.org/specifications[UEFI specification 2.9]
section 14.4.2, PCI Option ROMs.


 ==== Secure Boot
 * TEE
--
2.17.1
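For reference on the ECAM requirement above: the PCI Express Base specification fixes the layout of the ECAM window, so a config access is just a load or store at a computed offset (bus in bits 27:20, device in 19:15, function in 14:12, register offset in 11:0).  A minimal sketch, with a placeholder base address that would really come from the ACPI MCFG table:

#include <stdint.h>

/* Placeholder; the real base for each PCIe domain comes from the MCFG table. */
#define ECAM_BASE 0x30000000ul

/* ECAM offset layout per the PCI Express Base specification. */
static inline volatile uint32_t *ecam_addr(uint8_t bus, uint8_t dev,
                                           uint8_t fn, uint16_t reg)
{
    return (volatile uint32_t *)(ECAM_BASE +
                                 ((unsigned long)bus << 20) +
                                 ((unsigned long)dev << 15) +
                                 ((unsigned long)fn  << 12) +
                                 (reg & 0xFFCu));
}

/* Read Vendor/Device ID (config offset 0); all-ones means "no device". */
static uint32_t pci_read_id(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return *ecam_addr(bus, dev, fn, 0);
}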







Re: [RFC PATCH 1/1] server extension: PCIe requirements

Greg Favor
 

On Mon, Jun 14, 2021 at 5:23 PM Josh Scheid <jscheid@...> wrote:

I understand that IOPMP is not an IOMMU, but to the extent that it is a general "bus master memory protection" widget, it can be used by M-mode to ensure simple things, such as that S-mode-SW-controlled PCIe initiators can not access address regions not accessible by S-mode. 

Yes, most likely IOPMP can be used to do this.
 
For example, the platform spec could avoid mentioning the IOPMP proposal, but state that the platform is required to have a mechanism to allow M-mode SW to control (including prevent) PCIe initiator access to regions of system address space.  While remaining open to custom implementations, it's clear on the functional intent.

That would be appropriate.  And, for example, one much simpler implementation approach (than the IOPMP proposals) would be to replicate a CPU PMP block as an "IO PMP" in front of each I/O device or group of devices.

That would allow M-mode software to only have to deal with one PMP software programming model across all masters in the system.
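To illustrate the "one PMP programming model" point: the PMP entry format is already defined by the privileged spec, so an IO PMP that reuses it could be programmed with the same encoding.  A sketch of the NAPOT address encoding (how an IO PMP's registers are actually exposed, e.g. memory-mapped, is implementation-specific and not shown):

#include <stdint.h>

/* pmpcfg permission/mode bits as defined by the RISC-V privileged spec. */
#define PMP_R       (1u << 0)
#define PMP_W       (1u << 1)
#define PMP_X       (1u << 2)
#define PMP_A_NAPOT (3u << 3)
#define PMP_L       (1u << 7)

/*
 * NAPOT encoding: for a naturally aligned 2^n-byte region (n >= 3),
 * pmpaddr holds (base >> 2) with the low (n - 3) bits set to one.
 */
static uint64_t pmp_napot_addr(uint64_t base, uint64_t size)
{
    return (base >> 2) | ((size >> 3) - 1);
}

/* Example: allow read/write DMA only into a 64 KiB window at 0x8000_0000. */
static void iopmp_example(void)
{
    uint64_t addr = pmp_napot_addr(0x80000000ull, 0x10000ull);
    uint8_t  cfg  = PMP_R | PMP_W | PMP_A_NAPOT;

    (void)addr;
    (void)cfg;  /* written to the IO PMP's (implementation-defined) registers */
}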

Greg


Re: PCIe requirements: Memory vs I/O

Greg Favor
 

On Mon, Jun 14, 2021 at 3:56 PM Josh Scheid <jscheid@...> wrote:
The proposal allows prefetchable BARs to be programmed as either I/O or Memory.  This seems to conflict with the priv spec, which states:

"""
Memory regions that do not fit into regular main memory, for example, device scratchpad RAMs,
are categorized as I/O regions.
"""

This is for outbound traffic and, if one sets aside the word "I/O" in the proposed text saying "two I/O regions for mapping ..." (e.g. replacing "I/O" with "address"), then is there a conflict? 

The prefetchable BAR can be "mapped" by either a PMA "main memory" region or by a PMA "I/O" region.

Greg


Re: [RFC PATCH 1/1] server extension: PCIe requirements

Josh Scheid
 

On Mon, Jun 14, 2021 at 4:02 PM Greg Favor <gfavor@...> wrote:
On Mon, Jun 14, 2021 at 2:28 PM Josh Scheid <jscheid@...> wrote:
+For security reasons, platforms are required to provide a mechanism to
+restrict the inbound accesses over PCIe to certain specific regions in
+the address space such as the DRAM.

While a standard IOMMU is further off, is the current opinion that the IOPMP is not in a position to be required or suggested as an implementation of the above requirement?  If not, then it's hard to check for compliance.

I'm not sure if an IOPMP could be used for this particular purpose, but more generally IOPMP is being driven by embedded people and isn't consciously thinking about functionality requirements implied by H-style virtualization, or PCIe MSIs, or other PCIe features.  In this regard IOPMP is analogous to PLIC and CLIC - and not generally suitable for OS/A platforms (and presumably is well-suited for M platforms).

I understand that IOPMP is not an IOMMU, but to the extent that it is a general "bus master memory protection" widget, it can be used by M-mode to ensure simple things, such as that S-mode-SW-controlled PCIe initiators can not access address regions not accessible by S-mode.  There's value in memory protection even without full virtualization support.  I'm questioning how vague the memory protection "requirement" should be to the extent that it ends up being usable and sufficient to provide a defined level of assurance. 

For example, the platform spec could avoid mentioning the IOPMP proposal, but state that the platform is required to have a mechanism to allow M-mode SW to control (including prevent) PCIe initiator access to regions of system address space.  While remaining open to custom implementations, it's clear on the functional intent.

-Josh
