Re: [PATCH 1/1] RAS features for OS-A platform server extension
- The platform should provide the capability to configure each RAS error to trigger either a firmware-first or an OS-first error interrupt.
- If a RAS error is handled by firmware, the firmware should be able to choose whether to expose the error to S/HS mode for further processing or to hide it from S/HS software. This requires a mechanism provided by the platform, and that mechanism should be protected by M-mode.
I would have thought that this is just a software issue. What kind of hardware mechanism do you picture being needed?
- It should be possible to mask each RAS error through RAS configuration registers.
By "mask" do you mean masking of generation of an error interrupt?
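To make the routing, hiding, and masking points above concrete, here is a minimal sketch of the decision logic only. It assumes that "mask" means suppressing generation of the error interrupt, which is exactly the question still open above, and every name in it (`Route`, `ErrorConfig`, `deliver`, `expose_to_os`) is hypothetical rather than taken from any RISC-V or platform specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Route(Enum):
    FIRMWARE_FIRST = "firmware-first"  # interrupt is taken by M-mode firmware
    OS_FIRST = "os-first"              # interrupt is taken by S/HS-mode software


@dataclass
class ErrorConfig:
    """Hypothetical per-error settings, writable only from M-mode."""
    route: Route = Route.OS_FIRST
    masked: bool = False         # assumption: masked means no interrupt at all
    expose_to_os: bool = True    # firmware-first only: forward after handling?


def deliver(cfg: ErrorConfig) -> List[str]:
    """Return, in order, the privilege levels that get to see the error."""
    if cfg.masked:
        return []
    if cfg.route is Route.OS_FIRST:
        return ["S/HS"]
    # Firmware-first: M-mode logs or corrects the error, then applies policy.
    return ["M", "S/HS"] if cfg.expose_to_os else ["M"]
```

With these assumptions, a firmware-first error with `expose_to_os=False` is seen only by M-mode, which is the "hide the error from S/HS software" case. Whether that choice needs hardware support or is purely a firmware policy is the point under discussion.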
- We should also consider triggering a RAS error interrupt to the TEE, which is where the firmware management mode resides.
Wouldn't the TEE be running in M-mode? Or where is it expected to be running?
For PCIe RAS,
- The baseline PCIe error or AER interrupt can be morphed into a firmware-first interrupt before being delivered to S/HS software. This gives firmware a chance to log the error, correct the error, or hide the error from S/HS software according to OEM RAS policy.
On x86 and ARM platforms, doesn't the OS pretty much always handle PCIe AER errors (i.e. OS-first for this class of errors)? (I was reading an Intel overview doc recently that essentially said that, irrespective of whether other classes of errors are OS-first or firmware-first.)
Besides memory and PCIe RAS, do we have RAS errors for the processor/hart, such as an IPI error or some CE/UC/UCR errors reported to the hart?
Definitely there will be processor/hart errors. Presumably each hart would output one or more RAS interrupt request signals.