Re: Platform specification questions

Greg Favor

On Mon, Dec 13, 2021 at 11:06 AM Vedvyas Shanbhogue <ved@...> wrote:
>The current wording is the following.
>All cache structures must be protected.
>single-bit errors must be detected and corrected.
>multi-bit errors can be detected and reported.
>Platforms are free to implement more advanced features than the
>minimalistic requirements that are mandated here. So we should be OK.

Could I suggest:
"Cache structures must be protected to address the Failure-in-time (FIT) requirements. The protection mechanisms may include single-bit/multi-bit error detection and/or single/multi-bit error detection/correction schemes, replaying faulting instructions, lock-step execution, etc."

This seems like a toothless and qualitative mandate since no FIT requirements are specified. It could be a suggestion instead, though only a qualitative one. It's essentially just saying "don't forget to consider FIT requirements". One can imagine a hundred such reminders that factor into high-end silicon design. Why highlight just this one?

The reference to "cache structures" is also incomplete, as well as ambiguous as to whether it refers just to caches (in the most popular sense of the word) or also to other caching structures like TLBs. Almost all RAM-based structures in which an error can result in functional failure need to be protected, although one can take the view that the above text was just trying to express a minimum requirement that doesn't encompass all RAM-based structures. My suggestion would be something like the following two statements:

Mandate:  At a minimum, caching structures must be protected such that single-bit errors are detected and corrected by hardware.

Recommendation:  Depending on FIT rate requirements, more advanced protection, more complete protection coverage of other structures, and/or more features may be necessary (starting with at least SECDED ECC on caching structures holding locally modified data).
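
For readers less familiar with the term, here is a minimal sketch of what SECDED (single-error correct, double-error detect) means in practice, using an extended Hamming (8,4) code. This is purely illustrative and not part of the proposed text: real cache ECC usually protects much wider words (e.g. a (72,64) code) in hardware, and the names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    c = [0] * 8
    # Data bits sit at the non-power-of-two positions 3, 5, 6, 7.
    c[3], c[5], c[6], c[7] = d
    # Hamming check bits at positions 1, 2, 4.
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    # Position 0 holds an overall parity bit over positions 1..7.
    for pos in range(1, 8):
        c[0] ^= c[pos]
    return c


def decode(codeword):
    """Return (status, nibble); nibble is None if the error is uncorrectable."""
    c = list(codeword)
    # The syndrome is the XOR of the positions of all set bits (0 if clean).
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos
    # Overall parity across all 8 bits is 0 for a valid codeword.
    overall = 0
    for bit in c:
        overall ^= bit

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and correct it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        c[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: double-bit error.
        return "uncorrectable", None

    nibble = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return status, nibble
```

In these terms, the mandate corresponds to the "corrected" path, and the recommendation's SECDED baseline adds the "uncorrectable" path, which hardware would report rather than silently consume.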

