Re: Small tweak to Privileged spec regarding PMP management?

Greg Favor

One could argue that the current spec and the sentence in question (with or without the suggested modification) are clear in calling out when an sfence.vma is not required. But I agree that adding a short non-normative note would avoid any chance of misunderstanding.


On Mon, Aug 10, 2020 at 12:00 AM Allen Baum <allen.baum@...> wrote:
Do you want to add more detail about the case where page-based virtual memory is disabled?
    (that some implementations may require sfence.vma, depending on whether they do XXX with their TLB)?
That would be non-normative, but will alert designers about this corner case.

On Sun, Aug 9, 2020 at 11:45 PM Greg Favor <gfavor@...> wrote:
In section 3.6.2 of the Privileged spec discussing changing PMP settings, it currently says:
"If page-based virtual memory is not implemented, or when it is disabled, memory accesses check the PMP settings synchronously, so no fence is needed."

I would like to suggest removing "or when it is disabled" and just say:
"If page-based virtual memory is not implemented, memory accesses check the PMP settings synchronously, so no fence is needed."

The motivation is that high-performance implementations that support page-based virtual memory have TLBs and want to use them to handle all fetch/load/store memory accesses as they go down load/store execution pipelines during all modes of execution, including while in M-mode. In the case of M-mode, they would effectively just be caching PMA/PMP permission/access control info (as well as identity address mappings).
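
To make the concern concrete, here is a minimal sketch of a toy model in Python, not a description of any real core. It assumes a hypothetical `ToyCore` whose TLB-like cache keeps a copy of PMP permissions on first access, and whose PMP CSR writes are plain writes that leave the cache alone. The class, its method names, and the 4 KiB `REGION_SIZE` are all illustrative assumptions.

```python
REGION_SIZE = 0x1000  # hypothetical granule, chosen only for this toy model


class ToyCore:
    """Toy core: PMP state plus a TLB-like cache of PMP permission results."""

    def __init__(self):
        self.pmp = {}  # region index -> set of allowed access kinds ("r", "w", "x")
        self.tlb = {}  # cached copy of PMP permissions, filled on first access

    def write_pmp(self, region, perms):
        """Plain CSR write: updates PMP state and does not touch the cache."""
        self.pmp[region] = set(perms)

    def sfence_vma(self):
        """Model of the fence: drop every cached permission entry."""
        self.tlb.clear()

    def access_allowed(self, addr, kind):
        """Check an access through the cache, refilling from PMP on a miss."""
        region = addr // REGION_SIZE
        if region not in self.tlb:
            self.tlb[region] = set(self.pmp.get(region, ()))
        return kind in self.tlb[region]


core = ToyCore()
core.write_pmp(0, "rw")
before = core.access_allowed(0x100, "w")  # allowed; the result is now cached

core.write_pmp(0, "r")                    # revoke write permission
stale = core.access_allowed(0x100, "w")   # still allowed: cached entry is stale

core.sfence_vma()
after = core.access_allowed(0x100, "w")   # denied once the cache is flushed
```

In this model the store is still permitted after the PMP write until the fence runs, which is the behavior the "or when it is disabled" wording would rule out for such a design.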

For designs that implement page-based virtual memory and use their TLBs as described (which is generally true in high-performance designs), not requiring that M-mode software do an sfence.vma after a series of PMP CSR writes means that these CSR writes cannot simply be implemented as CSR writes, but instead each PMP CSR write needs to also perform a heavyweight sfence.vma operation. This is both heavily redundant (across a series of PMP writes) and unnatural for an aggressive out-of-order RISC design in which an sfence.vma operation really is a very strong fencing operation as well as a TLB invalidation operation. (Put differently, a key point of RISC architecture is to simplify hardware in ways that software can easily and efficiently support.)

Given that M-mode software runs a lot of implementation-specific code (including code related to PMA and PMP management), this spec tweak allows some implementations to simplify their hardware design and include an sfence.vma in their M-mode PMP CSR writing code (while other implementations can choose to not include an sfence.vma in their M-mode code). But also note that all designs need to at least selectively do an sfence.vma (per section 3.6.2), so this essentially means that the M-mode code would simply always do an sfence.vma after a series of PMP writes.
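
The redundancy argument can be sketched by counting invalidations under the two policies. This is a rough Python model under stated assumptions, not firmware: `FlushCounter`, `reprogram_pmp`, and the entry values are hypothetical names and data. In real M-mode code the sequence would be a run of PMP CSR writes followed by a single sfence.vma with rs1=x0 and rs2=x0.

```python
class FlushCounter:
    """Toy core that counts full TLB invalidations."""

    def __init__(self, flush_on_every_pmp_write):
        # True models hardware that must fence implicitly on each PMP CSR write.
        self.flush_on_every_pmp_write = flush_on_every_pmp_write
        self.flushes = 0
        self.pmp = {}

    def write_pmp(self, index, value):
        self.pmp[index] = value
        if self.flush_on_every_pmp_write:
            self.flushes += 1  # implicit heavyweight fence per write

    def sfence_vma(self):
        self.flushes += 1      # explicit fence issued by software


def reprogram_pmp(core, entries, fence_in_software):
    """Write a series of PMP entries, then optionally fence once at the end."""
    for index, value in entries.items():
        core.write_pmp(index, value)
    if fence_in_software:
        core.sfence_vma()
    return core.flushes


# Eight placeholder entries; the values carry no meaning in this model.
entries = {i: 0x1000 * i for i in range(8)}

implicit = reprogram_pmp(FlushCounter(True), entries, fence_in_software=False)
explicit = reprogram_pmp(FlushCounter(False), entries, fence_in_software=True)
print(implicit, explicit)  # prints "8 1"
```

With eight entries, the implicit policy performs eight invalidations where the software-fenced sequence performs one.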

Lastly, note that this change is backward compatible in that software that does do an sfence.vma after PMP changes will run on "old" designs that support page-based virtual memory yet access PMAs and PMPs inline with load/store execution while in M-mode.

Any objections to this simple accommodation for high-performance CPU designs?

