The motivation is that high-performance implementations that support page-based virtual memory have TLBs and want to use them to handle all fetch/load/store memory accesses as they go down load/store execution pipelines during all modes of execution - including while in M-mode.
In the case of M-mode, the TLBs would effectively just be caching PMA/PMP permission/access control info (as well as identity address mappings).
For designs that implement page-based virtual memory and use their TLBs as described (which is generally true of high-performance designs), not requiring M-mode software to do an sfence.vma after a series of PMP CSR writes means that these writes cannot be implemented as simple CSR writes; instead, each PMP CSR write must also perform a heavyweight sfence.vma operation. This is both heavily redundant (across a series of PMP writes) and unnatural for an aggressive out-of-order RISC design, in which an sfence.vma is a very strong fencing operation as well as a TLB invalidation operation. (Put differently, a key point of RISC architecture is to simplify hardware in ways that software can easily and efficiently support.)
Given that M-mode software runs a lot of implementation-specific code (including code related to PMA and PMP management), this spec tweak allows some implementations to simplify their hardware design and include an sfence.vma in their M-mode PMP CSR writing code (while other implementations can choose not to include an sfence.vma in their M-mode code). But also note that all designs need to at least selectively do an sfence.vma (per section 3.6.2), so this essentially means that the M-mode code would simply always do an sfence.vma after a series of PMP writes.
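
To make the software side concrete, here is a rough sketch of what the M-mode code might look like: a batch of PMP CSR writes followed by a single sfence.vma. The function name pmp_set_two_regions, the two macros, and the choice of two regions are hypothetical and only for illustration; the non-RISC-V branch is a stand-in that merely counts operations, so the sketch can also be compiled on a host machine.

```c
#include <stdint.h>

#if defined(__riscv)
/* On a RISC-V target, emit the real instructions (M-mode only). */
#define CSR_WRITE(name, val) \
    __asm__ volatile("csrw " #name ", %0" : : "r"(val) : "memory")
#define SFENCE_VMA_ALL() \
    __asm__ volatile("sfence.vma x0, x0" : : : "memory")
#else
/* Host stand-in (an assumption for illustration): count operations only. */
static unsigned long pmp_csr_write_count;
static unsigned long sfence_count;
#define CSR_WRITE(name, val) \
    do { (void)(val); pmp_csr_write_count++; } while (0)
#define SFENCE_VMA_ALL() \
    do { sfence_count++; } while (0)
#endif

/* Hypothetical helper: program two PMP regions, then fence once. */
static void pmp_set_two_regions(uintptr_t addr0, uintptr_t addr1,
                                uintptr_t cfg0)
{
    /* Plain CSR writes; none of them needs to invalidate TLBs itself. */
    CSR_WRITE(pmpaddr0, addr0);
    CSR_WRITE(pmpaddr1, addr1);
    CSR_WRITE(pmpcfg0, cfg0);

    /* One global fence covers the whole series of PMP writes. */
    SFENCE_VMA_ALL();
}
```

The point of the sketch is that the cost of the fence is paid once per batch rather than once per CSR write, which is the redundancy argued against above.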
Lastly, note that this change is backward compatible, in that software that does an sfence.vma after PMP changes will run on "old" designs that support page-based virtual memory yet access PMAs and PMPs inline with load/store execution while in M-mode.
Any objections to this accommodation for high-performance CPU designs?