[RISC-V] [tech-tee] [RISC-V] [tech-privileged] comments on PMP enhancements


mick@...
 

Hello Tariq,

On 2/14/20 1:39 PM, Tariq Kurd wrote:
The threat model Tariq brought up was about detecting a glitch attack but the glitch can also happen when setting
a rule in the first place, I don't see how this is the proper approach, PMP is not there as an anti-tampering
mechanism. If we want this as a debug feature I'm ok with it but it must be treated as such and not as a security
improvement.

It's not just for glitch protection, but also for security to control access permissions.
The software is loaded from the boot ROM and the boot ROM does not contain software to unlock the PMP entries so we are not (obviously) susceptible to a code-reuse ROP attack to reprogram the PMP, because the software to do so doesn't exist in the existing code.
If we need to lock regions to prevent such things from other software later then we need the flexibility to be able to lock the regions individually on a case by case basis.

Put simply we need orthogonal controls for locking and M-mode only access.
You don't need to do a ROP attack on BootROM to remove the rules; any
code that runs in M-mode can remove them in a few instructions. I don't
see how we can have a security control on M mode without also locking
the rules to prevent further modifications by the next boot
stages. If the next boot stage runs in S/U mode then that's not a
problem, since S/U-mode-only rules are removable.

When defining security controls there are guarantees we put in place.
Let's say that we did everything correctly, both in hw and sw, up to the
point where we place the PMP rule there to prevent "other software"
running on M-mode from accessing some memory region. Can we guarantee
that this restriction will be enforced and that "other software" can't
simply bypass it? I don't believe we can provide such a guarantee,
hence we can't claim this approach to be a security control. If the code
we are trying to restrict has direct control over the security control,
then it's not a security control, it's an illusion of a security control.
On the contrary, it may pose a security threat.

In contrast, a locked PMP rule guarantees that "other software" in M mode
can't remove it. The same goes for rules placed by M mode to restrict
S/U mode, since S/U mode can't modify PMP entries, and for MMU
restrictions implemented in S mode, since U mode can't modify the page
tables.
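
To make the difference concrete, here is a toy software model of a single PMP configuration byte. It is only a sketch: the class and method names are made up for illustration, it models nothing but the lock bit, and it is not real CSR access code. It assumes the behaviour of the current PMP spec, where writes to a locked entry are ignored until reset.

```python
# Toy model of one pmpcfg byte; names are hypothetical, not a real API.
PMP_R = 1 << 0  # read permission
PMP_W = 1 << 1  # write permission
PMP_X = 1 << 2  # execute permission
PMP_L = 1 << 7  # lock bit


class PmpEntry:
    """Software model of a single PMP entry's configuration byte."""

    def __init__(self, cfg=0):
        self.cfg = cfg & 0xFF

    def write_cfg(self, value):
        """Model an M-mode write to the entry.

        Returns True if the write took effect, False if it was ignored
        because the entry is locked.
        """
        if self.cfg & PMP_L:
            return False  # locked: the write is silently dropped
        self.cfg = value & 0xFF
        return True


unlocked = PmpEntry(PMP_R)        # rule without the lock bit
locked = PmpEntry(PMP_R | PMP_L)  # same rule, locked

removed_unlocked = unlocked.write_cfg(0)  # any M-mode code can clear it
removed_locked = locked.write_cfg(0)      # ignored until reset
```

In this model the unlocked rule disappears with a single write, which is the "few instructions" point above, while the locked rule survives.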

Your DMC proposal, for example, for restricting M mode from accessing any
region not covered by a PMP rule is a security control; it's a
stricter version of MML, where M mode is only restricted from executing
such a region. I'm more positive towards adding that than allowing
unlocked M-mode-only PMP rules, although the same can be achieved with a
locked M-mode-only rule, or with a locked PMA rule (that's even better
since it's valid for all harts).

There are multiple use cases for the PMP, some we know about but many we don't and others won't have been thought of yet - so the proposal needs to be flexible. Nick has a use case in mind which is perfectly valid with MML but unfortunately it doesn't cover our use case, and it also causes problems for Nvidia as Joe told us during the meeting on Tuesday.
We had the same discussion with Joe some time ago; they wanted to
restrict their BootROM, if I remember correctly, so that it becomes
inaccessible after it jumps to the next boot stage. We concluded that
since this is not a per-hart protection, it's better to handle it in the
PMA checker instead of PMP (and I think there was a proposal for a task
group on programmable PMAs where such solutions can be discussed). In
any case their scenario is covered by the current proposal, unless there
is a different scenario I'm not aware of.

If we can come up with specific use cases not covered by the current
proposal, and valid threat models for them, I'm all for proposing new
security controls. What I'm trying to prevent is allowing weak security
controls in the name of flexibility.

As we discussed on our previous call, I'm not against such a feature. As
Andy pointed out, it is a useful debug feature and I can understand why
you may want to use it. My objections are:

a) It should be disabled by default.
b) It should be noted that this is not a security control but a debug
feature.
c) It should be noted that this needs to be disabled after sw is done
using it (e.g. during the boot sequence), and remains disabled until
hard-reset (sticky bit).
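
The three points above amount to a small state machine. The sketch below is only an illustration of that behaviour; the class, its fields, and its method names are hypothetical and do not correspond to any register defined in the proposal.

```python
class StickyDebugFeature:
    """Hypothetical model of a debug feature guarded by a sticky disable bit."""

    def __init__(self):
        self.hard_reset()

    def hard_reset(self):
        """Only a hard reset clears the sticky state."""
        self.enabled = False  # (a) disabled by default
        self.sealed = False   # sticky bit: once set, no more enabling

    def enable(self):
        """Try to turn the feature on; refused once it has been sealed."""
        if self.sealed:
            return False
        self.enabled = True
        return True

    def disable_until_reset(self):
        """(c) Turn the feature off and keep it off until hard reset."""
        self.enabled = False
        self.sealed = True
```

Boot software would call `enable()` while it needs the feature and `disable_until_reset()` before handing over to the next stage, after which later code cannot turn it back on.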

Regards,
Nick


Mr Tariq Kurd <tariq.kurd@...>
 

The threat model Tariq brought up was about detecting a glitch attack but the glitch can also happen when setting
a rule in the first place, I don't see how this is the proper approach, PMP is not there as an anti-tampering
mechanism. If we want this as a debug feature I'm ok with it but it must be treated as such and not as a security
improvement.

It's not just for glitch protection, but also for security to control access permissions.
The software is loaded from the boot ROM and the boot ROM does not contain software to unlock the PMP entries so we are not (obviously) susceptible to a code-reuse ROP attack to reprogram the PMP, because the software to do so doesn't exist in the existing code.
If we need to lock regions to prevent such things from other software later then we need the flexibility to be able to lock the regions individually on a case by case basis.

Put simply we need orthogonal controls for locking and M-mode only access.

There are multiple use cases for the PMP, some we know about but many we don't and others won't have been thought of yet - so the proposal needs to be flexible. Nick has a use case in mind which is perfectly valid with MML but unfortunately it doesn't cover our use case, and it also causes problems for Nvidia as Joe told us during the meeting on Tuesday.

Given that Andrew Waterman has dropped his objection to using an extra bit from the PMPCFG field we should allocate one.

So how about PMPCFG[5] = M, which can be read-only-zero to exactly match the current PMP spec.

If MML=0, M=0 for any PMP entry, then the entry operates exactly as the current PMP spec.
If MML=0, M=1 for any PMP entry, then the RWX permissions are applied to all modes equally (see the attachment)
If MML=1, M=? for any PMP entry, then the RWX bits operate as they do for the MML proposal, except that the truth table has M,R,W,X as inputs instead of L,R,W,X (see the attachment)

For example, in the MML proposal:
- LRWX=1001 means "M-mode locked execute only region" - and it's locked
In my modified proposal:
- MRWX=1001 means "M-mode execute only region", and the L bit states whether it is locked.
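
As a rough illustration of the encoding described above, the sketch below decodes one 8-bit PMP configuration entry with the suggested M bit at position 5. The R, W, X, A and L positions follow the current PMP spec; the M bit and the function names are assumptions for illustration only. Only the two MML=0 rules stated in this message are modelled; the MML=1 behaviour depends on the attached truth table and is deliberately left out, as is address matching.

```python
# R/W/X/A/L positions are from the current PMP spec; M at bit 5 is only the
# suggestion made in this message, not part of any ratified spec.
PMP_R = 1 << 0
PMP_W = 1 << 1
PMP_X = 1 << 2
PMP_A_SHIFT = 3   # 2-bit address-matching field
PMP_M = 1 << 5    # hypothetical M bit
PMP_L = 1 << 7


def decode_pmpcfg(cfg):
    """Split one configuration byte into its fields."""
    return {
        "R": bool(cfg & PMP_R),
        "W": bool(cfg & PMP_W),
        "X": bool(cfg & PMP_X),
        "A": (cfg >> PMP_A_SHIFT) & 0b11,
        "M": bool(cfg & PMP_M),
        "L": bool(cfg & PMP_L),
    }


def perms_mml_clear(cfg, mode):
    """(R, W, X) for a matching entry when MML=0; mode is "M", "S" or "U".

    M=0: current spec, so M mode is only restricted when L is set.
    M=1: RWX apply to all modes equally.
    """
    f = decode_pmpcfg(cfg)
    rwx = (f["R"], f["W"], f["X"])
    if mode == "M" and not (f["M"] or f["L"]):
        return (True, True, True)
    return rwx


# MRWX=1001 with L clear: M set, execute-only, not locked.
example = decode_pmpcfg(PMP_M | PMP_X)
```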

I've attached the truth table in a word document - it's not a full proposal, just showing how to program the permissions.

I think this is a good simplification also for when MML=0 as it gives more flexibility than the current proposal.

So in summary this updated proposal gives us
1. M mode has all permissions and S/U modes have restricted permissions (unlocked) - M=0, MML=0, L=0
2. M mode and S/U modes all have the same permissions (locked) - M=0, MML=0, L=1
3. M mode and S/U modes all have the same permissions (unlocked) - M=1, MML=0, L=0
4. S/U mode permissions with no M-mode access (unlocked) - M=0, MML=1, L=0
5. S/U mode permissions with no M-mode access (locked) - M=0, MML=1, L=1
6. Shared regions (unlocked) - M=0, MML=1, L=0
7. Shared regions (locked) - M=0, MML=1, L=1

The existing standard gives us 1 and 2; the other modes are all new.

Tariq

-----Original Message-----
From: tech-tee@... [mailto:tech-tee@...] On Behalf Of Nick Kossifidis
Sent: 14 February 2020 02:40
To: John Hauser <jh.riscv@...>
Cc: Nick Kossifidis <mick@...>; tech-tee@...; tech-privileged@...
Subject: Re: [RISC-V] [tech-tee] [RISC-V] [tech-privileged] comments on PMP enhancements

On 2020-02-13 22:30, John Hauser wrote:
Nick Kossifidis wrote:
The new mechanism (when MML is set) introduces a barrier between S/U
mode and M mode. We want to be able to distinguish between an access
fault due to crossing that barrier, from other access faults. In
other words if M mode gets an access fault on its own memory we'll
get an access fault as in the current spec, if it gets an access
fault on memory that's marked for S/U use (see truth table) we'll get
a security exception. The reason is that we may want to handle this
differently in sw and it also helps in debugging.
I'm sorry to say, providing information to a debugger is not usually
considered a valid reason for additional RISC-V exception codes when
the same information can be extracted from elsewhere. If it were,
RISC-V would have dozens more exception codes than it does. A
debugger is assumed to be able to examine the PMP table itself, if
necessary, to learn more about the cause of a fault.

Your reason that "we may want to handle this differently in software"
is no more specific than before. I see your desire to be as helpful
as possible to software (and debugging), but please understand, this
sort of "might be useful" argument for additional exception codes has
already been rejected many times before. To make a better case, you
need at a minimum a compelling example that requires different
handling, and probably one where speed matters (so the software can't
just examine the PMP table to separate the cases itself).
S/U mode doesn't have access to PMP registers, so it's not possible to distinguish between an access fault due to e.g. a bug in an application / driver, an access fault due to M mode trying to access one of the S/U-mode-only regions, and one due to an application / driver trying to access an M-mode-only region. It may be possible to recover from such a bug, e.g. by restarting the application, but handling / recovering from such a security violation is a different thing and usually involves different reporting and possibly running the system in a kind of "safe mode" with certain features disabled.

mstatus.MXR is not related to PMP, it's related to virtual memory
permissions and is outside PMP's scope, the scenario you mention
involves using mstatus.MPRV to access the region with S/U privileges
(and virtual memory in place). That's still possible because the
access in this case happens as S/U mode (not as M mode) and so the
S/U mode PMP rules apply.
The case that needs to be dealt with is an S/U-mode-only region that
is execute-only, without read permission. (Please see my correction
in another message.) In such a case, M mode has the authority to
temporarily reprogram the PMP entry to grant read permission to S/U
mode, then perform a read with MPRV = 1, and lastly restore the PMP
entry to execute-only. If address translation is active, this
actually requires M mode first walk the page tables to translate the
virtual address into a physical address before searching the PMP table
to find the relevant PMP entry. But there's no reason for us to make
software go through all this trouble; we should just have MXR = 1
grant read permission to S/U level while executing in M mode. (Yes,
that sounds contradictory, but remember it's for when MPRV = 1.)
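
The two options described above can be contrasted in a small model. This is only a sketch under stated assumptions: the function names are hypothetical, the PMP entry is a plain dict holding a configuration byte, and the `mxr` flag models the suggested behaviour (MXR also applying to PMP permissions), which is not part of the current spec.

```python
PMP_R = 1 << 0  # read permission
PMP_X = 1 << 2  # execute permission


def read_allowed(cfg, mxr=False):
    """Read check for an S/U-privileged access (M mode with MPRV=1).

    mxr=True models the suggestion that MXR makes an executable
    PMP region readable as well.
    """
    return bool(cfg & PMP_R) or (mxr and bool(cfg & PMP_X))


def read_with_workaround(entry):
    """The three-step software workaround: grant R, read, restore."""
    saved = entry["cfg"]
    entry["cfg"] = saved | PMP_R          # 1. temporarily grant read
    ok = read_allowed(entry["cfg"])       # 2. perform the read with MPRV=1
    entry["cfg"] = saved                  # 3. restore execute-only
    return ok


exec_only = {"cfg": PMP_X}
plain_read = read_allowed(exec_only["cfg"])            # faults today
mxr_read = read_allowed(exec_only["cfg"], mxr=True)    # suggested shortcut
workaround_read = read_with_workaround(exec_only)      # current workaround
```

The model leaves out the page-table walk and the PMP table search that the workaround needs when address translation is active, which is where most of the software effort would go.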
So the idea is to have MXR also work for S/U-mode-only PMP regions? I see how that would help on a system without an MMU, and it would also be more consistent this way, but on a system with an MMU I don't see why it makes sense to use PMP to mark a region as execute-only. It makes more sense to mark the region as R/W/X for S/U mode and leave it to the OS running in S mode to add further restrictions through the MMU, in which case MXR will work as-is.

I agree with this approach as long as we make sure that MXR can only be set to 1 when running in M-mode; in the current spec it's allowed to be set regardless of privilege mode, as with MPRV. I remember there was a discussion regarding MPRV to mandate it can only be set when in M-mode, but I just checked the latest draft and it hasn't changed. Leaving MPRV set will most probably cause the sw to crash, but if MXR is left set outside M-mode we basically remove the protection on any execute-only memory region, especially if this also works for PMP regions.

Keep in mind that this proposal is
meant to solve a specific problem related to a specific threat model;
it's not about changing PMP in general to do all sorts of stuff.
Before we have something else I'd appreciate a threat model and a
problem description.
I believe we need to widen the scope of this proposal to cover other
cases. Sticking to the narrower scope you prefer would be fine except
for one thing: We know that handling these other cases is going to
also involve PMP, so there's an overlap there. If we don't try to
address all the demands on PMP together, we will end up with a
layering of modifications that, as Greg Favor has said, are not likely
to fit together as well.
I just want to be sure that we are after specific issues and, in the case of security controls, have specific threat models in mind. Coming up with solutions without having discussed the problem first and proposing security controls without a threat model won't work.

Locking down mtvec may also be important, but since it doesn't involve
PMP, such other security features can be defined independently, as you
propose.

P.S. U-boot usually knows the executable regions of the kernel, first
because it needs to jump there, second because it's the one that put
the kernel there (and/or unpacked it). Unless we are talking about a
kernel that self-extracts or relocates itself, u-boot can set MML
before jumping to the kernel if needed (and there are no modules to
load).
As you put it yourself, "U-Boot usually knows the executable regions
of the kernel", except when it doesn't, because the kernel
self-extracts, or relocates itself, or has loadable modules. And yes,
if desired, an OS's loadable modules might be signed; I don't see why
not. I think we should want to cover as many use cases as we
reasonably can, as best as we can.
Still, MML can always be set after decompression / relocation / loading modules. I don't see why it's an issue, and I don't see what security is gained by allowing temporary (removable) M-mode-only regions during boot or in general, where anything running there can remove them in a few instructions. The threat model Tariq brought up was about detecting a glitch attack, but the glitch can also happen when setting a rule in the first place. I don't see how this is the proper approach; PMP is not there as an anti-tampering mechanism. If we want this as a debug feature I'm ok with it, but it must be treated as such and not as a security improvement. The same goes for the ability to let M-mode still execute any region not covered by PMP, or being able to register new executable regions for "greater flexibility". We first need to decide if there are valid use cases / threat models that need to be addressed and then talk about possible modifications to the proposal.

Regards,
Nick