Huawei review of different PMP enhancement schemes
Mr Tariq Kurd <tariq.kurd@...>
Hi everyone,
We have spent a considerable amount of time reviewing the different proposals and have come to some conclusions.
1. The PMP enhancement proposal can meet our needs with the following modifications:
- DMC (default memory closed)
- DPL (delay PMP locking)
- Shared executable regions
--> This feature is missing from all the proposals except for the separate M&L proposal.
I've attached another modified version of the PMP enhancement proposal, this time including shared X/X and RX/X regions in the reserved programming encodings (and changed the name of the document to include “shared X”).
We want this to save code size, so we can share (e.g.) the C runtime library between the OS and application code.
Allen - I think that the "any region locked" proposal doesn't work for us, because we want to be able to remove access to any unmapped memory without locking a region. Sorry, it’s a nice idea.
Joe - The suggestion of running code in U-mode during the boot process to avoid locking regions when MML=1 is interesting, and certainly possible. However it's a lot more work to set up a handler and system calls back to the handler - it's like a light OS just for the boot process. DPL to delay the locking is a much simpler solution.
2. John Hauser's proposal is better than (1)
- the programming model is simpler - it's hard to get my head around the state changes when setting MML=1, but it's easier to follow John's
- the modes cover what we need to do
--> Shared executable regions are still missing - I think we also need to use the W and WX permissions as the PMP enhancement proposal does (it’s not clear if these are reserved or not – they are reserved in the spreadsheet but not in the document).
When MSL = 1, 2 or 3 and RWX=W -> shared region with X permission
When MSL = 1, 2 or 3 and RWX=WX -> shared region with M-mode RX and S/U-mode X permission
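The two encodings above can be written as a small decode function. This is only a sketch of the mapping as stated in this email: the function name `shared_x_perms` and the tuple return format are made up for illustration, and every other RWX combination is left unmodelled (returned as `None`) rather than guessed.

```python
from typing import Optional, Tuple


def shared_x_perms(msl: int, r: int, w: int, x: int) -> Optional[Tuple[str, str]]:
    """Return (M-mode perms, S/U-mode perms) for the two proposed shared encodings.

    Hypothetical helper: any combination not listed in the email returns None.
    """
    if msl not in (1, 2, 3):
        return None  # MSL = 0 is not covered by the proposed shared encodings
    if (r, w, x) == (0, 1, 0):  # RWX = W: shared region with X permission
        return ("X", "X")
    if (r, w, x) == (0, 1, 1):  # RWX = WX: M-mode RX, S/U-mode X
        return ("RX", "X")
    return None  # all other encodings are outside this sketch


if __name__ == "__main__":
    for msl in (0, 1, 2, 3):
        print(msl, shared_x_perms(msl, 0, 1, 0), shared_x_perms(msl, 0, 1, 1))
```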
This approach gives us the choice of whether to lock the regions or not, which is good although we think that locked shared X regions are probably sufficient.
3. Separate M&L doesn't offer many benefits
- it was a simple proposal to see what would happen by separating out the bits.
- it naturally has shared executable regions
In summary
- PMP enhancement + DPL + DMC + shared X is ok for us
- John’s 4-level scheme + shared X is better for us
Tariq
PS if we do develop John's proposal further then we think there are some corner cases to resolve
- we think that the programming model is a bit strange because some of the configuration bits become read-only at different times, which is not good for software which will try to program a configuration and then quietly get a different configuration

The two obvious examples are
- trying to program PL=1 or 3 when MSL=0
- trying to program an X region when MSL=3
I think in these cases PL[0] is ignored and the X bit is ignored respectively when writing the configuration. We would prefer to have an exception to say that the software programming the PMP is clearly wrong, to give us a chance to debug it.
-----Original Message-----
From: tech-tee@... [mailto:tech-tee@...] On Behalf Of John Hauser
Sent: 22 February 2020 01:19
To: tech-tee@...; tech-privileged@...
Subject: Re: [RISC-V] [tech-tee] comments on PMP enhancements
Hi Allen,
The point I was trying to make about locked PMP entries, but failed to communicate before, is this:
When a system starts up after reset, PMP is enforced according to a certain set of rules. For two of the proposals, the initial rules (MML = 0 or MSL = 0) are the same as currently standardized. For Mr. Kurd's separate M & L proposal, when MML = 0, entries created with M = 0 follow the currently standardized rules as they should.
After reset, an early stage of boot software might create a locked PMP entry under the initial PMP rules, intending to restrict M-mode's access to a certain region of memory. The early boot stage might be oblivious to the PMP enhancements or (perhaps more likely) it might want to be compatible with later boot stages that may or may not be aware of the PMP enhancements.
If a later stage then raises the level of PMP security by setting MML = 1 or MSL > 0, existing PMP entries are now enforced according to a new set of rules. For two of the proposals, the new rule for an existing locked entry is that S/U mode has no access and M mode has restricted access. But for Mr. Kurd's separate M & L proposal, the new rule is that S/U mode gets restricted access and M mode gets _no_ access. This is what I find questionable. The original intention under the initial rules was almost surely to give M mode restricted access, which has now been revoked. And the worst part is, although M mode itself no longer has access, it cannot deny access to S/U mode, because the entry is locked.
Here's what happens, step by step:
1. Machine starts from reset.
2. Early boot stage creates a locked PMP entry under initial PMP rules, intending to restrict M mode's access to memory.
3. Subsequent boot stage enables PMP enhancement, by setting MML = 1 (or MSL > 0).
Under the separate M & L proposal, at this point M mode has _no_ access to the memory covered by the PMP entry created in step 2, but S/U mode _does_ have access. And this situation is unchangeable because the entry is locked.
Under the task group's working proposal or my four-security-level proposal, M mode still has restricted access. It is S/U mode that loses access. I argue that this is better, and the separate M & L proposal is "wrong" on this point.
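The three steps and their outcome can be tabulated in a few lines of Python. This is a sketch that encodes only what this email states; the function name, the proposal labels, and the "restricted"/"none" strings are invented for illustration, and the locked entry is assumed to have been created with M = 0 under the separate M & L proposal, as an early boot stage unaware of the enhancements would do.

```python
def locked_entry_access(proposal: str, enhanced: bool) -> dict:
    """Access to memory covered by the locked entry created in step 2.

    "restricted" means access limited to the entry's own R/W/X bits;
    "none" means no access. Labels are hypothetical, not from any spec.
    """
    if not enhanced:
        # Initial rules (MML = 0 or MSL = 0): a locked entry applies to
        # M mode as well as to S/U mode.
        return {"M": "restricted", "S/U": "restricted"}
    if proposal in ("working", "four-level"):
        # After step 3: M mode keeps restricted access, S/U mode loses access.
        return {"M": "restricted", "S/U": "none"}
    if proposal == "separate-M&L":
        # After step 3: M mode has no access, S/U mode still has access,
        # and the entry cannot be changed because it is locked.
        return {"M": "none", "S/U": "restricted"}
    raise ValueError(f"unknown proposal: {proposal}")


if __name__ == "__main__":
    for name in ("working", "four-level", "separate-M&L"):
        print(name, locked_entry_access(name, False), locked_entry_access(name, True))
```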
On a different topic, I wrote:
> Mr. Kurd's proposal does not similarly lock down all of M mode's
> executable regions against modification nor prevent the creation of
> new PMP entries for executable regions. That is the shortcoming to
> which I was referring.

You:
> Well, that would be a pretty simple modification of his proposal, then:
> disallow creation of new Mode X regions when MML=1.
> If that modification was made, what drawbacks remain?
Preventing the creation of new regions with execute permission is only one piece of the lockdown of M-mode executable regions. The other is to lock all existing PMP entries that give M mode execute permission, so their address ranges can't be modified. The task group's working proposal does that by locking all M-mode-only PMP entries, something I replicated in my topmost security level, MSL = 3. Currently, the separate M & L proposal doesn't do this piece either.
Here's what I wrote before about both shortcomings I identified in the separate M & L proposal:
> I believe both of these flaws can be fixed, but only at the expense of
> the simple separation of the L and M bits. In fact, you start to get
> a design that looks more like my proposal.
And that's what I want to emphasize: Why patch a proposal with several tweaks when, as I see it, there is another, cleaner proposal on hand that already covers all the same needs?
You:
> I'd still like to simplify either proposal by removing DMC and
> replacing it with "any entry locked".
> Once you've locked a region, there is no need to access unmapped
> regions, since you can do that from the locked region
That's conceivable, but has the disadvantage of sometimes requiring one or more additional PMP entries. Thus, your desire to save one flip-flop and a few gates for DMC could sometimes cost 40 or 50 flip-flops and corresponding logic to have at least one more PMP entry. My own assessment is that, in any RISC-V core that implements standard PMP, that one flip-flop and small bit of logic is not worth fretting over.
Regards,
- John Hauser
John Hauser
Tariq Kurd wrote:
> 2. John Hauser's proposal is better than (1)

Hi Tariq,

I meant for the encodings with W = 1 and R = 0 to continue to be reserved, as the spreadsheet indicates, but you're right that I forgot to say so in my document.

I have a different suggestion for adding support for shared executable regions. I've attached a new spreadsheet file with a new tab showing my upgraded proposal, version 0.3. You can easily see the changes by switching between the tabs labeled "4level 0.2" and "4level 0.3".

In my newest version, PMP entries with PL = 0, W = 0, and X = 1 (executable but not writable in S/U mode) are now readable/executable in M mode. Previously all entries with PL = 0 were readable/writable for M mode.

When MSL = 3, all PMP entries that give execute permission to M mode are locked. And new entries that would give execute permission to M mode cannot be created when MSL = 3, the same as before.

I haven't updated my document yet, but I thought we could debate my version 0.3 just from the spreadsheet. I look forward to all feedback.

- John Hauser
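The change from version 0.2 to 0.3 for PL = 0 entries can be sketched as follows. This models only the two sentences in the email above, not the attached spreadsheet, which remains the authoritative description. The function names are invented, and treating S/U-mode write permission as simply the entry's W bit is an assumption of this sketch.

```python
from itertools import product


def m_mode_perms_pl0(version: str, w: int, x: int) -> str:
    """M-mode permissions for a PL = 0 entry, per the email's description.

    Hypothetical helper: version is "0.2" or "0.3"; w and x are the entry's
    W and X bits (assumed here to be the S/U-mode permissions).
    """
    if version == "0.3" and w == 0 and x == 1:
        return "RX"  # executable but not writable in S/U mode
    return "RW"      # 0.2 behaviour, and 0.3 for every other PL = 0 entry


def su_writable_and_m_executable(version: str) -> list:
    """List (W, X) pairs where S/U could write and M mode could execute."""
    return [
        (w, x)
        for w, x in product((0, 1), repeat=2)
        if w == 1 and "X" in m_mode_perms_pl0(version, w, x)
    ]


if __name__ == "__main__":
    for version in ("0.2", "0.3"):
        print(version, su_writable_and_m_executable(version))
```

Under this one rule alone, M mode gains execute permission only when W = 0, so the enumeration finds no single PL = 0 encoding that is both writable in S/U mode and executable in M mode. It says nothing about reprogramming an unlocked entry over time.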
|
Just as I have been asking why DMC is necessary, I have to ask why the DPL bit is necessary.

If there is code that wants to reorder PMP entries while DPL is 1, but the lock bits are set - why don't you instead simply not program any lock bits until you get to the point that you would have changed DPL from 1->0?

As the doc mentions:
> It is noted that this style of boot flow does not prevent the PMP being unlocked again by
> software, and so the security is lower than if the regions remain locked.

If you are executing code that has not been authenticated while existing entries are unlocked (or the L bit is set but hasn't taken effect) - then you have a security issue. The DPL bit doesn't fix that, therefore it seems to me that the sequence above (separate "lock everything that needs locking" phase) gives you equivalent security.

Also note that DPL is really two bits when implemented, since it has 3 states (initially 0, has been set to 1, has transitioned to 0 and is now locked).

Can someone show a sequence that has higher security with DPL compared to a sequence that sets all the lock bits at the point that DPL would have been cleared?

Ditto for DMC: can someone show a sequence (and memory map) that causes an extra entry to be required if the default memory closed is defined as "any entry is locked"?

If someone doesn't demonstrate one (that can't be easily modified to avoid the problem with equivalent security), I can't support either.

On Mon, Feb 24, 2020 at 2:09 PM John Hauser <jh.riscv@...> wrote:
Tariq Kurd wrote:
|
mick@...
Some thoughts on the various proposals on the spreadsheet (v0.3):
M&L proposal:
The purpose of M bit is not clear, I get that the idea is to be able to mark a rule that applies to M mode without having that rule also locked, but for example the combination L,M = 1,0 when MML = 0 doesn't follow that principle, it marks a rule as locked and also as enforced on M-mode even though M = 0. When MML = 1 we get unlocked M-mode-only regions when L,M = 0,1 but we also get locked S/U-only-regions when L,M = 1,0 which doesn't make much sense (I think John also brought this up).

4level0.2:
With MSL=0 we get the current PMP behavior and with MSL=3 we get mostly the same behavior as with MML=1 on the group's proposal, only with one extra bit being used on pmpcfg and a redundant encoding (PL=2 and PL=3 are the same thing). Also it's possible to have a shared region that's executable by S/U mode and RW by M mode, which is not possible with the group's proposal as is.

With MSL=2 we get rid of the restriction of not being able to add new executable M-mode-only regions, however that can be achieved by using non-locked M-mode-only regions that are also available on MSL=2 (with PL=3) since there is no such restriction defined for them. In other words non-locked M-mode-only regions allow for this restriction to be bypassed anyway.

With MSL=1 we get rid of the restriction of not allowing M-mode to execute a region without a matching rule. However both locked and non-locked M-mode-only regions allow for this restriction to be bypassed on MSL=2 anyway since M-mode can just add such a rule and execute the region, it's even worse with non-locked rules since afterwards M-mode can also remove the rule and no one will ever know it happened. So to me MSL=1 is redundant, I don't see any use for it. It's also obviously redundant when DMC=1 but I'll come back to DMC later on.

So basically the extras we get are:
a) It's possible to have a region that's executable by S/U and RW by M-mode for MSL > 0
b) It's possible to have removable M-mode-only rules when 0 < MSL < 3

4level0.3:
This is dangerous! With this revision it's possible to have a region that's rw by S/U mode and executable by M mode when PL=0, which allows for an attacker to perform the attack described on the group's proposal and is exactly what we are trying to prevent. This is possible on all security levels by the way, even with MSL=3. It's also more complicated since PL=0 on MSL=3 encodes both locked and non-locked rules. Finally when MSL=3 and PL=3 we get removable M-mode-only, non-executable regions, at the highest security level. In terms of security it's a regression over revision 0.2, not an improvement.

Regarding DMC:
As shown above, restricting M-mode from executing memory regions without a matching rule, only makes sense if it's not possible to add such a rule (that allows execution). If it's possible to add a rule that applies to M-mode then any restrictions regarding regions without a matching rule, are a few instructions away from being bypassed. Same applies when restricting r/w/x on M-mode with the DMC bit. In both proposals DMC can be easily bypassed. Even if we incorporate DMC on the group's proposal we'll still be able to add a rule that gives r/w privileges on M-mode, although this rule will be a locked one so it'll at least be possible to detect this event.

However DMC to me is orthogonal to the various scenarios we discuss and given that it's possible to reset the hart with a pre-defined set of PMP rules, it makes sense to have such a mechanism. That's why my initial reaction to Tariq's proposal regarding DMC, was to propose to him to submit a separate proposal for this.

What we discussed on this week's TEE TG call:
a) Incorporate mseccfg.DMC to the group's proposal. It'll be a sticky bit so when it gets set it can only be unset through hard-reset.
b) Allow for M-mode-only rules to be removable temporarily for debugging / flexibility purposes during boot (since this approach weakens PMP it can't be defined as a security feature), with a big disclaimer/warning in place, through the proposed DPL bit on mseccfg. This is also going to be defined as an optional feature.
c) Add another bit for locking DPL, it'll only be possible to lock DPL to 0 (disabled).
d) Use the remaining 2 encodings L=1,R=0,W=1,X=0 and L=1,R=0,W=1,X=1 when MML=1 to define a locked shared region that's executable by both M and S/U mode but not writable by anyone (when X is set it's also readable by M-mode), as Tariq proposed. The use case for this is to share code between M-mode and S/U-mode, e.g. to support vendor-specific extensions with custom assembly, without having to go through an ecall (similar to Linux's VDSO).
e) Get rid of the security exception, use normal access faults instead. S/U mode can use SBI to request more info from M-mode if needed (since S/U can't access PMP registers to figure it out).

Regards,
Nick
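Items (a) to (c) from the call describe write behaviour that a toy model makes easier to follow. The sketch below is illustrative only: the class name, the field names, the reset values of 0, and the order in which a single write applies DPL before the DPL lock are all assumptions, not taken from any proposal document.

```python
class MsecCfgModel:
    """Toy model of the DMC, DPL and DPL-lock bits as summarized above."""

    def __init__(self):
        self.hard_reset()

    def hard_reset(self):
        # Assumed reset values; the thread notes a hart may instead reset
        # with a pre-defined configuration.
        self.dmc = 0
        self.dpl = 0
        self.dpl_locked = False

    def write(self, dmc=None, dpl=None, lock_dpl=False):
        # (a) DMC is sticky: once set, only hard_reset() clears it.
        if dmc:
            self.dmc = 1
        # (b) DPL can be changed freely until it is locked.
        if dpl is not None and not self.dpl_locked:
            self.dpl = 1 if dpl else 0
        # (c) The lock only takes effect when DPL is 0 (disabled).
        if lock_dpl and self.dpl == 0:
            self.dpl_locked = True


if __name__ == "__main__":
    cfg = MsecCfgModel()
    cfg.write(dmc=1, dpl=1)
    cfg.write(dmc=0)                    # ignored: DMC stays set
    cfg.write(lock_dpl=True)            # ignored: DPL is still 1
    cfg.write(dpl=0, lock_dpl=True)     # DPL cleared, then locked at 0
    cfg.write(dpl=1)                    # ignored: DPL is locked
    print(cfg.dmc, cfg.dpl, cfg.dpl_locked)
```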
|
John Hauser
Nick Kossifidis wrote:
> 4level0.3:

I agree that would be dangerous, but I intentionally excluded that possibility, so I don't understand. What is the exact encoding that you think allows this, when MSL > 0?

> Finally

That detail could easily be changed, if that's the only remaining complaint about the security.

- John Hauser
|
Jonathan Behrens <behrensj@...>
John Hauser wrote:
Nick Kossifidis wrote:

I don't understand how having extra bit patterns for the PMP config registers compromises security. Isn't it pretty much a given that the values loaded into the PMP address registers and PMP config registers (and all other security relevant CSRs: mtvec, satp, mideleg, etc.) must be correct? If having a "M-mode-only, non-executable region" doesn't match your security goals, then don't program one?

Nick Kossifidis wrote:

The restriction still makes sense as a form of defense in depth. Plus, "a few instructions" at elevated privilege is a rather high bar. That is all it takes to escape from a JavaScript sandbox, to escalate from user mode to kernel mode, or to break out of a VM. Yet all of those isolation mechanisms provide very real security, because even the bugs they do have still leave it rather hard to execute specific desired instructions.

Jonathan
|