Re: proposal for stateen CSRs
Bill Huffman
Hello John,
Are some of the bits in xstateen allocated for custom use?

Bill
-----Original Message-----
From: tech-privileged@... <tech-privileged@...> On Behalf Of John Hauser
Sent: Tuesday, April 20, 2021 2:18 PM
To: tech-privileged@...
Subject: [RISC-V] [tech-privileged] proposal for stateen CSRs

Hello tech-privileged,

The proposal below has been discussed by some of the principle RISC-V architects for incorporation into the official Privileged Architecture. The text below makes reference to Zfinx, Sstc, a QUERY instruction, and the RISC-V Advanced Interrupt Architecture, which are other extensions being separately developed.

- John H.

----------------------------------------

The following is a proposal for an addition to the main Privileged Architecture (not a separately named extension).

--------------------
Motivation

Currently, the implementation of optional RISC-V extensions has the potential to open covert channels between separate user threads, or between separate guest OSes running under a hypervisor. The problem occurs when an extension adds processor state---usually explicit registers, but possibly other forms of state---that the main OS or hypervisor is unaware of (and hence won't context-switch) but that can be modified/written by one user thread or guest OS and perceived/examined/read by another.

Consider, for example, that the N extension is someday ratified by the RISC-V Association, and a hart implements both S mode and the N extension, with misa.N hardwired = 1. The OS in use on this hart might be oblivious to the N extension and hence might not test for the extension or pay any attention to the eight CSRs it adds to the ISA: ustatus, uie, utvec, uscratch, uepc, ucause, utval, and uip. In that case, most of these CSRs provide an obvious covert channel between user threads.

Although traditional practices might consider such a communication channel harmless, the intense focus on security today argues that a means be offered to plug such channels.

The F registers of the RISC-V floating-point extensions and the V registers of the vector extension would also be potential covert channels, except for the existence of the FS and VS fields in the sstatus register. An OS that is unaware of, say, the vector extension and its V registers will unwittingly prevent access to those registers by initializing unknown fields of sstatus to zeros, which in this case will include the VS field.

Obviously, one way to prevent the use of the N extension's CSRs as a covert channel would be to add to sstatus an "NS" field for the N extension, paralleling the V extension's VS field. However, this is not considered a general solution to the problem due to the number of potential future extensions that may add small amounts of state. Even with a 64-bit sstatus (necessitating adding sstatush for RV32), it is not certain there are enough remaining bits in sstatus to accommodate all future extensions.

In any event, there is no need to strain sstatus (and add sstatush) for this purpose. The "enable" flags that are needed to plug covert channels are not generally expected to require swapping on context switches of user threads, making them a less-than-compelling candidate for inclusion in sstatus. Hence, a new place is proposed for them instead.

--------------------
Proposal

RV64 harts that conform to the RISC-V Privileged Architecture may optionally implement four new 64-bit CSRs at machine level, listed with their CSR addresses:

    0x30C  mstateen0 (Machine State Enable 0)
    0x30D  mstateen1
    0x30E  mstateen2
    0x30F  mstateen3

If supervisor mode is implemented, another four CSRs would be defined at supervisor level:

    0x10C  sstateen0
    0x10D  sstateen1
    0x10E  sstateen2
    0x10F  sstateen3

And if the hypervisor extension is implemented, another set of CSRs is added:

    0x60C  hstateen0
    0x60D  hstateen1
    0x60E  hstateen2
    0x60F  hstateen3

If any "stateen" CSR is implemented, they must all be implemented for the respective modes.

For RV32, the registers listed above are 32-bit, and for the machine-level and hypervisor CSRs there is a corresponding set of high-half CSRs for the upper 32 bits of each register:

    0x31C  mstateen0h
    0x31D  mstateen1h
    0x31E  mstateen2h
    0x31F  mstateen3h

    0x61C  hstateen0h
    0x61D  hstateen1h
    0x61E  hstateen2h
    0x61F  hstateen3h

For the sstateen registers, high-half CSRs are not added at this time because it is expected the upper 32 bits of these registers will always be zeros, as explained later below.

The number of registers provided at each level is four because it is believed that 4 * 64 = 256 bits for machine and hypervisor levels, and 4 * 32 = 128 bits for supervisor level, will be adequate for many years to come, perhaps for as long as the RISC-V ISA is in use. The exact number four is an attempted compromise between providing too few bits on the one hand and going overboard with CSRs that will never be used on the other. A possible future doubling of the number of stateen CSRs is discussed later.

The registers at each level control access to state at all lower privilege levels, but not at its own level. This is analogous to the function of the existing counteren CSRs.

Each bit of an sstateen CSR controls user-level access (from U mode or VU mode) to an extension's state, for an extension that was not deemed "worthy" of a full XS field in sstatus like the FS and VS fields for the F and V extensions. The intention is to allocate the bits of sstateen CSRs starting at the least-significant end, bit 0, through to bit 31, and then on to the next-higher sstateen CSR.

For every bit with a defined purpose in an sstateen CSR, the same bit is defined in the matching mstateen CSR to control access below machine level to the same state.

The upper 32 bits of an mstateen CSR (or for RV32, the corresponding high-half CSR) control access to state that is inherently inaccessible to user level, so no corresponding enable bits in the supervisor-level sstateen CSR are applicable.
The intention is to allocate bits for this purpose starting at the most-significant end, bit 63, through to bit 32, and then on to the next-higher mstateen CSR. If the rate at which bits are being allocated from the least-significant end for sstateen CSRs is sufficiently low, allocation from the most-significant end of mstateen CSRs may be allowed to encroach on the lower 32 bits before jumping to the next-higher mstateen CSR. In that case, the bit positions of "encroaching" bits will remain forever read-only zeros in the matching sstateen CSRs.

With the hypervisor extension, the hstateen CSRs have identical encoding to the mstateen CSRs, except controlling accesses for a virtual machine (from VS and VU modes).

Bits in any stateen CSR that are defined to control state that a hart doesn't implement are read-only zeros for that hart. Likewise, all reserved bits not yet given a defined meaning are also read-only zeros.

For every bit in an mstateen CSR that is zero (whether read-only zero or set to zero), the same bit appears as read-only zero in the matching hstateen and sstateen CSRs. For every bit in an hstateen CSR that is zero (whether read-only zero or set to zero), the same bit appears as read-only zero in sstateen when accessed from a virtual machine.

On reset, all mstateen bits are initialized by the hardware to zeros. If machine-level software changes these values, it is responsible for initializing the matching hstateen and sstateen CSRs to zeros too.

Software at each privilege level should set its respective stateen CSRs to indicate the state it is prepared to allow lower-privilege software to access. For OSes and hypervisors, this usually means the state that the OS or hypervisor is prepared to swap on a context switch, or to manage in some other way.

Implementing the stateen CSRs is optional for harts (though platform standards can always make them mandatory).
When the stateen CSRs are not implemented, all state added by an extension is accessible as defined by that extension.

For each mstateen and hstateen CSR, bit 63 is defined to control access to the matching supervisor-level sstateen CSR. That is, bit 63 of mstateen0 and hstateen0 controls access to sstateen0; bit 63 of mstateen1 and hstateen1 controls access to sstateen1; etc.

A hypervisor may need this control over accesses to the sstateen CSRs if it ever must emulate for a virtual machine an extension that is supposed to be affected by a bit in an sstateen CSR. (Even if such emulation is uncommon, it shouldn't be excluded.) Machine-level software needs identical control to be able to emulate the hypervisor extension. (That is, machine level needs control over accesses to the supervisor-level sstateen CSRs in order to emulate the hstateen CSRs, which have such control.)

If the hypervisor extension is not implemented and a supervisor-level sstateen CSR is all read-only zeros, an implementation may make bit 63 of the matching mstateen read-only zero. In that case, machine-level software should preferably emulate attempts to access the affected sstateen CSR from S mode, ignoring writes and returning zero for reads.

Initially, the following bits are proposed to be defined in sstateen0, mstateen0, and hstateen0:

    bit 0   QUERY data
    bit 1   fcsr for Zfinx and related extensions (Zdinx, etc.)
    bit 2   Tentatively reserved for the N extension

As a special case, bit 0 is used to control access to the information returned by the optional QUERY instruction, even though this cannot act as a covert channel between user threads or guest OSes. For more, see the documentation for the QUERY instruction.

Bit 1 applies only for the case when floating-point instructions operate on X registers instead of F registers. Whenever misa.F = 1, bit 1 of mstateen0 is read-only zero (and hence read-only zero in hstateen0 and sstateen0 too).
For convenience, when the stateen CSRs are implemented and misa.F = 0, then if bit 1 of a relevant stateen0 CSR is zero, _all_ floating-point instructions cause an illegal instruction trap (or possibly a virtual instruction trap, if executed in a virtual machine), as though they all touch fcsr, regardless of whether they really do.

In addition to the bits listed above for user-accessible state, the following are also proposed initially for mstateen0 and hstateen0:

    bit 61  Reserved for the RISC-V Advanced Interrupt Architecture
    bit 62  stimecmp, vstimecmp of Sstc extension
    bit 63  sstateen0

--------------------
Usage

After the machine-level mstateen CSRs are all initialized to zeros on reset, machine-level software can set bits in these registers to enable lower-privilege access to the controlled state. This may be either because machine-level software knows how to swap the state or, more likely, because machine-level software isn't swapping supervisor-level environments. (Recall that the main reason the mstateen CSRs must exist is so machine level can emulate the hypervisor extension. When machine level isn't emulating the hypervisor extension, it is likely there will be no need to keep any mstateen bits zero.)

If machine level sets any mstateen bits to nonzero, it must initialize the matching hstateen CSRs to zeros if the hypervisor extension is implemented. And if any mstateen bits that are set to one have matching bits in the sstateen CSRs, machine-level software must initialize those sstateen CSRs to zeros. Ordinarily, machine-level software will want to set bit 63 of each mstateen CSR, necessitating that it zero all hstateen and sstateen CSRs.

An OS at supervisor level should see the sstateen CSRs initialized to zeros when the OS starts. It can set bits in these registers to enable user-level access to the controlled state, presumably because the OS knows how to context-swap the state.
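
As a rough illustration (not part of the proposal), the masking and initialization rules described above can be modeled in plain C. The struct and function names here are hypothetical, and the model assumes an implementation that retains the stored value of a lower-level CSR while the controlling bit is zero, which is one reading of why machine-level software is asked to zero hstateen and sstateen itself. Only the stateen0 registers are modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions proposed for the stateen0 CSRs (taken from the text above). */
#define STATEEN0_QUERY    (UINT64_C(1) << 0)   /* QUERY data */
#define STATEEN0_FCSR     (UINT64_C(1) << 1)   /* fcsr for Zfinx and friends */
#define STATEEN0_N        (UINT64_C(1) << 2)   /* tentatively reserved for N */
#define STATEEN0_AIA      (UINT64_C(1) << 61)  /* reserved for the AIA */
#define STATEEN0_STC      (UINT64_C(1) << 62)  /* stimecmp, vstimecmp */
#define STATEEN0_SSTATEEN (UINT64_C(1) << 63)  /* access to sstateen0 */

/* Hypothetical software model of one hart's stateen0 registers. */
struct stateen_model {
    uint64_t mstateen0;
    uint64_t hstateen0;
    uint64_t sstateen0;
};

/* hstateen0 as read by HS mode: bits that are zero in mstateen0 read as zero. */
static uint64_t hstateen0_view(const struct stateen_model *m)
{
    return m->hstateen0 & m->mstateen0;
}

/* sstateen0 as read by supervisor software: masked by mstateen0, and also by
 * hstateen0 when the access comes from a virtual machine (virt == true). */
static uint64_t sstateen0_view(const struct stateen_model *m, bool virt)
{
    uint64_t v = m->sstateen0 & m->mstateen0;
    if (virt)
        v &= m->hstateen0;
    return v;
}

/* Machine-level boot step from the Usage section: enable the requested state
 * plus bit 63, then zero the lower-level CSRs so nothing stale leaks through. */
static void m_mode_init(struct stateen_model *m, uint64_t enable)
{
    m->mstateen0 = enable | STATEEN0_SSTATEEN;
    m->hstateen0 = 0;
    m->sstateen0 = 0;
}
```

With this model, after m_mode_init(&hart, STATEEN0_FCSR), an OS that writes STATEEN0_FCSR | STATEEN0_N to sstateen0 would read back only STATEEN0_FCSR, and a guest would read zero until the hypervisor also sets the bit in hstateen0.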

A hypervisor must include the sstateen CSRs in the context it swaps for a guest OS. When it starts a new guest OS, it must ensure the sstateen CSRs are initialized to zeros.

In many cases, the various bits of the stateen CSRs will have a dual purpose as enables for the ISA extensions that introduce the controlled state.

--------------------
Possible expansion

If a need is anticipated, the set of stateen CSRs could in the future be doubled by adding these:

    0x38C  mstateen4    0x39C  mstateen4h
    0x38D  mstateen5    0x39D  mstateen5h
    0x38E  mstateen6    0x39E  mstateen6h
    0x38F  mstateen7    0x39F  mstateen7h

    0x18C  sstateen4
    0x18D  sstateen5
    0x18E  sstateen6
    0x18F  sstateen7

    0x68C  hstateen4    0x69C  hstateen4h
    0x68D  hstateen5    0x69D  hstateen5h
    0x68E  hstateen6    0x69E  hstateen6h
    0x68F  hstateen7    0x69F  hstateen7h

These additional CSRs are not a definite part of the original proposal because it is unclear whether they will ever be needed, and it is believed the rate of consumption of bits in the first group, numbers 0-3, will be slow enough that any looming shortage will be perceptible many years in advance. At the moment, it is not known even how many years it may take to exhaust just mstateen0, sstateen0, and hstateen0.
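
The address lists in the proposal follow a regular pattern: registers 0-3 sit at offset 0x0C-0x0F within each level's block, the RV32 high halves are 0x10 above that, and the possible second group (4-7) is 0x80 above the first. A small C helper, offered only as a sketch, makes the pattern explicit. The enum and function names are hypothetical, and indices 4-7 correspond to the possible future expansion, not to a definite part of the proposal.

```c
#include <stdint.h>

/* Base of each level's CSR block, as implied by the listed addresses. */
enum stateen_level {
    STATEEN_M = 0x300,  /* mstateenN */
    STATEEN_S = 0x100,  /* sstateenN */
    STATEEN_H = 0x600   /* hstateenN */
};

/* Returns the CSR address of <level>stateen<n> (or its RV32 high half when
 * high_half is nonzero), or -1 if the proposal lists no such CSR. */
static int stateen_csr_addr(enum stateen_level level, unsigned n, int high_half)
{
    if (n > 7)
        return -1;
    if (high_half && level == STATEEN_S)
        return -1;                    /* no sstateenNh CSRs are proposed */

    int addr = (int)level + 0x0C + (int)(n & 3u);
    if (n >= 4)
        addr += 0x80;                 /* possible second group, 4-7 */
    if (high_half)
        addr += 0x10;                 /* RV32 upper 32 bits */
    return addr;
}
```

For example, stateen_csr_addr(STATEEN_M, 0, 0) gives 0x30C and stateen_csr_addr(STATEEN_H, 6, 1) gives 0x69E, matching the lists above.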
|
|
Re: proposal for stateen CSRs
John Hauser
I wrote:
> The proposal below has been discussed by some of the principle RISC-V

"Principal".

- John Hauser
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Most hypervisors will always have the G-stage (stage 2) programmed for each guest/VM, irrespective of whether the VS-mode MMU is on or off. This means G-stage protection is always there for the guest/VM. When a guest/VM makes an invalid physical address access, the hypervisor catches it as a trap. Most hypervisors will stop/dump the guest/VM when it misbehaves and possibly even report the event to some cloud management software. I don't see how sPMP fits in the virtualization scenario, considering a G-stage is already available. Things will get even more complicated for nested virtualization if a hypervisor chooses to expose sPMP to the guest hypervisor.
The only utility of sPMP seems to be letting HS-mode handle invalid-access traps directly rather than having them routed through M-mode firmware.
I think it is best to just re-purpose page fault exception codes for sPMP whenever sPMP is available.
Regards, Anup
From: tech-privileged@... <tech-privileged@...>
On Behalf Of Dong Du
Yes, it's reasonable as both sPMP and paging provide memory protection that is managed by the S-mode.
Following the idea, suggestions by Bill and Jonathan to rename exception codes 12, 13, and 15 to "page fault / sPMP fault" or "S-mode access fault" do make sense.
All the best, Dong
------------------ Original ------------------
Date: Tue, Apr 20, 2021 09:06 AM
To: "Bill Huffman"<huffman@...>;
Cc: "Paolo Bonzini"<pbonzini@...>; "tech-privileged"<tech-privileged@...>;
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP
On 2021-04-20 00:37, Bill Huffman wrote:
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Dong Du
Yes, it's reasonable as both sPMP and paging provide memory protection that is managed by the S-mode.

Following the idea, suggestions by Bill and Jonathan to rename exception codes 12, 13, and 15 to "page fault / sPMP fault" or "S-mode access fault" do make sense.

All the best,
Dong

------------------ Original ------------------
Date: Tue, Apr 20, 2021 09:06 AM
To: "Bill Huffman"<huffman@...>;
Cc: "Paolo Bonzini"<pbonzini@...>; "tech-privileged"<tech-privileged@...>;
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP

On 2021-04-20 00:37, Bill Huffman wrote:
> Perhaps we should rename the sPMP to something that sounds more like
> the simplified page table idea. Maybe Memory Protection Unit. The
> fact that it resembles PMP doesn't have to appear in the name.

Good point, if we end up treating it differently than PMA/PMP it makes sense to name it differently as well.

Regards,
Nick
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Nick Kossifidis
On 2021-04-20 00:37, Bill Huffman wrote:
> Perhaps we should rename the sPMP to something that sounds more like [...]

Good point, if we end up treating it differently than PMA/PMP it makes sense to name it differently as well.

Regards,
Nick
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Nick Kossifidis
On 2021-04-20 03:31, Jonathan Behrens wrote:
> On Mon, Apr 19, 2021 at 8:21 PM Nick Kossifidis via lists.riscv.org <mick=ics.forth.gr@...> wrote:
>> On 2021-04-19 22:47, Jonathan Behrens wrote:
>>> How about sPMP is only used if satp.MODE=BARE or virtualization is [...]
>> I'd be ok with this, but I don't see what we gain from it, in the end [...]
> It for instance rules out the case where M-mode uses PMP, HS-mode uses sPMP and paging, and VS-mode also uses sPMP and paging. That would be 5 stages of protection for every access! (Which is way worse than it sounds because each level of walking a page table also has to be validated)

Ouch! Good point there, that's definitely a scenario that doesn't make sense.

Regards,
Nick
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Nick Kossifidis
On 2021-04-20 03:14, Andrew Waterman wrote:
> On Mon, Apr 19, 2021 at 5:05 PM Nick Kossifidis <mick@...> wrote:
>> On 2021-04-19 23:17, Bill Huffman wrote:
>>> Nick, [...]
>> We currently get access-fault exceptions in case anyone tries to access [...]
> That sounds like an arbitrary constraint based upon how our arch-specific Linux code is currently written. There's nothing wrong with changing our arch-specific Linux code to be sPMP-aware in the no-MMU case. That doesn't mean we need to compile in the do_page_fault routine and all of the MMU code that it depends on; we'd instead map the page-fault exceptions to a different subroutine for the no-MMU case.

Indeed, we can just use the same code path as access-faults and rename the exception as Bill suggested. My question is why not just raise access-faults from sPMP like we do for PMA/PMP, in which case we don't need to change anything in the spec / code. Why do we need to treat sPMP differently than PMP or PMA? The answer I have so far is to avoid going through M-mode and use medeleg instead. Is this enough to justify such a change?

Regards,
Nick
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Jonathan Behrens <behrensj@...>
> On 2021-04-19 22:47, Jonathan Behrens wrote: [...]

It for instance rules out the case where M-mode uses PMP, HS-mode uses sPMP and paging, and VS-mode also uses sPMP and paging. That would be 5 stages of protection for every access! (Which is way worse than it sounds because each level of walking a page table also has to be validated)

Jonathan
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Nick Kossifidis
On 2021-04-19 22:47, Jonathan Behrens wrote:
> How about sPMP is only used if satp.MODE=BARE or virtualization is enabled and hgatp.MODE=BARE? That would enable the trusted hypervisor case, while disallowing an S-mode operating system from enabling both paging and sPMP at the same time.

I'd be ok with this, but I don't see what we gain from it. In the end the hw will still use sPMP + MMU in case the hypervisor uses sPMP and the guest uses paging, because we still want sPMP to be active when the guest is running.

Regards,
Nick
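
To make the suggested rule and the objection to it concrete, here is a minimal C sketch of the predicate. The function name is hypothetical; the rule is read as "satp.MODE=BARE, or (virtualization enabled and hgatp.MODE=BARE)", which is an assumption about the intended precedence, and the sketch does not try to settle whether satp or vsatp applies while a guest is running. The Bare encoding of 0 for the MODE fields comes from the privileged spec; the value 8 used in the example is Sv39 (satp) or Sv39x4 (hgatp) on RV64.

```c
#include <stdbool.h>

#define ATP_MODE_BARE 0u  /* MODE field encoding for "Bare" in satp and hgatp */

/* Suggested rule: sPMP checks apply only when S-mode paging is off, or when
 * virtualization is enabled and G-stage translation is off. */
static bool spmp_active(unsigned satp_mode, bool virt_enabled, unsigned hgatp_mode)
{
    return satp_mode == ATP_MODE_BARE ||
           (virt_enabled && hgatp_mode == ATP_MODE_BARE);
}
```

Under this reading, spmp_active(8, true, ATP_MODE_BARE) is true: a guest running with paging on top of a hypervisor that leaves hgatp bare still has sPMP active alongside the MMU, which is the case raised in the reply above.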
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Jonathan Behrens <behrensj@...>
It seems like we could just rename exception codes 12, 13, and 15 to "page fault / sPMP fault" and be done with it. Jonathan
|
|
Re: [RISC-V] [tech-tee] The proposal of sPMP
Nick Kossifidis
On 2021-04-19 23:01, Paolo Bonzini wrote:
> On 19/04/21 21:23, Nick Kossifidis wrote:
>> A scenario we discussed at some point was a trusted hypervisor running on HS mode, with e.g. Linux and a trusted service running on VS mode. The trusted hypervisor is usually very small/simple and may not use paging, so hgatp will be set to bare and it'll fall back to PMP/ePMP as the current hypervisor spec mandates. With sPMP the hypervisor will be able to configure its own regions and also isolate Linux from the trusted service [...]
> I don't see much difference between setting up sPMP and a constant [...]

It's possible for an implementation to only support one-stage translation and hardwire hgatp to zero, in which case it makes sense for the hypervisor to have sPMP available.

Regards,
Nick
Re: [RISC-V] [tech-tee] The proposal of sPMP
Andrew Waterman
On Mon, Apr 19, 2021 at 5:05 PM Nick Kossifidis <mick@...> wrote:
On 2021-04-19 23:17, Bill Huffman wrote:

That sounds like an arbitrary constraint based upon how our arch-specific Linux code is currently written. There's nothing wrong with changing our arch-specific Linux code to be sPMP-aware in the no-MMU case. That doesn't mean we need to compile in the do_page_fault routine and all of the MMU code that it depends on; we'd instead map the page-fault exceptions to a different subroutine for the no-MMU case.

That's why we talked about using another cause value
Re: [RISC-V] [tech-tee] The proposal of sPMP
Nick Kossifidis
On 2021-04-19 23:17, Bill Huffman wrote:
Nick,

We currently get access-fault exceptions whenever anyone tries to access a region protected with PMA or PMP, regardless of privilege mode. All PMA and PMP faults from any mode are already routed to the same exception codes anyway. Those exceptions are typically handled by M-mode, and based on mstatus.MPP, firmware may forward them accordingly. That's what OpenSBI currently does -> https://github.com/riscv/opensbi/blob/master/lib/sbi/sbi_trap.c#L260, and here is Linux triggering a segfault in software upon receiving an access-fault exception -> https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/traps.c?h=fixes#n102 without going through do_page_fault. I don't see why sPMP faults should be treated any differently: we can always take an access fault from sPMP in M-mode with mstatus.MPP = S/U and forward it to S-mode to handle, just as we forward PMA or PMP faults originating from S/U-mode.

The requirement set by the proposal's authors was to use medeleg to delegate sPMP faults to S-mode directly as page faults, without going through M-mode's trap handler as PMA/PMP access faults do. Without MMU support in the S-mode OS this won't work; on Linux, for example, do_page_fault is only available when MMU support is enabled -> https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/mm/Makefile?h=fixes#n13, https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/entry.S?h=fixes#n449. So to be clear, treating sPMP faults as page faults requires the OS to support the MMU and expect page faults. It's only possible when both sPMP and MMU are used at the same time, and the scenario I mentioned previously is also not covered by this approach (since the hypervisor won't use the MMU). That's why we talked about using another cause value for sPMP.

If we drop the requirement to delegate sPMP faults through medeleg, we can just treat them as access faults and be done with it. We won't need to change any code either, since the OS expects access faults in any case.

Regards, Nick
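The two routing paths contrasted in this message can be modelled in a few lines of Python. This is a simplified sketch and not OpenSBI's or Linux's actual code. It ignores the hypervisor extension (no hedeleg), assumes the firmware forwards every trap that arrives from S/U-mode, and uses made-up function names. The cause numbers and privilege encodings are the standard ones from the privileged spec.

```python
PRIV_U, PRIV_S, PRIV_M = 0, 1, 3

CAUSE_LOAD_ACCESS_FAULT = 5
CAUSE_LOAD_PAGE_FAULT = 13


def trap_target(cause, prev_priv, medeleg):
    """Mode that first takes a synchronous exception.

    A trap is delegated to S-mode only if its medeleg bit is set and it
    did not occur in M-mode; otherwise M-mode takes it.
    """
    delegated = (medeleg >> cause) & 1
    if prev_priv != PRIV_M and delegated:
        return PRIV_S
    return PRIV_M


def handled_by(cause, prev_priv, medeleg):
    """Hypothetical end-to-end routing, firmware forwarding included."""
    if trap_target(cause, prev_priv, medeleg) == PRIV_S:
        return "S (delegated via medeleg)"
    if prev_priv in (PRIV_U, PRIV_S):
        # Assumed firmware policy: redirect based on mstatus.MPP.
        return "S (forwarded by M-mode firmware)"
    return "M"


medeleg = 1 << CAUSE_LOAD_PAGE_FAULT  # only page faults delegated
print(handled_by(CAUSE_LOAD_PAGE_FAULT, PRIV_U, medeleg))
print(handled_by(CAUSE_LOAD_ACCESS_FAULT, PRIV_U, medeleg))
```

Either way the fault ends up in S-mode. The difference is whether the S-mode OS must already have a page-fault handler (first path) or only the access-fault handling it needs anyway (second path).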
Re: [RISC-V] [tech-tee] The proposal of sPMP
Allen Baum
Relevant word here is "ranges" - PMPs (can) have ranges, explicitly defined in CSRs, and don't modify the generated address - all they do is check permissions. MMUs have 4K pages (or superpages) defined in a memory structure (but can be cached in a TLB). If you need arbitrarily sized ranges, that gets expensive in terms of the number of entries needed. They modify the generated address, must walk down levels of page tables if the translation is not found in the TLB, and must keep track of A & D bits (if you're using std RISC-V PTEs). That means an extra request port to memory. They certainly seem harder to build to me, from a logical standpoint. I can't comment on the physical design effort.
On Mon, Apr 19, 2021 at 3:15 PM Bill Huffman <huffman@...> wrote: They're not hard to build. The point is eliminating the miss time, reducing hardware cost, and improving the granularity issues by having a few address ranges controllable by S-mode.
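The point about the number of entries for arbitrarily sized ranges can be illustrated with a short count. The sketch below greedily covers a physical range with Sv39 leaf page sizes (1 GiB, 2 MiB, 4 KiB), always taking the largest page that is aligned at the current address and still fits. The function name is made up for illustration, and the count says nothing about TLB capacity or actual hardware cost.

```python
PAGE_SIZES = (1 << 30, 1 << 21, 1 << 12)  # Sv39 gigapage, megapage, base page


def leaf_ptes_for_range(base, size):
    """Count leaf PTEs needed to map [base, base + size) exactly."""
    if base % 4096 or size % 4096:
        raise ValueError("base and size must be multiples of 4 KiB")
    count = 0
    addr, end = base, base + size
    while addr < end:
        for page in PAGE_SIZES:
            # Largest page that is naturally aligned here and still fits.
            if addr % page == 0 and addr + page <= end:
                addr += page
                count += 1
                break
    return count


# A 2 MiB region that is megapage-aligned, then the same size shifted by 4 KiB.
print(leaf_ptes_for_range(0, 1 << 21))
print(leaf_ptes_for_range(0x1000, 1 << 21))
```

The aligned region needs a single megapage entry, while the shifted one falls back to 512 base-page entries. A range-based check describes either region with one pair of bounds, which is the granularity argument being made here.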
Re: [RISC-V] [tech-tee] The proposal of sPMP
Greg Favor
On Mon, Apr 19, 2021 at 2:37 PM Bill Huffman <huffman@...> wrote:
Perhaps we should rename the sPMP to something that sounds more like the simplified page table idea. Maybe Memory Protection Unit. The fact that it resembles PMP doesn't have to appear in the name.

Given the use cases for "sPMP" broadly, and for use of a combination of "sPMP" and paging (in a virtualized setting), the MPU label provides a more appropriate conceptualization of what this proposal is about. And, as such, it would be appropriate for the MPU to re-use the page fault exception codes for its more block/region oriented faults.

In essence the "page fault" exception codes are supervisor-level MMU exception codes that would be appropriate to use for both page-based and block/region-based MMUs. They are both forms of MMUs. Just think of "page fault" exception codes more generally as "Supervisor MMU fault" exception codes.

Btw, if this proposal is going to actually support HS/U-mode MPU usage in conjunction with VS/VU-mode page-based MMU usage, then the interaction details (when in VS/VU modes) need to be worked out and made part of the proposed spec. And ditto for what is loaded into each of the exception-related CSRs on an MPU fault while V=1 and while V=0.

This probably also means that an operating environment with a hypervisor using an MPU cannot itself be virtualized (unless the higher-level hypervisor starts emulating a guest MPU using G-stage page tables or a host MPU?).

Greg
Re: [RISC-V] [tech-tee] The proposal of sPMP
Bill Huffman
They're not hard to build. The point is eliminating the miss time, reducing hardware cost, and improving the granularity issues by having a few address ranges controllable by S-mode.
-----Original Message-----
From: Paolo Bonzini <pbonzini@...>
Sent: Monday, April 19, 2021 6:11 PM
To: Bill Huffman <huffman@...>; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP

On 19/04/21 23:37, Bill Huffman wrote:
Perhaps we should rename the sPMP to something that sounds more like

Or just use page tables. I am not convinced that they're so hard to build.

Paolo
Re: [RISC-V] [tech-tee] The proposal of sPMP
Paolo Bonzini
On 19/04/21 23:37, Bill Huffman wrote:
Perhaps we should rename the sPMP to something that sounds more like

Or just use page tables. I am not convinced that they're so hard to build.

Paolo
Re: [RISC-V] [tech-tee] The proposal of sPMP
Bill Huffman
Perhaps we should rename the sPMP to something that sounds more like the simplified page table idea. Maybe Memory Protection Unit. The fact that it resembles PMP doesn't have to appear in the name.
Bill
-----Original Message-----
From: tech-privileged@... <tech-privileged@...> On Behalf Of Paolo Bonzini
Sent: Monday, April 19, 2021 5:28 PM
To: tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP

On 19/04/21 22:17, Bill Huffman wrote:
Nick,

It really seems to me that the sPMP is not really a PMP but a simplified page table format (which vaguely reminds me of PowerPC BAT registers)...

Paolo
Re: [RISC-V] [tech-tee] The proposal of sPMP
Paolo Bonzini
On 19/04/21 22:17, Bill Huffman wrote:
Nick,

It really seems to me that the sPMP is not really a PMP but a simplified page table format (which vaguely reminds me of PowerPC BAT registers)...

Paolo