
Re: proposal for stateen CSRs

Bill Huffman
 

Hello John,

Are some of the bits in xstateen allocated for custom use?

Bill

-----Original Message-----
From: tech-privileged@... <tech-privileged@...> On Behalf Of John Hauser
Sent: Tuesday, April 20, 2021 2:18 PM
To: tech-privileged@...
Subject: [RISC-V] [tech-privileged] proposal for stateen CSRs



Re: proposal for stateen CSRs

John Hauser
 

I wrote:
> The proposal below has been discussed by some of the principle RISC-V
> architects for incorporation into the official Privileged Architecture.

"Principal".

- John Hauser


proposal for stateen CSRs

John Hauser
 

Hello tech-privileged,

The proposal below has been discussed by some of the principle RISC-V
architects for incorporation into the official Privileged Architecture.

The text below makes reference to Zfinx, Sstc, a QUERY instruction, and
the RISC-V Advanced Interrupt Architecture, which are other extensions
being separately developed.

- John H.


----------------------------------------

The following is a proposal for an addition to the main Privileged
Architecture (not a separately named extension).

--------------------
Motivation

Currently, the implementation of optional RISC-V extensions has the
potential to open covert channels between separate user threads, or
between separate guest OSes running under a hypervisor. The problem
occurs when an extension adds processor state---usually explicit
registers, but possibly other forms of state---that the main OS or
hypervisor is unaware of (and hence won't context-switch) but that
can be modified/written by one user thread or guest OS and perceived/
examined/read by another.

Consider, for example, that the N extension is someday ratified by
the RISC-V Association, and a hart implements both S mode and the
N extension, with misa.N hardwired = 1. The OS in use on this hart
might be oblivious to the N extension and hence might not test for the
extension or pay any attention to the eight CSRs it adds to the ISA:
ustatus, uie, utvec, uscratch, uepc, ucause, utval, and uip. In that
case, most of these CSRs provide an obvious covert channel between
user threads. Although traditional practices might consider such a
communication channel harmless, the intense focus on security today
argues that a means be offered to plug such channels.

The F registers of the RISC-V floating-point extensions and the
V registers of the vector extension would also be potential covert
channels, except for the existence of the FS and VS fields in the
sstatus register. An OS that is unaware of, say, the vector extension
and its V registers will unwittingly prevent access to those registers
by initializing unknown fields of sstatus to zeros, which in this case
will include the VS field.

Obviously, one way to prevent the use of the N extension's CSRs as
a covert channel would be to add to sstatus an "NS" field for the
N extension, paralleling the V extension's VS field. However, this is
not considered a general solution to the problem due to the number of
potential future extensions that may add small amounts of state. Even
with a 64-bit sstatus (necessitating adding sstatush for RV32), it is
not certain there are enough remaining bits in sstatus to accommodate
all future extensions. In any event, there is no need to strain
sstatus (and add sstatush) for this purpose. The "enable" flags
that are needed to plug covert channels are not generally expected to
require swapping on context switches of user threads, making them a
less-than-compelling candidate for inclusion in sstatus. Hence, a new
place is proposed for them instead.

--------------------
Proposal

RV64 harts that conform to the RISC-V Privileged Architecture may
optionally implement four new 64-bit CSRs at machine level, listed with
their CSR addresses:

0x30C mstateen0 (Machine State Enable 0)
0x30D mstateen1
0x30E mstateen2
0x30F mstateen3

If supervisor mode is implemented, another four CSRs would be defined
at supervisor level:

0x10C sstateen0
0x10D sstateen1
0x10E sstateen2
0x10F sstateen3

And if the hypervisor extension is implemented, another set of CSRs is
added:

0x60C hstateen0
0x60D hstateen1
0x60E hstateen2
0x60F hstateen3

If any "stateen" CSR is implemented, they must all be implemented for
the respective modes. For RV32, the registers listed above are 32-bit,
and for the machine-level and hypervisor CSRs there is a corresponding
set of high-half CSRs for the upper 32 bits of each register:

0x31C mstateen0h
0x31D mstateen1h
0x31E mstateen2h
0x31F mstateen3h

0x61C hstateen0h
0x61D hstateen1h
0x61E hstateen2h
0x61F hstateen3h

For the sstateen registers, high-half CSRs are not added at this time
because it is expected the upper 32 bits of these registers will always
be zeros, as explained later below.
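
The address tables above follow a regular pattern, which can be sketched
in C as below. This is only an illustration: the enum and function names
are hypothetical and not part of the proposal. It assumes register n of
each group sits at base + n, and that each RV32 high-half CSR sits 0x10
above its low half, as the listed addresses suggest.

```c
#include <stdint.h>

/* Base CSR addresses copied from the tables in the proposal.
 * The identifiers themselves are illustrative assumptions. */
enum stateen_level {
    STATEEN_S = 0x10C,  /* sstateen0 */
    STATEEN_M = 0x30C,  /* mstateen0 */
    STATEEN_H = 0x60C   /* hstateen0 */
};

/* Address of <level>stateen<n> for n in 0..3; 0 if n is out of range. */
static uint16_t stateen_addr(enum stateen_level level, unsigned n)
{
    return (n < 4) ? (uint16_t)(level + n) : 0;
}

/* Address of the RV32 high-half CSR <level>stateen<n>h.
 * Returns 0 for sstateen, which has no high-half CSRs in the proposal,
 * and 0 if n is out of range. */
static uint16_t stateen_high_addr(enum stateen_level level, unsigned n)
{
    if (level == STATEEN_S || n >= 4)
        return 0;
    return (uint16_t)(level + 0x10 + n);
}
```

For example, stateen_high_addr(STATEEN_H, 3) yields 0x61F, matching
hstateen3h in the table.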

The number of registers provided at each level is four because it is
believed that 4 * 64 = 256 bits for machine and hypervisor levels, and
4 * 32 = 128 bits for supervisor level, will be adequate for many years
to come, perhaps for as long as the RISC-V ISA is in use. The exact
number four is an attempted compromise between providing too few bits
on the one hand and going overboard with CSRs that will never be used
on the other. A possible future doubling of the number of stateen CSRs
is discussed later.

The registers at each level control access to state at all lower
privilege levels, but not at its own level. This is analogous to the
function of the existing counteren CSRs.

Each bit of an sstateen CSR controls user-level access (from U mode or
VU mode) to an extension's state, for an extension that was not deemed
"worthy" of a full XS field in sstatus like the FS and VS fields
for the F and V extensions. The intention is to allocate the bits of
sstateen CSRs starting at the least-significant end, bit 0, through to
bit 31, and then on to the next-higher sstateen CSR.

For every bit with a defined purpose in an sstateen CSR, the same bit
is defined in the matching mstateen CSR to control access below machine
level to the same state. The upper 32 bits of an mstateen CSR (or for
RV32, the corresponding high-half CSR) control access to state that is
inherently inaccessible to user level, so no corresponding enable bits
in the supervisor-level sstateen CSR are applicable. The intention is
to allocate bits for this purpose starting at the most-significant end,
bit 63, through to bit 32, and then on to the next-higher mstateen CSR.
If the rate at which bits are being allocated from the least-significant
end for sstateen CSRs is sufficiently low, allocation from the most-
significant end of mstateen CSRs may be allowed to encroach on the
lower 32 bits before jumping to the next-higher mstateen CSR. In
that case, the bit positions of "encroaching" bits will remain forever
read-only zeros in the matching sstateen CSRs.

With the hypervisor extension, the hstateen CSRs have identical
encoding to the mstateen CSRs, except controlling accesses for a
virtual machine (from VS and VU modes).

Bits in any stateen CSR that are defined to control state that a hart
doesn't implement are read-only zeros for that hart. Likewise, all
reserved bits not yet given a defined meaning are also read-only zeros.
For every bit in an mstateen CSR that is zero (whether read-only zero
or set to zero), the same bit appears as read-only zero in the matching
hstateen and sstateen CSRs. For every bit in an hstateen CSR that is
zero (whether read-only zero or set to zero), the same bit appears as
read-only zero in sstateen when accessed from a virtual machine.
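
The read-only-zero rule above amounts to masking each lower-level
register with the registers above it. The following C sketch models that
as a pure function over stored values. The struct and function names are
hypothetical, and it is an assumption of this sketch (not stated in the
proposal) that an implementation retains the written value and merely
masks what is read. The special role of bit 63 is not modeled.

```c
#include <stdint.h>
#include <stdbool.h>

/* One matching triple of stateen registers (for example index 0).
 * Hypothetical model, not a hardware description. */
struct stateen_regs {
    uint64_t m;  /* mstateenN */
    uint64_t h;  /* hstateenN */
    uint64_t s;  /* sstateenN (upper 32 bits expected to be zero) */
};

/* Value seen when reading hstateenN: any bit that is zero in
 * mstateenN reads as zero. */
static uint64_t hstateen_view(const struct stateen_regs *r)
{
    return r->h & r->m;
}

/* Value seen when reading sstateenN. When the access comes from a
 * virtual machine (virt == true), bits that are zero in hstateenN
 * also read as zero. */
static uint64_t sstateen_view(const struct stateen_regs *r, bool virt)
{
    uint64_t v = r->s & r->m;
    if (virt)
        v &= r->h;
    return v;
}
```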

On reset, all mstateen bits are initialized by the hardware to zeros.
If machine-level software changes these values, it is responsible for
initializing the matching hstateen and sstateen CSRs to zeros too.
Software at each privilege level should set its respective stateen CSRs
to indicate the state it is prepared to allow lower-privilege software
to access. For OSes and hypervisors, this usually means the state that
the OS or hypervisor is prepared to swap on a context switch, or to
manage in some other way.

Implementing the stateen CSRs is optional for harts (though platform
standards can always make them mandatory). When the stateen CSRs
are not implemented, all state added by an extension is accessible as
defined by that extension.

For each mstateen and hstateen CSR, bit 63 is defined to control
access to the matching supervisor-level sstateen CSR. That is,
bit 63 of mstateen0 and hstateen0 controls access to sstateen0; bit 63
of mstateen1 and hstateen1 controls access to sstateen1; etc. A
hypervisor may need this control over accesses to the sstateen CSRs
if it ever must emulate for a virtual machine an extension that is
supposed to be affected by a bit in an sstateen CSR. (Even if such
emulation is uncommon, it shouldn't be excluded.) Machine-level
software needs identical control to be able to emulate the hypervisor
extension. (That is, machine level needs control over accesses to the
supervisor-level sstateen CSRs in order to emulate the hstateen CSRs,
which have such control.)

If the hypervisor extension is not implemented and a supervisor-level
sstateen CSR is all read-only zeros, an implementation may make bit 63
of the matching mstateen read-only zero. In that case, machine-level
software should preferably emulate attempts to access the affected
sstateen CSR from S mode, ignoring writes and returning zero for reads.

Initially, the following bits are proposed to be defined in sstateen0,
mstateen0, and hstateen0:

bit 0 QUERY data
bit 1 fcsr for Zfinx and related extensions (Zdinx, etc.)
bit 2 Tentatively reserved for the N extension

As a special case, bit 0 is used to control access to the information
returned by the optional QUERY instruction, even though this cannot act
as a covert channel between user threads or guest OSes. For more, see
the documentation for the QUERY instruction.

Bit 1 applies only for the case when floating-point instructions
operate on X registers instead of F registers. Whenever misa.F = 1,
bit 1 of mstateen0 is read-only zero (and hence read-only zero in
hstateen0 and sstateen0 too). For convenience, when the stateen CSRs
are implemented and misa.F = 0, then if bit 1 of a relevant stateen0
CSR is zero, _all_ floating-point instructions cause an illegal
instruction trap (or possibly a virtual instruction trap, if executed
in a virtual machine), as though they all touch fcsr, regardless of
whether they really do.
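
As a rough illustration of which stateen0 CSRs are "relevant" at each
privilege mode, the C sketch below decides whether a floating-point
instruction would trap under the bit 1 rule. It assumes misa.F = 0 and
that the stateen CSRs are implemented. All identifiers are hypothetical,
and the sketch does not distinguish an illegal instruction trap from a
virtual instruction trap.

```c
#include <stdint.h>
#include <stdbool.h>

#define STATEEN0_FCSR (UINT64_C(1) << 1)  /* bit 1: fcsr for Zfinx etc. */

/* Privilege modes; PRIV_S also stands for HS mode. */
enum priv { PRIV_M, PRIV_S, PRIV_U, PRIV_VS, PRIV_VU };

/* True if the state guarded by 'bit' is accessible in 'mode', given
 * the values of mstateen0 (m0), hstateen0 (h0) and sstateen0 (s0).
 * Each level's register gates only the modes below that level. */
static bool stateen_allows(uint64_t bit, enum priv mode,
                           uint64_t m0, uint64_t h0, uint64_t s0)
{
    bool virt = (mode == PRIV_VS || mode == PRIV_VU);
    bool user = (mode == PRIV_U || mode == PRIV_VU);

    if (mode == PRIV_M)
        return true;            /* M mode is never restricted */
    if (!(m0 & bit))
        return false;           /* mstateen0 gates everything below M */
    if (virt && !(h0 & bit))
        return false;           /* hstateen0 gates VS and VU */
    if (user && !(s0 & bit))
        return false;           /* sstateen0 gates U and VU */
    return true;
}

/* With misa.F = 0, every floating-point instruction traps when the
 * fcsr bit is clear in any relevant stateen0 CSR. */
static bool fp_insn_traps(enum priv mode,
                          uint64_t m0, uint64_t h0, uint64_t s0)
{
    return !stateen_allows(STATEEN0_FCSR, mode, m0, h0, s0);
}
```

The same stateen_allows check would apply to any other bit position; only
the mask changes.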

In addition to the bits listed above for user-accessible state, the
following are also proposed initially for mstateen0 and hstateen0:

bit 61 Reserved for the RISC-V Advanced Interrupt Architecture
bit 62 stimecmp, vstimecmp of Sstc extension
bit 63 sstateen0

--------------------
Usage

After the machine-level mstateen CSRs are all initialized to zeros
on reset, machine-level software can set bits in these registers to
enable lower-privilege access to the controlled state. This may be
either because machine-level software knows how to swap the state or,
more likely, because machine-level software isn't swapping supervisor-
level environments. (Recall that the main reason the mstateen CSRs
must exist is so machine level can emulate the hypervisor extension.
When machine level isn't emulating the hypervisor extension, it is
likely there will be no need to keep any mstateen bits zero.)

If machine level sets any mstateen bits to nonzero, it must initialize
the matching hstateen CSRs to zeros if the hypervisor extension
is implemented. And if any mstateen bits that are set to one have
matching bits in the sstateen CSRs, machine-level software must
initialize those sstateen CSRs to zeros. Ordinarily, machine-level
software will want to set bit 63 of each mstateen CSR, necessitating
that it zero all hstateen and sstateen CSRs.
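
The machine-level initialization described above might look roughly like
the following C sketch. A plain struct stands in for the CSRs so the
logic can be read and tested in isolation; on real hardware each
assignment would be a CSR write. The struct, macro, and function names
are all hypothetical.

```c
#include <stdint.h>

#define NUM_STATEEN 4
#define STATEEN_BIT63 (UINT64_C(1) << 63)  /* access to matching sstateen */

/* Stand-in for the stateen CSRs of one hart (illustrative only). */
struct hart_stateen {
    uint64_t mstateen[NUM_STATEEN];
    uint64_t hstateen[NUM_STATEEN];
    uint64_t sstateen[NUM_STATEEN];
};

/* Set bit 63 of every mstateen CSR, then zero the matching
 * hstateen CSRs (if the hypervisor extension is present) and
 * sstateen CSRs, as the usage text requires. */
static void m_mode_init_stateen(struct hart_stateen *hart, int has_hypervisor)
{
    for (int i = 0; i < NUM_STATEEN; i++) {
        hart->mstateen[i] |= STATEEN_BIT63;
        if (has_hypervisor)
            hart->hstateen[i] = 0;
        hart->sstateen[i] = 0;
    }
}
```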

An OS at supervisor level should see the sstateen CSRs initialized to
zeros when the OS starts. It can set bits in these registers to enable
user-level access to the controlled state, presumably because the OS
knows how to context-swap the state.

A hypervisor must include the sstateen CSRs in the context it swaps for
a guest OS. When it starts a new guest OS, it must ensure the sstateen
CSRs are initialized to zeros.
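
A hypervisor's handling of the sstateen CSRs could be sketched in C as
follows. The guest context layout and function names are assumptions
made for illustration, and the hw_sstateen array stands in for the real
CSRs, which would be accessed with CSR read and write instructions.

```c
#include <stdint.h>
#include <string.h>

#define NUM_STATEEN 4

/* Per-guest saved copy of sstateen0..3 (hypothetical layout). */
struct guest_ctx {
    uint64_t sstateen[NUM_STATEEN];
};

/* A new guest OS must start with all sstateen CSRs zero. */
static void guest_ctx_init(struct guest_ctx *g)
{
    memset(g->sstateen, 0, sizeof g->sstateen);
}

/* On a guest switch, save the outgoing guest's sstateen values and
 * load the incoming guest's values. */
static void guest_switch(uint64_t hw_sstateen[NUM_STATEEN],
                         struct guest_ctx *prev,
                         const struct guest_ctx *next)
{
    memcpy(prev->sstateen, hw_sstateen, sizeof prev->sstateen);
    memcpy(hw_sstateen, next->sstateen, sizeof next->sstateen);
}
```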

In many cases, the various bits of the stateen CSRs will have a dual
purpose as enables for the ISA extensions that introduce the controlled
state.

--------------------
Possible expansion

If a need is anticipated, the set of stateen CSRs could in the future
be doubled by adding these:

0x38C mstateen4 0x39C mstateen4h
0x38D mstateen5 0x39D mstateen5h
0x38E mstateen6 0x39E mstateen6h
0x38F mstateen7 0x39F mstateen7h

0x18C sstateen4
0x18D sstateen5
0x18E sstateen6
0x18F sstateen7

0x68C hstateen4 0x69C hstateen4h
0x68D hstateen5 0x69D hstateen5h
0x68E hstateen6 0x69E hstateen6h
0x68F hstateen7 0x69F hstateen7h

These additional CSRs are not a definite part of the original proposal
because it is unclear whether they will ever be needed, and it is
believed the rate of consumption of bits in the first group, numbers
0-3, will be slow enough that any looming shortage will be perceptible
many years in advance. At the moment, it is not known even how many
years it may take to exhaust just mstateen0, sstateen0, and hstateen0.


Re: [RISC-V] [tech-tee] The proposal of sPMP

Anup Patel
 

Most hypervisors will always have the G-stage (stage 2) programmed for each Guest/VM, irrespective of whether the VS-mode MMU is on or off. This means G-stage protection is always there for the Guest/VM. When a Guest/VM makes an invalid physical address access, the hypervisor catches it as a trap. Most hypervisors will stop/dump the Guest/VM when it misbehaves and possibly even report the event to some cloud management software. I don't see how sPMP fits into the virtualization scenario, considering that a G-stage is already available. Things will get even more complicated for nested virtualization if a hypervisor chooses to expose sPMP to the Guest hypervisor.

 

The only utility of sPMP seems to be letting HS-mode handle invalid-access traps directly rather than having them routed through M-mode firmware.

 

I think it is best to just re-purpose page fault exception codes for sPMP whenever sPMP is available.

 

Regards,

Anup

 

From: tech-privileged@... <tech-privileged@...> On Behalf Of Dong Du
Sent: 20 April 2021 07:32
To: Nick Kossifidis <mick@...>; Bill Huffman <huffman@...>
Cc: Paolo Bonzini <pbonzini@...>; tech-privileged <tech-privileged@...>; tech-tee <tech-tee@...>
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP

 






Re: [RISC-V] [tech-tee] The proposal of sPMP

Dong Du
 

Yes, it's reasonable, as both sPMP and paging provide memory protection that is managed by S-mode.

Following that idea, the suggestions by Bill and Jonathan to rename exception codes 12, 13, and 15 to "page fault / sPMP fault" or "S-mode access fault" make sense.


All the best,
Dong
 
 
------------------ Original ------------------
Date:  Tue, Apr 20, 2021 09:06 AM
To:  "Bill Huffman"<huffman@...>;
Cc:  "Paolo Bonzini"<pbonzini@...>; "tech-privileged"<tech-privileged@...>;
Subject:  Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP
 






Re: [RISC-V] [tech-tee] The proposal of sPMP

Nick Kossifidis
 

On 2021-04-20 00:37, Bill Huffman wrote:
> Perhaps we should rename the sPMP to something that sounds more like
> the simplified page table idea. Maybe Memory Protection Unit. The
> fact that it resembles PMP doesn't have to appear in the name.

Good point, if we end up treating it differently than PMA/PMP it makes sense to name it differently as well.

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Nick Kossifidis
 

On 2021-04-20 03:31, Jonathan Behrens wrote:
> On Mon, Apr 19, 2021 at 8:21 PM Nick Kossifidis via lists.riscv.org <mick=ics.forth.gr@...> wrote:
>> On 2021-04-19 22:47, Jonathan Behrens wrote:
>>> How about sPMP is only used if satp.MODE=BARE or virtualization is
>>> enabled and hgatp.MODE=BARE? That would enable the trusted hypervisor
>>> case, while disallowing an S-mode operating system from enabling both
>>> paging and sPMP at the same time.
>>> Jonathan
>>
>> I'd be ok with this, but I don't see what we gain from it, in the end
>> the hw will still use sPMP + MMU in case the hypervisor uses sPMP and
>> the guest uses paging, because we still want sPMP to be active when
>> guest is running.
>
> For instance, it rules out the case where M-mode uses PMP, HS-mode uses sPMP and paging, and VS-mode also uses sPMP and paging. That would be 5 stages of protection for every access! (Which is way worse than it sounds because each level of walking a page table also has to be validated.)

Ouch! Good point there, that's definitely a scenario that doesn't make sense.

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Nick Kossifidis
 

Στις 2021-04-20 03:14, Andrew Waterman έγραψε:

On Mon, Apr 19, 2021 at 5:05 PM Nick Kossifidis <mick@...> wrote:

Στις 2021-04-19 23:17, Bill Huffman έγραψε:
Nick,
For security reasons, PMP exceptions will almost always be handled in
M-mode. sPMP exceptions typically want to be handled in S-mode. I
don't think it ever makes sense to have the two types routed to the
same exception code. So that seems out to me.
I don't think it ever makes sense to handle PMP access faults in
S-mode. It's the wrong privilege level (and S-mode can't write the
PMP either).
We currently get access-fault exceptions in case anyone tries to access
a region protected with PMA or PMP, regardless of privilege mode. All
PMA and PMP faults from any mode are already routed to the same
exception codes anyway. Those exceptions are typically handled by
M-mode, and based on mstatus.MPP, firmware may forward them accordingly.
That's what OpenSBI currently does ->
https://github.com/riscv/opensbi/blob/master/lib/sbi/sbi_trap.c#L260,
and here is Linux triggering a segfault in sw upon receiving an
access-fault exception ->
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/traps.c?h=fixes#n102
without going through do_page_fault. I don't see why sPMP faults should
be treated any differently, we can always get an access fault with
mstatus.MPP = S/U from sPMP on M-mode and forward it to S-mode to handle
it, like we forward PMA or PMP faults originating from S/U-mode.
The requirement set by the proposal's authors was to use medeleg to
delegate sPMP faults to S-mode directly as page-faults, without going
through M-mode's trap handler as PMA/PMP access-faults do. Without MMU
support on the S-mode OS this won't work, on Linux for example
do_page_fault is only available when MMU support is enabled ->
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/mm/Makefile?h=fixes#n13,
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/entry.S?h=fixes#n449.
So to be clear, treating sPMP faults as page-faults requires the OS to
support MMU and expect page-faults, it's only possible when both sPMP
and MMU are used at the same time, and the scenario I mentioned
previously is also not covered by this approach (since the hypervisor
won't use the MMU).
That sounds like an arbitrary constraint based upon how our arch-specific Linux code is currently written. There's nothing wrong with changing our arch-specific Linux code to be sPMP-aware in the no-MMU case. That doesn't mean we need to compile in the do_page_fault routine and all of the MMU code that it depends on; we'd instead map the page-fault exceptions to a different subroutine for the no-MMU case.
Indeed, we can just use the same code path as access-faults and rename the exception as Bill suggested. My question is: why not just raise access-faults from sPMP like we do for PMA/PMP, in which case we don't need to change anything in the spec or the code? Why do we need to treat sPMP differently than PMP or PMA? The answer I have so far is to avoid going through M-mode and use medeleg instead. Is this enough to justify such a change?

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Jonathan Behrens <behrensj@...>
 



On Mon, Apr 19, 2021 at 8:21 PM Nick Kossifidis via lists.riscv.org <mick=ics.forth.gr@...> wrote:
On 2021-04-19 22:47, Jonathan Behrens wrote:

> How about sPMP is only used if satp.MODE=BARE or virtualization is
> enabled and hgatp.MODE=BARE? That would enable the trusted hypervisor
> case, while disallowing an S-mode operating system from enabling both
> paging and sPMP at the same time.
>
> Jonathan
>

I'd be ok with this, but I don't see what we gain from it, in the end
the hw will still use sPMP + MMU in case the hypervisor uses sPMP and
the guest uses paging, because we still want sPMP to be active when
guest is running.

It for instance rules out the case where M-mode uses PMP, HS-mode uses sPMP and paging, and VS-mode also uses sPMP and paging. That would be 5 stages of protection for every access! (Which is way worse than it sounds because each level of walking a page table also has to be validated)

Jonathan


Re: [RISC-V] [tech-tee] The proposal of sPMP

Nick Kossifidis
 

On 2021-04-19 22:47, Jonathan Behrens wrote:

How about sPMP is only used if satp.MODE=BARE or virtualization is enabled and hgatp.MODE=BARE? That would enable the trusted hypervisor case, while disallowing an S-mode operating system from enabling both paging and sPMP at the same time.
Jonathan
I'd be ok with this, but I don't see what we gain from it, in the end the hw will still use sPMP + MMU in case the hypervisor uses sPMP and the guest uses paging, because we still want sPMP to be active when guest is running.
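
For concreteness, the condition Jonathan proposes could be written as the predicate below. This is only a sketch of one reading of his sentence, namely "satp.MODE=BARE, or (V=1 and hgatp.MODE=BARE)". The function and macro names are made up for illustration, the RV64 layout (MODE in bits 63:60, Bare = 0) is assumed, and the caller is assumed to pass in whichever satp value is currently in effect.

```c
#include <stdbool.h>
#include <stdint.h>

/* RV64 layout assumed: satp and hgatp both keep MODE in bits 63:60. */
#define ATP_MODE_SHIFT 60
#define ATP_MODE_BARE  0u

/* Extract the MODE field from a satp/hgatp value (hypothetical helper). */
static unsigned atp_mode(uint64_t atp)
{
    return (unsigned)(atp >> ATP_MODE_SHIFT);
}

/*
 * One reading of the proposed rule: sPMP takes effect only when
 * supervisor paging is off, or when V=1 and G-stage translation is off.
 * "satp" is whatever address-translation CSR value is currently in
 * effect for the supervisor; that choice is an assumption of this sketch.
 */
bool spmp_in_effect(uint64_t satp, uint64_t hgatp, bool virt_enabled)
{
    bool s_bare = atp_mode(satp) == ATP_MODE_BARE;
    bool g_bare = atp_mode(hgatp) == ATP_MODE_BARE;

    return s_bare || (virt_enabled && g_bare);
}
```

With this reading, a guest that enables paging under a hypervisor that leaves hgatp bare still has sPMP in effect, which is the case I describe above.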

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Jonathan Behrens <behrensj@...>
 

It seems like we could just rename exception codes 12, 13, and 15 to "page fault / sPMP fault" and be done with it.

Jonathan


On Mon, Apr 19, 2021 at 8:14 PM Andrew Waterman via lists.riscv.org <andrew=sifive.com@...> wrote:


On Mon, Apr 19, 2021 at 5:05 PM Nick Kossifidis <mick@...> wrote:
On 2021-04-19 23:17, Bill Huffman wrote:
> Nick,
>
> For security reasons, PMP exceptions will almost always be handled in
> M-mode.  sPMP exceptions typically want to be handled in S-mode.  I
> don't think it ever makes sense to have the two types routed to the
> same exception code.  So that seems out to me.
>
> I don't think it ever makes sense to handle PMP access faults in
> S-mode.  It's the wrong privilege level (and S-mode can't write the
> PMP either).
>

We currently get access-fault exceptions in case anyone tries to access
a region protected with PMA or PMP, regardless of privilege mode. All
PMA and PMP faults from any mode are already routed to the same
exception codes anyway. Those exceptions are typically handled by
M-mode, and based on mstatus.MPP, firmware may forward them accordingly.
That's what OpenSBI currently does ->
https://github.com/riscv/opensbi/blob/master/lib/sbi/sbi_trap.c#L260,
and here is Linux triggering a segfault in sw upon receiving an
access-fault exception ->
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/traps.c?h=fixes#n102
without going through do_page_fault. I don't see why sPMP faults should
be treated any differently, we can always get an access fault with
mstatus.MPP = S/U from sPMP on M-mode and forward it to S-mode to handle
it, like we forward PMA or PMP faults originating from S/U-mode.

The requirement set by the proposal's authors was to use medeleg to
delegate sPMP faults to S-mode directly as page-faults, without going
through M-mode's trap handler as PMA/PMP access-faults do. Without MMU
support on the S-mode OS this won't work, on Linux for example
do_page_fault is only available when MMU support is enabled ->
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/mm/Makefile?h=fixes#n13,
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/entry.S?h=fixes#n449.
So to be clear, treating sPMP faults as page-faults requires the OS to
support MMU and expect page-faults, it's only possible when both sPMP
and MMU are used at the same time, and the scenario I mentioned
previously is also not covered by this approach (since the hypervisor
won't use the MMU).

That sounds like an arbitrary constraint based upon how our arch-specific Linux code is currently written.  There's nothing wrong with changing our arch-specific Linux code to be sPMP-aware in the no-MMU case.  That doesn't mean we need to compile in the do_page_fault routine and all of the MMU code that it depends on; we'd instead map the page-fault exceptions to a different subroutine for the no-MMU case.

That's why we talked about using another cause value
for sPMP. If we drop the requirement to delegate sPMP faults through
medeleg, we can just treat them as access faults and be done with it, we
won't need to change any code as well, access-faults are expected by the
OS in any case.

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Nick Kossifidis
 

On 2021-04-19 23:01, Paolo Bonzini wrote:
On 19/04/21 21:23, Nick Kossifidis wrote:
A scenario we discussed at some point was a trusted hypervisor running in HS mode, with e.g. Linux and a trusted service running in VS mode. The trusted hypervisor is usually very small/simple and may not use paging, so hgatp will be set to bare and it'll fall back to PMP/ePMP as the current hypervisor spec mandates. With sPMP the hypervisor will be able to configure its own regions and also isolate Linux from the trusted service.
I don't see much difference between setting up sPMP and a constant
page table for the guest. It's going to be a few dozen lines of code
that only run once when the hypervisor starts.
Paolo
It's possible for an implementation to only support one-stage translation and hardwire hgatp to zero, in which case it makes sense for the hypervisor to have sPMP available.

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Andrew Waterman
 



On Mon, Apr 19, 2021 at 5:05 PM Nick Kossifidis <mick@...> wrote:
On 2021-04-19 23:17, Bill Huffman wrote:
> Nick,
>
> For security reasons, PMP exceptions will almost always be handled in
> M-mode.  sPMP exceptions typically want to be handled in S-mode.  I
> don't think it ever makes sense to have the two types routed to the
> same exception code.  So that seems out to me.
>
> I don't think it ever makes sense to handle PMP access faults in
> S-mode.  It's the wrong privilege level (and S-mode can't write the
> PMP either).
>

We currently get access-fault exceptions in case anyone tries to access
a region protected with PMA or PMP, regardless of privilege mode. All
PMA and PMP faults from any mode are already routed to the same
exception codes anyway. Those exceptions are typically handled by
M-mode, and based on mstatus.MPP, firmware may forward them accordingly.
That's what OpenSBI currently does ->
https://github.com/riscv/opensbi/blob/master/lib/sbi/sbi_trap.c#L260,
and here is Linux triggering a segfault in sw upon receiving an
access-fault exception ->
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/traps.c?h=fixes#n102
without going through do_page_fault. I don't see why sPMP faults should
be treated any differently, we can always get an access fault with
mstatus.MPP = S/U from sPMP on M-mode and forward it to S-mode to handle
it, like we forward PMA or PMP faults originating from S/U-mode.

The requirement set by the proposal's authors was to use medeleg to
delegate sPMP faults to S-mode directly as page-faults, without going
through M-mode's trap handler as PMA/PMP access-faults do. Without MMU
support on the S-mode OS this won't work, on Linux for example
do_page_fault is only available when MMU support is enabled ->
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/mm/Makefile?h=fixes#n13,
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/entry.S?h=fixes#n449.
So to be clear, treating sPMP faults as page-faults requires the OS to
support MMU and expect page-faults, it's only possible when both sPMP
and MMU are used at the same time, and the scenario I mentioned
previously is also not covered by this approach (since the hypervisor
won't use the MMU).

That sounds like an arbitrary constraint based upon how our arch-specific Linux code is currently written.  There's nothing wrong with changing our arch-specific Linux code to be sPMP-aware in the no-MMU case.  That doesn't mean we need to compile in the do_page_fault routine and all of the MMU code that it depends on; we'd instead map the page-fault exceptions to a different subroutine for the no-MMU case.
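
A rough sketch of that mapping, under stated assumptions: the cause numbers 12, 13 and 15 are the standard instruction, load and store/AMO page-fault codes, but the enum, the function name and the "kernel_has_mmu" flag are invented for illustration and are not taken from the Linux source.

```c
#include <stdbool.h>

/* Standard page-fault exception codes from the privileged spec. */
enum {
    CAUSE_INSN_PAGE_FAULT  = 12,
    CAUSE_LOAD_PAGE_FAULT  = 13,
    CAUSE_STORE_PAGE_FAULT = 15,
};

/* Hypothetical labels for the subroutine a kernel would dispatch to. */
enum fault_handler {
    HANDLER_NONE,        /* not a page-fault cause                    */
    HANDLER_PAGE_FAULT,  /* MMU build: the usual do_page_fault path   */
    HANDLER_SPMP_FAULT,  /* no-MMU build: a small sPMP-aware routine  */
};

/*
 * Pick the handler for a trap cause. The same three cause codes are
 * reused; only the subroutine behind them changes with the build, so
 * the no-MMU kernel never needs do_page_fault compiled in.
 */
enum fault_handler pick_fault_handler(unsigned cause, bool kernel_has_mmu)
{
    switch (cause) {
    case CAUSE_INSN_PAGE_FAULT:
    case CAUSE_LOAD_PAGE_FAULT:
    case CAUSE_STORE_PAGE_FAULT:
        return kernel_has_mmu ? HANDLER_PAGE_FAULT : HANDLER_SPMP_FAULT;
    default:
        return HANDLER_NONE;
    }
}
```

In a real kernel the choice would more likely be made at build time than through a runtime flag; the flag is used here only to keep the sketch self-contained.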

That's why we talked about using another cause value
for sPMP. If we drop the requirement to delegate sPMP faults through
medeleg, we can just treat them as access faults and be done with it, we
won't need to change any code as well, access-faults are expected by the
OS in any case.

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Nick Kossifidis
 

On 2021-04-19 23:17, Bill Huffman wrote:
Nick,
For security reasons, PMP exceptions will almost always be handled in
M-mode. sPMP exceptions typically want to be handled in S-mode. I
don't think it ever makes sense to have the two types routed to the
same exception code. So that seems out to me.
I don't think it ever makes sense to handle PMP access faults in
S-mode. It's the wrong privilege level (and S-mode can't write the
PMP either).
We currently get access-fault exceptions in case anyone tries to access a region protected with PMA or PMP, regardless of privilege mode. All PMA and PMP faults from any mode are already routed to the same exception codes anyway. Those exceptions are typically handled by M-mode, and based on mstatus.MPP, firmware may forward them accordingly. That's what OpenSBI currently does -> https://github.com/riscv/opensbi/blob/master/lib/sbi/sbi_trap.c#L260, and here is Linux triggering a segfault in sw upon receiving an access-fault exception -> https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/traps.c?h=fixes#n102 without going through do_page_fault. I don't see why sPMP faults should be treated any differently, we can always get an access fault with mstatus.MPP = S/U from sPMP on M-mode and forward it to S-mode to handle it, like we forward PMA or PMP faults originating from S/U-mode.
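
As a rough illustration of that forwarding decision (a sketch only, not OpenSBI's actual code): the access-fault cause codes 1, 5 and 7 and the position of mstatus.MPP in bits 12:11 come from the privileged spec, while the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Access-fault exception codes from the privileged spec. */
enum {
    CAUSE_INSN_ACCESS_FAULT  = 1,
    CAUSE_LOAD_ACCESS_FAULT  = 5,
    CAUSE_STORE_ACCESS_FAULT = 7,
};

/* mstatus.MPP lives in bits 12:11; encodings are U=0, S=1, M=3. */
#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (UINT64_C(3) << MSTATUS_MPP_SHIFT)
enum { PRV_U = 0, PRV_S = 1, PRV_M = 3 };

/* Hypothetical outcome of the M-mode trap handler's routing step. */
enum trap_action { HANDLE_IN_M, FORWARD_TO_S };

static int is_access_fault(uint64_t mcause)
{
    return mcause == CAUSE_INSN_ACCESS_FAULT ||
           mcause == CAUSE_LOAD_ACCESS_FAULT ||
           mcause == CAUSE_STORE_ACCESS_FAULT;
}

/*
 * Decide where an access fault taken in M-mode should be handled.
 * A fault that originated in S- or U-mode (MPP != M) is forwarded to
 * the S-mode OS; this applies regardless of whether PMA, PMP or, in
 * the scheme I am arguing for, sPMP raised it.
 */
enum trap_action route_access_fault(uint64_t mcause, uint64_t mstatus)
{
    unsigned mpp;

    assert(is_access_fault(mcause)); /* caller passes exception codes only */
    mpp = (unsigned)((mstatus & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT);
    return (mpp == PRV_M) ? HANDLE_IN_M : FORWARD_TO_S;
}
```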

The requirement set by the proposal's authors was to use medeleg to delegate sPMP faults to S-mode directly as page-faults, without going through M-mode's trap handler as PMA/PMP access-faults do. Without MMU support on the S-mode OS this won't work, on Linux for example do_page_fault is only available when MMU support is enabled -> https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/mm/Makefile?h=fixes#n13, https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/tree/arch/riscv/kernel/entry.S?h=fixes#n449. So to be clear, treating sPMP faults as page-faults requires the OS to support MMU and expect page-faults, it's only possible when both sPMP and MMU are used at the same time, and the scenario I mentioned previously is also not covered by this approach (since the hypervisor won't use the MMU). That's why we talked about using another cause value for sPMP. If we drop the requirement to delegate sPMP faults through medeleg, we can just treat them as access faults and be done with it, we won't need to change any code as well, access-faults are expected by the OS in any case.

Regards,
Nick


Re: [RISC-V] [tech-tee] The proposal of sPMP

Allen Baum
 

The relevant word here is "ranges":
PMPs (can) have ranges, explicitly defined in CSRs, and don't modify the generated address; all they do is check permissions.
MMUs have 4K pages (or superpages) defined in a memory structure (but can be cached in a TLB).
 If you need arbitrarily sized ranges, that gets expensive in terms of the number of entries needed.
 They modify the generated address, must walk down levels of page tables if the translation is not found in the TLB, and must keep track of A & D bits (if you're using standard RISC-V PTEs).
 That means an extra request port to memory.
 They certainly seem harder to build to me, from a logical standpoint. I can't comment on the physical design effort.
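
To make the "ranges only check permissions" point concrete, here is a minimal sketch of a range-based check. It is not the PMP CSR encoding: the struct, the exclusive upper bound, the example table and the deny-on-no-match default are all assumptions chosen for illustration. The lowest-numbered matching entry decides, as with PMP.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

/* Hypothetical range entry: [base, limit) with a permission mask. */
struct range_entry {
    uint64_t base;
    uint64_t limit;   /* exclusive upper bound */
    unsigned perms;
};

/* Example table: one arbitrarily sized range per entry, no page tables. */
const struct range_entry demo_table[2] = {
    { UINT64_C(0x80000000), UINT64_C(0x80200000), PERM_R | PERM_X },
    { UINT64_C(0x80200000), UINT64_C(0x88000000), PERM_R | PERM_W },
};

/*
 * Check an access against the table. The address is never modified:
 * the lowest-numbered matching entry alone decides whether the access
 * is allowed, and there is no walk through memory on a miss.
 */
bool range_check(const struct range_entry *tbl, size_t n,
                 uint64_t addr, unsigned need)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= tbl[i].base && addr < tbl[i].limit)
            return (tbl[i].perms & need) == need;
    }
    return false; /* no matching range: deny (assumed default) */
}
```

Covering the second range above with 4K pages alone would take tens of thousands of leaf PTEs (fewer with superpages), against a single entry here.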


On Mon, Apr 19, 2021 at 3:15 PM Bill Huffman <huffman@...> wrote:
They're not hard to build.  The point is eliminating the miss time, reducing hardware cost, and improving the granularity issues by having a few address ranges controllable by S-mode.

-----Original Message-----
From: Paolo Bonzini <pbonzini@...>
Sent: Monday, April 19, 2021 6:11 PM
To: Bill Huffman <huffman@...>; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP


On 19/04/21 23:37, Bill Huffman wrote:
> Perhaps we should rename the sPMP to something that sounds more like
> the simplified page table idea.  Maybe Memory Protection Unit.  The
> fact that it resembles PMP doesn't have to appear in the name.

Or just use page tables.  I am not convinced that they're so hard to build.

Paolo







Re: [RISC-V] [tech-tee] The proposal of sPMP

Greg Favor
 

On Mon, Apr 19, 2021 at 2:37 PM Bill Huffman <huffman@...> wrote:
Perhaps we should rename the sPMP to something that sounds more like the simplified page table idea.  Maybe Memory Protection Unit.  The fact that it resembles PMP doesn't have to appear in the name.

Given the use cases for "sPMP" broadly, and for use of a combination of "sPMP" and paging (in a virtualized setting), the MPU label provides a more appropriate conceptualization of what this proposal is about.  And, as such, it would be appropriate for the MPU to re-use the page fault exception codes for its more block/region oriented faults.  In essence the "page fault" exception codes are supervisor-level MMU exception codes that would be appropriate to use for both page-based and block/region-based MMUs.  They are both forms of MMUs.  Just think of "page fault" exception codes more generally as "Supervisor MMU fault" exception codes.

Btw, if this proposal is going to actually support HS/U-mode MPU usage in conjunction with VS/VU-mode page-based MMU usage, then the interaction details (when in VS/VU modes) need to be worked out and made part of the proposed spec. And ditto for what is loaded into each of the exception-related CSRs on an MPU fault while V=1 and while V=0. This probably also means that an operating environment with a hypervisor using an MPU cannot itself be virtualized (unless the higher-level hypervisor starts emulating a guest MPU using G-stage page tables or a host MPU?).

Greg


Re: [RISC-V] [tech-tee] The proposal of sPMP

Bill Huffman
 

They're not hard to build. The point is eliminating the miss time, reducing hardware cost, and improving the granularity issues by having a few address ranges controllable by S-mode.

-----Original Message-----
From: Paolo Bonzini <pbonzini@...>
Sent: Monday, April 19, 2021 6:11 PM
To: Bill Huffman <huffman@...>; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP


On 19/04/21 23:37, Bill Huffman wrote:
Perhaps we should rename the sPMP to something that sounds more like
the simplified page table idea. Maybe Memory Protection Unit. The
fact that it resembles PMP doesn't have to appear in the name.
Or just use page tables. I am not convinced that they're so hard to build.

Paolo


Re: [RISC-V] [tech-tee] The proposal of sPMP

Paolo Bonzini
 

On 19/04/21 23:37, Bill Huffman wrote:
Perhaps we should rename the sPMP to something that sounds more like
the simplified page table idea. Maybe Memory Protection Unit. The
fact that it resembles PMP doesn't have to appear in the name.
Or just use page tables. I am not convinced that they're so hard to build.

Paolo


Re: [RISC-V] [tech-tee] The proposal of sPMP

Bill Huffman
 

Perhaps we should rename the sPMP to something that sounds more like the simplified page table idea. Maybe Memory Protection Unit. The fact that it resembles PMP doesn't have to appear in the name.

Bill

-----Original Message-----
From: tech-privileged@... <tech-privileged@...> On Behalf Of Paolo Bonzini
Sent: Monday, April 19, 2021 5:28 PM
To: tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] [RISC-V] [tech-tee] The proposal of sPMP


On 19/04/21 22:17, Bill Huffman wrote:
Nick,

For security reasons, PMP exceptions will almost always be handled in
M-mode. sPMP exceptions typically want to be handled in S-mode. I
don't think it ever makes sense to have the two types routed to the
same exception code. So that seems out to me.

I don't think it ever makes sense to handle PMP access faults in
S-mode. It's the wrong privilege level (and S-mode can't write the
PMP either).

If we don't allow sPMP and paging at the same time, then the two can
use the same exception codes. Perhaps the only issue is the name.
The name "page fault" doesn't seem to fit with an sPMP exception. If
that's the issue, then let's see if we can solve that.
It really seems to me that the sPMP is not really a PMP but a simplified page table format (which vaguely reminds me of PowerPC BAT registers)...

Paolo


Re: [RISC-V] [tech-tee] The proposal of sPMP

Paolo Bonzini
 

On 19/04/21 22:17, Bill Huffman wrote:
Nick,
For security reasons, PMP exceptions will almost always be handled in
M-mode. sPMP exceptions typically want to be handled in S-mode. I
don't think it ever makes sense to have the two types routed to the
same exception code. So that seems out to me.
I don't think it ever makes sense to handle PMP access faults in
S-mode. It's the wrong privilege level (and S-mode can't write the
PMP either).
If we don't allow sPMP and paging at the same time, then the two can
use the same exception codes. Perhaps the only issue is the name.
The name "page fault" doesn't seem to fit with an sPMP exception. If
that's the issue, then let's see if we can solve that.
It really seems to me that the sPMP is not really a PMP but a simplified page table format (which vaguely reminds me of PowerPC BAT registers)...

Paolo
