Re: PMP shared permissions for S and U

Allen Baum

Not all OSes require or desire VM address mapping, especially in the embedded space (or so I've been led to believe) - yet they still need some level of privilege protection layering.
That's where SPMP (or SMPU) comes in. M-mode above S-mode is for security monitors, boot firmware, etc. - and to protect the S-mode OS, so that doesn't go away.

On Tue, Aug 23, 2022 at 7:06 AM Anthony Coulter <riscv@...> wrote:
Out of curiosity, what's the point of supporting S-mode if not to use paged virtual memory? My understanding of things was that S-mode provides virtual memory, a secondary level of interrupt/exception handling beneath M-mode (which can already handle interrupts and exceptions by itself), and nothing else. I also understood that S-mode had restricted access to the machine's hardware mainly to support virtualization---the idea is that it uses an interface that's designed to be easily intercepted (per the Popek-Goldberg rules on virtualization). I also assume (which may be wrong) that MMU-less systems are MMU-less because they can't afford the transistors and power, so I would not expect them to run hypervisors.

So if the problem is "PMP doesn't distinguish between S-mode and U-mode" my reflexive solution would be "Run your supervisor in M-mode, which has the added benefit of giving it direct access to the PMP registers."

Apparently there is enough demand for supervisor-mode PMPs that there's an SMPU task group and an Ssmpu spec in the works at https://github.com/riscv/riscv-tee/blob/main/Ssmpu/Ssmpu.pdf . But all the spec provides for motivation is that "it is desirable to enable S-mode OS to limit the physical addresseses accessible by U-mode software." Figure 1 in that document says that the M-mode monitor sits below the S-mode "MPU virtualization," but... why? Is the purpose to allow hypervisors to manage MPU entries? Is there more than that? And if so, why are people running hypervisors on systems that can't afford paged memory? I'm sorry if this is derailing the purpose of this discussion thread, but I'm genuinely curious.

Regards,
Anthony

Re: PMP shared permissions for S and U

Anthony Coulter

Out of curiosity, what's the point of supporting S-mode if not to use paged virtual memory? My understanding of things was that S-mode provides virtual memory, a secondary level of interrupt/exception handling beneath M-mode (which can already handle interrupts and exceptions by itself), and nothing else. I also understood that S-mode had restricted access to the machine's hardware mainly to support virtualization---the idea is that it uses an interface that's designed to be easily intercepted (per the Popek-Goldberg rules on virtualization). I also assume (which may be wrong) that MMU-less systems are MMU-less because they can't afford the transistors and power, so I would not expect them to run hypervisors.

So if the problem is "PMP doesn't distinguish between S-mode and U-mode" my reflexive solution would be "Run your supervisor in M-mode, which has the added benefit of giving it direct access to the PMP registers."

Apparently there is enough demand for supervisor-mode PMPs that there's an SMPU task group and an Ssmpu spec in the works at https://github.com/riscv/riscv-tee/blob/main/Ssmpu/Ssmpu.pdf . But all the spec provides for motivation is that "it is desirable to enable S-mode OS to limit the physical addresseses accessible by U-mode software." Figure 1 in that document says that the M-mode monitor sits below the S-mode "MPU virtualization," but... why? Is the purpose to allow hypervisors to manage MPU entries? Is there more than that? And if so, why are people running hypervisors on systems that can't afford paged memory? I'm sorry if this is derailing the purpose of this discussion thread, but I'm genuinely curious.

Regards,
Anthony

Re: PMP shared permissions for S and U

Greg Favor

Note: I'm not sure if the TG's email list is up yet.  If not, check with the Security HC.

On Mon, Aug 22, 2022 at 10:02 AM Jeff Scott <jeff.scott@...> wrote:

Hi Greg,

In our world we don’t have MMU, just PMP.  The inability to supply separate permissions to S and U limits the usefulness of PMP in our environment.

I subscribed to SPMP now.  I’ll send my question to that email list as well.

Thanks,

Jeff

From: Greg Favor <gfavor@...>
Sent: Monday, August 22, 2022 11:53 AM
To: Jeff Scott <jeff.scott@...>
Cc: tech-privileged@...
Subject: [EXT] Re: [RISC-V] [tech-privileged] PMP shared permissions for S and U

PMP was architected to be a mechanism to protect M-mode software and resources from non-M-mode software and devices.  This complements the MMU which serves to protect and isolate between S-mode and U-mode (and between U-mode processes).  They are intended to be orthogonal and composable architectural components.

If you are looking for something that combines those two functions into one mechanism, then take a look at the new SPMP TG that is in the process of being formed as we speak.  I'm not certain, but I think that might be what you are looking for.  (Start of group meetings, etc. will be announced on relevant RVI email lists - including the Security HC's list since I believe this TG is being sponsored by that HC.)

Greg

On Mon, Aug 22, 2022 at 9:44 AM Jeff Scott <jeff.scott@...> wrote:

Hello,

I am curious why PMP treats S-mode and U-mode accesses identically. Is anyone aware of a standard extension that allows for different permissions for S and U?

Thanks,

Jeff

Re: PMP shared permissions for S and U

Manuel Offenberg

Jeff;

FYI. SPMP TG is awaiting final approval from the Technical Steering Committee.

Regards,
Manuel Offenberg
Seagate Research

Re: PMP shared permissions for S and U

Jeff Scott

Hi Greg,

In our world we don’t have MMU, just PMP.  The inability to supply separate permissions to S and U limits the usefulness of PMP in our environment.

I subscribed to SPMP now.  I’ll send my question to that email list as well.

Thanks,

Jeff

Re: PMP shared permissions for S and U

Greg Favor

PMP was architected to be a mechanism to protect M-mode software and resources from non-M-mode software and devices.  This complements the MMU which serves to protect and isolate between S-mode and U-mode (and between U-mode processes).  They are intended to be orthogonal and composable architectural components.

If you are looking for something that combines those two functions into one mechanism, then take a look at the new SPMP TG that is in the process of being formed as we speak.  I'm not certain, but I think that might be what you are looking for.  (Start of group meetings, etc. will be announced on relevant RVI email lists - including the Security HC's list since I believe this TG is being sponsored by that HC.)

Greg
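
To make the point above concrete, here is a minimal Python sketch of the PMP check as I read the privileged spec. It is a simplified model under stated assumptions, not anyone's implementation: regions are plain base/size ranges instead of pmpcfg/pmpaddr encodings, and the names `PmpEntry` and `pmp_allows` are made up for illustration. The thing to notice is that the check only distinguishes M-mode from everything else, so S and U always get the same answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PmpEntry:
    base: int     # first byte covered by the region (hypothetical flat encoding)
    size: int     # region length in bytes
    perms: str    # any subset of "rwx"
    locked: bool  # L bit: when set, the entry also binds M-mode


def pmp_allows(entries, mode, addr, length, kind):
    """Return True if `length` bytes at `addr` pass the PMP check.

    mode is "M", "S" or "U"; kind is "r", "w" or "x".
    """
    for entry in entries:  # the lowest-numbered matching entry decides
        lo, hi = entry.base, entry.base + entry.size
        if not (addr < hi and addr + length > lo):
            continue  # no byte of the access falls in this entry
        if not (lo <= addr and addr + length <= hi):
            return False  # partial match fails regardless of L/R/W/X
        if mode == "M" and not entry.locked:
            return True  # unlocked entries do not constrain M-mode
        return kind in entry.perms  # identical outcome for S and U
    # No entry matched: M-mode succeeds; S/U fail if any entry is implemented.
    return mode == "M" or not entries
```

With a single read-only, unlocked entry, `pmp_allows(..., "S", ...)` and `pmp_allows(..., "U", ...)` agree for every access, while M-mode is unconstrained. That is the limitation Jeff is asking about.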

PMP shared permissions for S and U

Jeff Scott

Hello,

I am curious why PMP treats S-mode and U-mode accesses identically. Is anyone aware of a standard extension that allows for different permissions for S and U?

Thanks,

Jeff

Re: Fast-track extension proposal V3 for "Sv32 Svpbmt"

Greg Favor

On Wed, Aug 17, 2022 at 5:19 AM Guo Ren <guoren@...> wrote:
Here is the third version of the proposal.

It seems like you have dropped a bunch of text from the prior versions - which is probably part of what leads to some of my questions below.

=======================  Supervisor Extension Additions  ========================
\subsection{``Svpbmt32'' Extension for Page-Based Memory Types}
\label{sec:translation}

Svpbmt32 support is being added to allow the two highest bits of a PTE
to be used as PBMT instead of PA[33:32] for Sv32.

Maybe better to say "The Svpbmt32 extension allows the two ..."

The S-mode and G-stage address translation under this extension are

"are" -> "is"

controlled by the
menvcfg.PBMTE. The VS-stage address translation under this extension
is controlled by henvcfg.PBMTE and indirectly by menvcfg.PBMTE.

This needs to specify what "controlled" means.  There also is no explanation (here or below) of how menvcfg.PBMTE indirectly controls VS-stage address translation.

If this is all duplication of some of the currently defined *envcfg PBMTE functionality, then it is better to make clear that this is just reiterating a portion of the exact same functionality as currently exists for the *envcfg.PBMTE bits (and to refer to those current definitions for the details). Otherwise, by default, there is no clear understanding as to whether or not this extension specifies the exact same *envcfg.PBMTE behaviors as for the Svpbmt extension.
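
One possible reading of "controlled" and "indirectly", sketched in Python. This assumes the proposal mirrors the existing Svpbmt behavior as I understand it, where henvcfg.PBMTE is read-only zero whenever menvcfg.PBMTE is zero; the proposal text itself does not say so, and the function name `pbmt_enabled` is hypothetical.

```python
def pbmt_enabled(stage, menvcfg_pbmte, henvcfg_pbmte=0):
    """Return True if PTE PBMT bits are honored for the given translation stage.

    stage is "S" (satp page tables), "G" (hgatp page tables) or "VS" (vsatp).
    """
    if stage in ("S", "G"):
        return bool(menvcfg_pbmte)
    if stage == "VS":
        # Assumption: henvcfg.PBMTE cannot be set while menvcfg.PBMTE is clear,
        # which is how menvcfg would control VS-stage translation indirectly.
        return bool(menvcfg_pbmte) and bool(henvcfg_pbmte)
    raise ValueError(f"unknown translation stage: {stage!r}")
```

If this is the intended behavior, the normative text could simply point at the existing *envcfg.PBMTE definitions instead of restating part of them.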

\begin{commentary}
For example, consider an RV32 system supporting Svpbmt32 and
Hypervisor Extension (Chapter~\ref{hypervisor}). When menvcfg.PBMTE=1,
Svpbmt32 is available for S-mode and G-stage address translation. When
henvcfg.PBMTE=1, Svpbmt32 is available for VS-mode address translation.

This is being presented as non-normative text, but this sounds like it is or needs to be normative text?  Or is this "non-normatively" just repeating existing defined functionality - in which case what extra value is this text providing past repeating existing arch definitions?

\end{commentary}

Sv32 virtual address:
| 31    22 | 21    12 | 11         0 |
|  VPN[1]  |  VPN[0]  | page offset  |
     10         10           12

Sv32 physical address with Svpbmt:
| 31    22 | 21    12 | 11         0 |
|  PPN[1]  |  PPN[0]  | page offset  |
     10         10           12

Sv32 page table entry with Svpbmt:
| 31 30 | 29  20 | 19  10 | 9             8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| PBMT  | PPN[1] | PPN[0] | reserved for SW | D | A | G | U | X | W | R | V |
========================================================================

There is no statement anywhere in this spec that says what exactly is the PBMT field.  I know you intend for it to have the exact same definition as the currently defined PBMT field, but this should be stated explicitly (to remove any possible ambiguity around whether PBMT encodings and/or meanings may be a little different in RV32 versus RV64).
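
For illustration, this is what "the exact same definition" would look like if the Svpbmt32 field reused the encodings the existing Svpbmt extension defines for RV64 PTE bits 62:61 (0 = PMA, 1 = NC, 2 = IO, 3 = reserved). Whether the proposal intends exactly this is the open question, so treat the mapping below as an assumption; the names `Pbmt` and `pbmt_from_pte32` are made up.

```python
from enum import IntEnum


class Pbmt(IntEnum):
    PMA = 0  # no override: use the underlying physical memory attributes
    NC = 1   # non-cacheable, idempotent, weakly ordered, main memory
    IO = 2   # non-cacheable, non-idempotent, strongly ordered, I/O


def pbmt_from_pte32(pte):
    """Extract the PBMT field from bits 31:30 of a 32-bit PTE (proposed layout)."""
    value = (pte >> 30) & 0b11
    if value == 3:
        # Encoding 3 is reserved in Svpbmt; assumed reserved here as well.
        raise ValueError("reserved PBMT encoding")
    return Pbmt(value)
```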

Fast-track extension proposal V3 for "Sv32 Svpbmt"

Guo Ren

Hi all,

Here is the third version of the proposal.

V2: https://lists.riscv.org/g/tech-privileged/message/1079
V1: https://lists.riscv.org/g/tech-privileged/message/1051

This posting to this email list starts an initial review period for people to provide feedback, questions, comments, etc.

Thanks,
Guo Ren

========================================================================

=======================  Supervisor Extension Additions  ========================
\subsection{``Svpbmt32'' Extension for Page-Based Memory Types}
\label{sec:translation}

Svpbmt32 support is being added to allow the two highest bits of a PTE
to be used as PBMT instead of PA[33:32] for Sv32. The S-mode and
G-stage address translation under this extension are controlled by the
menvcfg.PBMTE. The VS-stage address translation under this extension
is controlled by henvcfg.PBMTE and indirectly by menvcfg.PBMTE.

\begin{commentary}
For example, consider an RV32 system supporting Svpbmt32 and
Hypervisor Extension (Chapter~\ref{hypervisor}). When menvcfg.PBMTE=1,
Svpbmt32 is available for S-mode and G-stage address translation. When
henvcfg.PBMTE=1, Svpbmt32 is available for VS-mode address translation.
\end{commentary}

Sv32 virtual address:
| 31    22 | 21    12 | 11         0 |
|  VPN[1]  |  VPN[0]  | page offset  |
     10         10           12

Sv32 physical address with Svpbmt:
| 31    22 | 21    12 | 11         0 |
|  PPN[1]  |  PPN[0]  | page offset  |
     10         10           12

Sv32 page table entry with Svpbmt:
| 31 30 | 29  20 | 19  10 | 9             8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| PBMT  | PPN[1] | PPN[0] | reserved for SW | D | A | G | U | X | W | R | V |
========================================================================
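
The PTE layout above can be read as the following field extraction. This is only a sketch of the bit positions shown in the diagrams, assuming a 4 KiB leaf page; the function names are hypothetical and nothing here is part of the proposal.

```python
def decode_sv32_pbmt_pte(pte):
    """Split a 32-bit PTE into the fields of the proposed Svpbmt32 layout."""
    if not 0 <= pte < (1 << 32):
        raise ValueError("PTE must be a 32-bit value")
    flag_names = "DAGUXWRV"  # bit 7 down to bit 0
    return {
        "pbmt": (pte >> 30) & 0x3,    # bits 31:30
        "ppn1": (pte >> 20) & 0x3FF,  # bits 29:20
        "ppn0": (pte >> 10) & 0x3FF,  # bits 19:10
        "rsw": (pte >> 8) & 0x3,      # bits 9:8, reserved for software
        "flags": {n: (pte >> (7 - i)) & 1 for i, n in enumerate(flag_names)},
    }


def leaf_physical_address(pte, va):
    """Form the 32-bit physical address for a 4 KiB leaf PTE and virtual address."""
    fields = decode_sv32_pbmt_pte(pte)
    return (fields["ppn1"] << 22) | (fields["ppn0"] << 12) | (va & 0xFFF)
```

Note the consequence visible in the sketch: with PBMT taking the top two PTE bits, PPN[1] shrinks to 10 bits and the physical address is limited to 32 bits instead of Sv32's usual 34.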

Re: Are pages allowed to cross PMA regions?

Andres Amaya Garcia

Thank you all for the valuable input!

In summary, it is possible to have virtual memory pages that straddle multiple PMA and PMP regions. There are simplifications or implementation decisions that can be made to deal with this situation: limiting PMA/PMP regions to a granularity of at least 4 KiB, caching permissions in the TLB, attribute caches, etc.

However, the rules regarding some of these cases (misaligned accesses across regions, straddling, etc.) appear to be rather loose in RISC-V (see Allen Baum's message). Is there any ongoing work/plans to revisit the subject and perhaps clarify some of it in the Privileged Specification? If not, is it worth tracking this somewhere? Perhaps creating a GitHub issue? (P.S. I am new to the RISC-V community, so I don't know how to go about it if there is interest in the subject)

Once again, thanks for the help!

Re: Are pages allowed to cross PMA regions?

Tariq Kurd

>For TLBs, the important simplification is PMP/PMA aren't <4KiB in
>granularity, as then existing TLB entries can be used to cache
>permissions.

Yes - this makes a lot of sense. What about the case where the software updates the PMP entries though? This would then require an sfence.vma to clear the micro-TLBs as the PMP permissions may be out-of-date.
The architecture doesn't require this, so can we add this requirement? How is this typically done?

I've found this text now, so please disregard my previous email:

"Hence, when the PMP settings are modified, M-mode software must synchronize the PMP settings with the virtual memory system and any PMP or address-translation caches. This is accomplished by executing an SFENCE.VMA instruction with rs1=x0 and rs2=x0, after the PMP CSRs are written."

Thanks

Tariq
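
A toy Python model of why the spec text quoted above asks for the fence. It assumes 4 KiB pages and an implementation that caches the PMP result in its TLB entries, which is one possible microarchitecture and not a requirement; the class and method names are invented for this sketch.

```python
class ToyHart:
    """Models a hart whose TLB entries cache the PMP write permission per page."""

    def __init__(self):
        self.pmp_writable = set()  # page numbers that the PMP lets S/U write
        self.tlb = {}              # page number -> cached "store allowed" bit

    def write_pmp(self, writable_pages):
        # Models M-mode writing the PMP CSRs; the TLB is deliberately untouched.
        self.pmp_writable = set(writable_pages)

    def sfence_vma(self):
        # Models SFENCE.VMA with rs1=x0 and rs2=x0: drop every cached entry.
        self.tlb.clear()

    def store_allowed(self, addr):
        page = addr >> 12
        if page not in self.tlb:  # TLB miss: consult the PMP and cache the result
            self.tlb[page] = page in self.pmp_writable
        return self.tlb[page]
```

In this model, revoking a page with `write_pmp` leaves `store_allowed` returning the old cached answer until `sfence_vma` is called, which is the stale-permission case raised above.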

On Tue, 16 Aug 2022 at 13:10, Tariq Kurd via lists.riscv.org <tariq.kurd=codasip.com@...> wrote:
>For TLBs, the important simplification is PMP/PMA aren't <4KiB in
>granularity, as then existing TLB entries can be used to cache
>permissions.

Yes - this makes a lot of sense. What about the case where the software updates the PMP entries though? This would then require an sfence.vma to clear the micro-TLBs as the PMP permissions may be out-of-date.
The architecture doesn't require this, so can we add this requirement? How is this typically done?

Tariq

On Tue, 16 Aug 2022 at 00:41, <krste@...> wrote:

>>>>> On Mon, 15 Aug 2022 10:14:59 +0200, Tariq Kurd <tariq.kurd@...> said:
|| In particular, a portion of a misaligned store that passes the PMP check may become visible, even if another portion fails the PMP check
| I had no idea this was in the spec - so I'm glad you added that comment Allen.
| yes - between MMU pages, PMP regions and PMA regions it's all pretty complex.
| In systems with an MMU do people typically also implement the PMP? And if so why?

Yes.

To contain code running below M-mode on the hart (including implicit
references such as page-table walkers).

M-mode+PMP can provide a monitor that isolates and multiplexes
multiple S-mode stacks, as in Keystone enclave work.

| As the granularity of PMA and PMP regions is implementation defined - I'm wondering if a nice simplification would be to specify them
| both with 64-byte granularity, and 64-byte alignment to match the cache-block size for the CMOs. At least then the PMAs can't cross the
| boundary of a TLB page.

For TLBs, the important simplification is PMP/PMA aren't <4KiB in
granularity, as then existing TLB entries can be used to cache
permissions.  Having PMP/PMA granules larger than a page is fine, as
these would only be checked on a TLB miss.  If < page, then easiest
solution is to not cache these regions in TLB, forcing a TLB
miss+check on every access, for example.  Of course, other alternative
microarch schemes are possible.

Krste

| Tariq

| On Sat, 13 Aug 2022 at 09:02, Allen Baum <allen.baum@...> wrote:

|     There are at least 3 potential boundaries: MMU pages, PMP regions, and PMA regions.
|     All bytes of an access must be contained within a single PMP region. The operative word there is "access", because a misaligned
|     load/store may be (and typically is) split into two separate accesses.
|     Ordering of those accesses is not spec'ed, so it's possible to get various exceptions with either the lower or upper part of the
|     load/store (or both).
|     When that happens on a store, the trap may occur after either the low or high half has been written (non-deterministically even, so
|     it's a bear to test).

|     I don't know if that specific rule applies to PMAs or MMU page crossings,
|     but if a misaligned access is split into two (or more, eventually) accesses that don't cross a boundary, then it's moot;
|     you treat them individually. That split is hard to avoid.

|     But an implementation isn't required to split a misaligned access, and outside of the PMP spec, I don't think that case is mentioned.
|     An implementation is free to always trap on a misaligned access and perform it byte-by-byte (while ensuring no interrupt can occur in
|     the middle, lest someone see a stale value).
|     I believe it is also legal to handle it entirely in HW except when it crosses various boundaries (e.g. cacheline, page, etc), and
|     signal a misalign exception if it does.
|     Or even signal a misalign exception depending on the phase of the moon (or other non-architectural state).

|     Personally, I'd be really happy if we could tighten those rules up a lot.

|     On Fri, Aug 12, 2022 at 2:28 PM Greg Favor <gfavor@...> wrote:

|             It would be nice if it was architecturally defined/permitted for such straddling accesses to be performed a byte at a time.

|         That could be ok for accesses to idempotent memory, but would likely be problematic for a non-idempotent location (e.g. a
|         memory-mapped I/O register), and byte accesses to a word MMIO register might not even be allowed by the PMAs for that location.

--

Tariq Kurd | Lead IP Architect | Codasip UK Design Centre | www.codasip.com

Re: Are pages allowed to cross PMA regions?

Tariq Kurd

>For TLBs, the important simplification is PMP/PMA aren't <4KiB in
>granularity, as then existing TLB entries can be used to cache
>permissions.

Yes - this makes a lot of sense. What about the case where the software updates the PMP entries though? This would then require an sfence.vma to clear the micro-TLBs as the PMP permissions may be out-of-date.
The architecture doesn't require this, so can we add this requirement? How is this typically done?

Tariq

--

Tariq Kurd | Lead IP Architect | Codasip UK Design Centre | www.codasip.com

Re: Are pages allowed to cross PMA regions?

Krste Asanovic

On Mon, 15 Aug 2022 10:14:59 +0200, Tariq Kurd <tariq.kurd@...> said:
|| In particular, a portion of a misaligned store that passes the PMP check may become visible, even if another portion fails the PMP check
| I had no idea this was in the spec - so I'm glad you added that comment Allen.
| yes - between MMU pages, PMP regions and PMA regions it's all pretty complex.
| In systems with an MMU do people typically also implement the PMP? And if so why?

Yes.

To contain code running below M-mode on the hart (including implicit
references such as page-table walkers).

M-mode+PMP can provide a monitor that isolates and multiplexes
multiple S-mode stacks, as in Keystone enclave work.

| As the granularity of PMA and PMP regions is implementation defined - I'm wondering if a nice simplification would be to specify them
| both with 64-byte granularity, and 64-byte alignment to match the cache-block size for the CMOs. At least then the PMAs can't cross the
| boundary of a TLB page.

For TLBs, the important simplification is PMP/PMA aren't <4KiB in
granularity, as then existing TLB entries can be used to cache
permissions. Having PMP/PMA granules larger than a page is fine, as
these would only be checked on a TLB miss. If < page, then easiest
solution is to not cache these regions in TLB, forcing a TLB
miss+check on every access, for example. Of course, other alternative
microarch schemes are possible.

Krste
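
The caching rule described above can be written down as a small predicate. This is a sketch of one way to decide it, assuming 4 KiB pages and that the implementation knows every address at which PMP or PMA attributes change; `can_cache_in_tlb` and `region_bounds` are illustrative names, and other microarchitectural schemes are possible.

```python
PAGE_SIZE = 4096


def can_cache_in_tlb(page_base, region_bounds):
    """Return True if one PMP/PMA result is valid for the whole page.

    region_bounds lists the addresses where PMP or PMA attributes change.
    A boundary exactly at the start or end of the page is harmless; one
    strictly inside the page means different bytes may have different
    permissions, so the result must be rechecked on every access.
    """
    return not any(page_base < b < page_base + PAGE_SIZE for b in region_bounds)
```

With page-aligned boundaries every page is cacheable, so the PMP/PMA lookup is only paid on a TLB miss; a 64-byte-granule boundary inside a page forces that page onto the slow path.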

Re: Are pages allowed to cross PMA regions?

Tariq Kurd

>In particular, a portion of a misaligned store that passes the PMP check may become visible, even if another portion fails the PMP check

I had no idea this was in the spec - so I'm glad you added that comment Allen.

yes - between MMU pages, PMP regions and PMA regions it's all pretty complex.

In systems with an MMU do people typically also implement the PMP? And if so why?

As the granularity of PMA and PMP regions is implementation defined - I'm wondering if a nice simplification would be to specify them both with 64-byte granularity, and 64-byte alignment to match the cache-block size for the CMOs. At least then the PMAs can't cross the boundary of a TLB page.

Tariq

--

Tariq Kurd | Lead IP Architect | Codasip UK Design Centre | www.codasip.com

Re: Are pages allowed to cross PMA regions?

Allen Baum

There are at least 3 potential boundaries: MMU pages, PMP regions, and PMA regions.
All bytes of an access must be contained within a single PMP region. The operative word there is "access", because a misaligned load/store may be (and typically is) split into two separate accesses.
Ordering of those accesses is not spec'ed, so it's possible to get various exceptions on either the lower or upper part of the load/store (or both).
When that happens on a store, the trap may occur after either the low or high half has been written (non-deterministically even, so it's a bear to test).

I don't know if that specific rule applies to PMAs or MMU page crossings,
but if a misaligned access is split into two (or more, eventually) accesses that don't cross a boundary, then it's moot;
you treat them individually. That split is hard to avoid.
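
As a rough illustration of the point above (a sketch, not spec text): the Python below splits a misaligned access at the natural-alignment boundary it crosses and then applies the PMP matching rule to each piece on its own. The function names, the (lo, hi) region model, and the deny-on-no-match default are assumptions made for this example; real PMP entries also carry permissions and address-matching modes, which are ignored here.

```python
def split_misaligned(addr, size):
    """Split an access at the natural-alignment boundary it crosses, if any.

    Returns a list of (address, length) pieces. An aligned access is
    returned unchanged as a single piece.
    """
    if addr % size == 0:
        return [(addr, size)]
    boundary = (addr // size + 1) * size   # next multiple of size above addr
    end = addr + size
    return [(addr, boundary - addr), (boundary, end - boundary)]


def pmp_piece_ok(regions, addr, size):
    """Apply the 'single PMP region' rule to one access.

    regions is a priority-ordered list of (lo, hi) half-open ranges
    (hypothetical model). The first entry matching any byte decides:
    it must cover every byte of the access.
    """
    end = addr + size
    for lo, hi in regions:
        if addr < hi and lo < end:          # entry matches at least one byte
            return lo <= addr and end <= hi
    return False                            # assumption: no match means denied


# Two adjacent 8-byte regions; an 8-byte store at 0x1004 straddles them.
regions = [(0x1000, 0x1008), (0x1008, 0x1010)]
whole = pmp_piece_ok(regions, 0x1004, 8)
pieces = split_misaligned(0x1004, 8)
split = [pmp_piece_ok(regions, a, n) for a, n in pieces]
```

Treated as one access, the store fails the containment rule; treated as two split accesses, each half is checked individually and both pass in this toy setup.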

But an implementation isn't required to split a misaligned access, and outside of the PMP spec, I don't think that case is mentioned.
An implementation is free to always trap on a misaligned access and perform it byte-by-byte (while ensuring no interrupt can occur in the middle, lest someone see a stale value).
I believe it is also legal to handle it entirely in HW except when it crosses various boundaries (e.g. cacheline, page, etc.), and signal a misaligned exception if it does.
Or even signal a misaligned exception depending on the phase of the moon (or other non-architectural state).
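
One of the policies described above (handle in HW unless a boundary is crossed) could be modelled roughly as follows. This is only a sketch: the function names, the 64-byte cacheline, and the 4 KiB page are assumed values for illustration, not anything an implementation is required to use.

```python
def crosses(addr, size, granule):
    """True if the bytes [addr, addr+size) span more than one granule."""
    return addr // granule != (addr + size - 1) // granule


def hw_handles_misaligned(addr, size, cacheline=64, page=4096):
    """Hypothetical policy: hardware performs the access unless it
    crosses a cacheline or page boundary, in which case it would trap."""
    return not (crosses(addr, size, cacheline) or crosses(addr, size, page))
```

For example, a 4-byte load at 0x1001 stays inside one cacheline and would be handled in hardware under this policy, while the same load at 0x103E crosses a 64-byte line and would trap.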

Personally, I'd be really happy if we could tighten those rules up a lot.

On Fri, Aug 12, 2022 at 2:28 PM Greg Favor <gfavor@...> wrote:
It would be nice if it was architecturally defined/permitted for such straddling accesses to be performed a byte at a time.

That could be ok for accesses to idempotent memory, but would likely be problematic for a non-idempotent location (e.g. a memory-mapped I/O register), and byte accesses to a word MMIO register might not even be allowed by the PMAs for that location.

Re: [RISC-V] [tech-virt-mem] Help needed on physical address issues

Allen Baum

The ACTs will test that bits above b38 match b38 if SV39 is enabled.
Likewise for SV48 and SV57.
There is no SV64 yet, so any VA bits above b56 that don't match b56 are expected to always trap.
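
The check described above can be sketched in a few lines of Python. The function name and the assumption of a 64-bit virtual address are mine; the rule itself is just "all bits above the top implemented VA bit must equal that bit".

```python
def is_canonical_va(va, va_bits, xlen=64):
    """True if bits [xlen-1 : va_bits] of va all equal bit va_bits-1.

    va_bits is 39, 48 or 57 for SV39, SV48 and SV57 respectively.
    """
    upper = va >> (va_bits - 1)                 # bit va_bits-1 and everything above
    all_ones = (1 << (xlen - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones
```

Under SV39, 0x0000003FFFFFFFFF (bit 38 clear, upper bits clear) and 0xFFFFFFC000000000 (bit 38 set, upper bits set) both pass, while 0x0000004000000000 (bit 38 set, upper bits clear) does not.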

If a PMA doesn't do that, it will fail ACTs

Greg: are you saying the canonicity boundary of VAs is always bit 56? Because that's not my reading.
For PAs, on the other hand, the PA width is not an architectural constant, it's an implementation choice.
ACTs expect the DUT config YAML to specify how wide the PA is, and any non-zero bits above that are expected to trap.
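
A minimal sketch of that expectation, with `pa_bits` standing in for whatever width the DUT configuration declares (the parameter name is hypothetical, not the actual YAML key):

```python
def pa_in_range(pa, pa_bits):
    """True if no bits at or above position pa_bits are set in pa."""
    return (pa >> pa_bits) == 0
```

With a 32-bit PA width, 0x80000000 is in range, while any address with bit 32 or higher set would be expected to trap.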

@Bill.McSpadden: this is another thing that Sail needs to be aware of (it's in my list of unnamed architectural options).

On Thu, Aug 11, 2022 at 8:30 PM Ke Chai <ck.rickytino@...> wrote:
Thanks again to everyone! These answers make sense to me.

Re: Are pages allowed to cross PMA regions?

Greg Favor

It would be nice if it was architecturally defined/permitted for such straddling accesses to be performed a byte at a time.

That could be ok for accesses to idempotent memory, but would likely be problematic for a non-idempotent location (e.g. a memory-mapped I/O register), and byte accesses to a word MMIO register might not even be allowed by the PMAs for that location.

Re: Are pages allowed to cross PMA regions?

Andy Glew (Gmail) <andyglew@...>

| But if that access straddles multiple pages and/or PMA regions, then each byte of the access must pass its MMU
| and PMA checks for the whole access to be allowed.

It would be nice if it was architecturally defined/permitted for such straddling accesses to be performed a byte at a time. That makes the trap and emulate handler easier to code.

If not a byte at a time, then whatever is the largest possible NAPOT size that the access can be decomposed into.

But anything coarser grained than a byte, or whatever the finest granule of PMA is, either requires the trap and emulate handler to probe permissions to guarantee that the transactions it emits are not themselves straddling, or you have to be ready to handle nested such trap and emulations, or at least tail-recursive ones.
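
One possible reading of the NAPOT idea, sketched in Python: greedily break the access into naturally aligned power-of-two pieces, so that no piece can straddle a boundary aligned to at least its own size. The function name, the greedy strategy, and the 8-byte cap are assumptions for illustration only.

```python
def napot_pieces(addr, size, max_chunk=8):
    """Decompose [addr, addr+size) into naturally aligned power-of-two pieces.

    Each piece is the largest power of two that divides the current
    address, fits in the remaining length, and does not exceed max_chunk.
    """
    pieces = []
    while size > 0:
        align = (addr & -addr) if addr else max_chunk   # largest pow2 dividing addr
        fit = 1 << (size.bit_length() - 1)              # largest pow2 <= remaining size
        chunk = min(align, fit, max_chunk)
        pieces.append((addr, chunk))
        addr += chunk
        size -= chunk
    return pieces
```

An 8-byte access at 0x1005 comes out as pieces of 1, 2, 4 and 1 bytes at 0x1005, 0x1006, 0x1008 and 0x100C. A handler emitting these would still need to probe permissions if the finest PMA granule could be smaller than a piece.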


------ Original Message ------
From "Krste Asanovic" <krste@...>
To "Greg Favor" <gfavor@...>
Date 8/12/2022 12:57:55
Subject Re: [RISC-V] [tech-privileged] Are pages allowed to cross PMA regions?

On Fri, 12 Aug 2022 10:35:15 -0700, "Greg Favor" <gfavor@...> said:

| Can a virtual page be mapped to addresses that cross PMA regions? For example, is it acceptable to map a 1GB page such that half its physical addresses have the (e.g.) cacheable
| attribute but the other half of physical addresses are uncacheable? You could think about this with every attribute: vacant, idempotent, etc.
| This sounds odd, but the ISA does not explicitly allow or forbid it. Is it something that must be supported? If so, are there example use-cases?

| The PMA architecture allows a lot of implementation flexibility - including for example having small 4B regions.  In that example one could easily have one 4KB page overlap multiple
| PMA regions.

| Conversely, in a typical OS-A class system using demand-paged virtual memory, the implementor will probably choose to have a minimum 4KB granularity to PMA regions.  Although this
| still allows 2MB, 1GB, and 512GB pages to overlap multiple PMA regions.  (Which in typical TLB implementations leads to what some would call "atomization" of page mappings into
| smaller TLB entry mappings.)

Even in a RISC-V OS-A platform, the implementor might be stuck with
using IP peripherals where PMAs vary at the sub-page granularity.

| In short, if a page overlaps multiple regions, then that needs to be handled properly.  Typically any given load/store/ifetch/implicit access that is being checked will fall in one
| page and in one PMA region - in which case the behavior is obvious.  But if that access straddles multiple pages and/or PMA regions, then each byte of the access must pass its MMU
| and PMA checks for the whole access to be allowed.

Yes.

We have some text for this in some places, but these concepts should
really be factored out somewhere central.

Krste


Re: Are pages allowed to cross PMA regions?

Krste Asanovic

On Fri, 12 Aug 2022 10:35:15 -0700, "Greg Favor" <gfavor@...> said:
| Can a virtual page be mapped to addresses that cross PMA regions? For example, is it acceptable to map a 1GB page such that half its physical addresses have the (e.g.) cacheable
| attribute but the other half of physical addresses are uncacheable? You could think about this with every attribute: vacant, idempotent, etc.

| This sounds odd, but the ISA does not explicitly allow or forbid it. Is it something that must be supported? If so, are there example use-cases?

| The PMA architecture allows a lot of implementation flexibility - including for example having small 4B regions.  In that example one could easily have one 4KB page overlap multiple
| PMA regions.

| Conversely, in a typical OS-A class system using demand-paged virtual memory, the implementor will probably choose to have a minimum 4KB granularity to PMA regions.  Although this
| still allows 2MB, 1GB, and 512GB pages to overlap multiple PMA regions.  (Which in typical TLB implementations leads to what some would call "atomization" of page mappings into
| smaller TLB entry mappings.)

Even in a RISC-V OS-A platform, the implementor might be stuck with
using IP peripherals where PMAs vary at the sub-page granularity.

| In short, if a page overlaps multiple regions, then that needs to be handled properly.  Typically any given load/store/ifetch/implicit access that is being checked will fall in one
| page and in one PMA region - in which case the behavior is obvious.  But if that access straddles multiple pages and/or PMA regions, then each byte of the access must pass its MMU
| and PMA checks for the whole access to be allowed.

Yes.

We have some text for this in some places, but these concepts should
really be factored out somewhere central.

Krste


Re: Are pages allowed to cross PMA regions?

Greg Favor

Can a virtual page be mapped to addresses that cross PMA regions? For example, is it acceptable to map a 1GB page such that half its physical addresses have the (e.g.) cacheable attribute but the other half of physical addresses are uncacheable? You could think about this with every attribute: vacant, idempotent, etc.

This sounds odd, but the ISA does not explicitly allow or forbid it. Is it something that must be supported? If so, are there example use-cases?

The PMA architecture allows a lot of implementation flexibility - including for example having small 4B regions.  In that example one could easily have one 4KB page overlap multiple PMA regions.

Conversely, in a typical OS-A class system using demand-paged virtual memory, the implementor will probably choose to have a minimum 4KB granularity to PMA regions.  Although this still allows 2MB, 1GB, and 512GB pages to overlap multiple PMA regions.  (Which in typical TLB implementations leads to what some would call "atomization" of page mappings into smaller TLB entry mappings.)
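
To make the "atomization" idea concrete, here is a small sketch that cuts one large page into sub-ranges with uniform attributes wherever it overlaps a PMA region. The function name, the (lo, hi, attr) region tuples, and the example addresses are all made up for illustration; a real TLB would further break these ranges into its supported entry sizes.

```python
def atomize(page_base, page_size, pma_regions):
    """Return the parts of a page that fall in each PMA region.

    pma_regions is a list of (lo, hi, attr) half-open ranges
    (hypothetical model). Each result is (start, end, attr).
    """
    page_end = page_base + page_size
    parts = []
    for lo, hi, attr in pma_regions:
        start, end = max(lo, page_base), min(hi, page_end)
        if start < end:                       # region overlaps the page
            parts.append((start, end, attr))
    return parts


# A 2MB page whose lower half is cacheable and upper half is not.
pma = [(0x80000000, 0x80100000, "cacheable"),
       (0x80100000, 0x80200000, "uncacheable")]
parts = atomize(0x80000000, 0x200000, pma)
```

Here the single 2MB mapping ends up as two 1MB ranges, each with one attribute.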

In short, if a page overlaps multiple regions, then that needs to be handled properly.  Typically any given load/store/ifetch/implicit access that is being checked will fall in one page and in one PMA region - in which case the behavior is obvious.  But if that access straddles multiple pages and/or PMA regions, then each byte of the access must pass its MMU and PMA checks for the whole access to be allowed.
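
The per-byte rule can be written down as a short sketch. The `mmu_ok` and `pma_ok` callbacks are placeholders for whatever checks an implementation performs, and for simplicity this ignores the distinction between virtual and physical addresses.

```python
def access_allowed(addr, size, mmu_ok, pma_ok):
    """Allow the access only if every byte passes both checks.

    mmu_ok and pma_ok are hypothetical per-byte predicates taking an address.
    """
    return all(mmu_ok(a) and pma_ok(a) for a in range(addr, addr + size))


# Toy setup: the MMU permits everything below 0x2000, the PMAs permit everything.
mmu_ok = lambda a: a < 0x2000
pma_ok = lambda a: True
inside = access_allowed(0x1FFC, 4, mmu_ok, pma_ok)      # all four bytes permitted
straddle = access_allowed(0x1FFE, 4, mmu_ok, pma_ok)    # last two bytes not permitted
```

The straddling access is rejected as a whole even though its first two bytes would pass on their own.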
