
Re: rv57k virtual address space

Phil McCoy
 

This probably should be discussed on the tech-virt-mem list.

Cheers,
Phil


Re: rv57k virtual address space

swallach
 

Not really.
If necessary, we can define two 64-bit registers.

I wanted to simply define what machine state is needed.

That is a good catch.
Thank you.


On Jan 29, 2021, at 10:51 AM, Jonathan Behrens <behrensj@...> wrote:


Your SATPU and SATPK registers seem to each contain: 64-bits of PPN, 32-bits of ASID, 4-bits for MODE and 28-bits reserved. But that adds up to 128-bits which is double the size of CSRs on 64-bit RISC-V processors. Are you imagining that RV57K would require a 128-bit processor, or am I misunderstanding something?

Jonathan

On Fri, Jan 29, 2021 at 10:31 AM swallach via lists.riscv.org <steven.wallach=bsc.es@...> wrote:
Attached is a proposal for the definition of RV57K. RV57K is an extension to RV57. The main extension is to incorporate two HARDWARE registers. These registers are used to partition user and kernel address spaces. Additionally, the high-order address bit signifies user/kernel space. This definition was discussed at length in the sigHPC group. Applications that are hosted on this class of systems, especially going forward, expect a secure environment with petabytes of data. Of course, there are 10,000 (or more) compute nodes. The basis for the proposed RV57K is the KASLR effort; thus there are Linux ports that adhere to this definition. Most of the proposed feature set is already available in ARM-based implementations. RV57K DOES NOT necessarily replace RV57.

As noted, this is just the beginning. Once approved, we will move to defining RV64. Currently, to the best of my knowledge, there is no definition of an RV64-like addressing structure, at least for the IA-64. Within the sigHPC group we are anticipating the requirements of EXASCALE COMPUTING. One of the most obvious requirements will be distributed main memory at exascale.





Re: rv57k virtual address space

Jonathan Behrens <behrensj@...>
 

Your SATPU and SATPK registers seem to each contain: 64-bits of PPN, 32-bits of ASID, 4-bits for MODE and 28-bits reserved. But that adds up to 128-bits which is double the size of CSRs on 64-bit RISC-V processors. Are you imagining that RV57K would require a 128-bit processor, or am I misunderstanding something?

Jonathan
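
The arithmetic behind this question can be checked with a minimal sketch. The field widths are the ones quoted in the message; the dictionary name and the `XLEN` constant are illustrative assumptions, not names from the proposal.

```python
# Field widths (in bits) as described above for each of SATPU and SATPK.
# The dictionary name is illustrative, not taken from the proposal.
SATP_FIELDS = {"PPN": 64, "ASID": 32, "MODE": 4, "reserved": 28}

# CSR width on a 64-bit RISC-V hart.
XLEN = 64

total_bits = sum(SATP_FIELDS.values())
csrs_needed = -(-total_bits // XLEN)  # ceiling division

print(f"total = {total_bits} bits, needs {csrs_needed} x {XLEN}-bit CSRs")
```

Under these widths each register would have to be split across two 64-bit CSRs, which is the option raised in the reply.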




rv57k virtual address space

swallach
 

Attached is a proposal for the definition of RV57K. RV57K is an extension to RV57. The main extension is to incorporate two HARDWARE registers. These registers are used to partition user and kernel address spaces. Additionally, the high-order address bit signifies user/kernel space. This definition was discussed at length in the sigHPC group. Applications that are hosted on this class of systems, especially going forward, expect a secure environment with petabytes of data. Of course, there are 10,000 (or more) compute nodes. The basis for the proposed RV57K is the KASLR effort; thus there are Linux ports that adhere to this definition. Most of the proposed feature set is already available in ARM-based implementations. RV57K DOES NOT necessarily replace RV57.

As noted, this is just the beginning. Once approved, we will move to defining RV64. Currently, to the best of my knowledge, there is no definition of an RV64-like addressing structure, at least for the IA-64. Within the sigHPC group we are anticipating the requirements of EXASCALE COMPUTING. One of the most obvious requirements will be distributed main memory at exascale.


Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Allen Baum
 

Exactly. I was just pointing out that this is a little different from the usual handler but still needs to do the same saving; it just needs a bit of non-normative text.
And you need to save scratch as well.

The watchdog timers I'm familiar with have code that explicitly resets the watchdog so it never expires - unless something bad happens.
That's probably a different use case, so either way is valid.

On Thu, Jan 28, 2021 at 5:46 PM Paul Donahue <pdonahue@...> wrote:
This is no more a problem for RNMI than anything else.  The S mode handler must save (push) scause/sepc before doing anything that could cause an exception.  Then it can handle nested exceptions.  Same for M mode.  The problem this proposal is solving is that NMI can happen before M mode has a chance to do this, so this proposal has a special cause/epc for that purpose.  The first thing the NMI handler must do is save mcause/mepc before doing anything that could cause an exception (and NMI cannot happen before this is complete because NMIs are disabled in this handler).


Thanks,

-Paul



On Thu, Jan 28, 2021 at 4:39 PM Allen Baum <allen.baum@...> wrote:
My concern is that if an RNMI occurs during an exception handler, and the RNMI handler encounters an exception, the exception return point, cause, and previous privilege state are lost; that's fatal.
It's also no different from an exception handler encountering an exception, but it should be noted as something to take into account, because an RNMI taken during an interrupt handler doesn't have that problem.

On Thu, Jan 28, 2021 at 4:14 PM Paul Donahue <pdonahue@...> wrote:
Thank you for pointing out that there are two RNMI vectors: one for NMI and one for exceptions.  As you surmised, I missed that but I think that addresses my concern.


-Paul

On Thu, Jan 28, 2021 at 2:34 PM John Hauser <jh.riscv@...> wrote:
1.
I think some of the questions and discussion concerning this proposal
have arisen because the proposal is a little too terse.  It would
be helpful to have more about the specific use cases that the RNMI
hardware is intended to support, and not support, and why, including
about nested NMIs (whether supported or not).  There seems to be an
assumption that the author and readers are already in agreement about
these matters, and that's much too optimistic, I think.

Some of the feedback has concerned the intersection of NMIs with
synchronous exceptions, such as what happens if a synchronous exception
occurs while in an NMI handler.  The proposal has this sentence:

    RNMI also has an associated exception trap handler address, which
    is implementation defined.

and then:

    If the hart encounters an exception while the `rnmie` bit is
    clear, the exception state is written to `mepc` and `mcause`,
    `mstatus.mpp` is set to M-mode, and the hart jumps to the RNMI
    exception handler address.

But that all went by too quickly for some reviewers, it seems.  (It may
also not be adequate in conjunction with nested NMIs.)

Not every reader will immediately intuit the clear distinction you
intend between _interrupt_ and _exception_, especially as even the
RISC-V Privileged Architecture sometimes muddies the two terms.
For example, note that an *interrupt trap* saves the interrupted
instruction address to one of the *Exception* Program Counter CSRs
such as mepc or mnepc, implying that the term _exception_ includes
interrupts.

2.
I urge that we not commit to CSR addresses 0x350-0x353 for the RNMI
CSRs.

As part of the planning for a possible future extension to improve
support for nested hypervisors, a pattern has been tentatively
established for M, S, U, VS, and V2S variants of CSRs, such as mstatus,
sstatus, ustatus, vsstatus, and a future v2sstatus.  The M-level
CSR addresses in the range 0x340-0x39F would be among those that can
support the full set of M, S, U, VS, and V2S variants of the same CSR.
But RNMI would never use the VS and V2S slots, and probably not the
U slot either.  It would be better to relocate the RNMI CSRs to other,
more constrained addresses, and leave the more flexible locations open
for CSRs that may need them.

To best manage the remaining CSR address space, I propose that the main
responsibility for assigning CSR addresses be given to the new Opcode
and Consistency Review group.

3.
Greg Favor wrote:
> - In mnstatus, shouldn't there also be a bit like the mstatus.MPV bit
>   (for when the H extension is implemented and enabled)?

Krste:
> I'll let hypervisor authors address this.

Greg is correct.  When the hypervisor extension is implemented, the
existing 2-bit mstatus.MPP is extended with another bit, mstatus.MPV.
The saved operating mode is fully represented by 3 bits, MPV and MPP
together.

I would suggest allocating mnstatus bit 13 for the MPV-like bit.

(I'd expect these two fields in mnstatus to be named either MPV and
MPP, or MNPV and MNPP.)

In anticipation of a possible future extension for nested hypervisors,
you should also keep bit 14 free for another bit (MPV2).

4.
Krste wrote:
> rnmie should be settable but not clearable in M-mode to support
> nesting NMIs.

I think, when you lay out exactly, step by step, how nested NMIs would
work, you'll discover that's inadequate.  (Or, if I'm wrong, then you
have different plans for how it will work than I can guess, so, again,
it would be good to have this in writing as part of the proposal.)

    - John Hauser
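
Point 3 above can be made concrete with a small decoding sketch. The layout is an assumption for illustration only: the MPP field is assumed to sit at bits 12:11 (mirroring mstatus.MPP), bit 13 is the suggested MPV-like bit, and bit 14 is kept free for a possible MPV2. The function and constant names are hypothetical.

```python
# Hypothetical mnstatus layout, for illustration only:
#   bits 12:11  MPP-like field (assumed to mirror mstatus.MPP)
#   bit  13     MPV-like bit, as suggested in the message above
#   bit  14     kept free for a possible future MPV2
MPP_SHIFT, MPP_MASK = 11, 0b11
MPV_BIT = 13
MPV2_BIT = 14

PRIV_NAMES = {0b00: "U", 0b01: "S", 0b11: "M"}


def decode_saved_mode(mnstatus: int) -> str:
    """Return the saved operating mode encoded by MPV and MPP together."""
    mpp = (mnstatus >> MPP_SHIFT) & MPP_MASK
    mpv = (mnstatus >> MPV_BIT) & 1
    if (mnstatus >> MPV2_BIT) & 1:
        raise ValueError("bit 14 is reserved in this sketch")
    if mpp not in PRIV_NAMES:
        raise ValueError("reserved MPP encoding")
    if mpv and mpp == 0b11:
        raise ValueError("M-mode is never virtualized")
    name = PRIV_NAMES[mpp]
    # With the hypervisor extension, V=1 turns U/S into VU/VS.
    return "V" + name if mpv else name
```

For example, `decode_saved_mode((1 << 13) | (0b01 << 11))` yields `"VS"`, showing why two MPP bits alone cannot represent the saved mode once the hypervisor extension is present.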






Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Paul Donahue
 

This is no more a problem for RNMI than anything else.  The S mode handler must save (push) scause/sepc before doing anything that could cause an exception.  Then it can handle nested exceptions.  Same for M mode.  The problem this proposal is solving is that NMI can happen before M mode has a chance to do this, so this proposal has a special cause/epc for that purpose.  The first thing the NMI handler must do is save mcause/mepc before doing anything that could cause an exception (and NMI cannot happen before this is complete because NMIs are disabled in this handler).


Thanks,

-Paul
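
The save-before-anything-else rule described above can be illustrated with a toy model. This is a sketch under stated assumptions, not a model of the RNMI hardware: the `Hart` class, its software save stack, and the cause strings are all hypothetical, and it models only a single cause/epc pair that every trap overwrites.

```python
class Hart:
    """Toy hart with one cause/epc pair that every trap overwrites."""

    def __init__(self):
        self.mcause = None
        self.mepc = None
        self.stack = []  # software-managed save area (hypothetical)

    def take_trap(self, cause, pc):
        # Hardware overwrites the CSRs; old values are gone unless saved.
        self.mcause, self.mepc = cause, pc

    def save_state(self):
        self.stack.append((self.mcause, self.mepc))

    def restore_state(self):
        self.mcause, self.mepc = self.stack.pop()


def run(save_first: bool):
    """Take a trap, then a nested exception inside its handler."""
    hart = Hart()
    hart.take_trap("illegal-instruction", 0x1000)  # outer trap
    if save_first:
        hart.save_state()  # first thing the handler does
    hart.take_trap("load-fault", 0x2000)  # nested exception in the handler
    if save_first:
        hart.restore_state()  # nested handler has returned
    return hart.mcause, hart.mepc
```

With `run(True)` the outer trap's cause and return address survive the nested exception; with `run(False)` they are overwritten and the original return point is lost.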









Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Jonathan Behrens <behrensj@...>
 

I suppose it depends on the exact design of the watchdog timer. In the scheme I'm familiar with, an NMI arrives every few seconds regardless of the state of the system. Then the interrupt handler is responsible for checking whether the system is alive (which it is 99.999% of the time) and only becomes fatal if not.
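
A minimal sketch of this scheme, assuming a heartbeat counter that ordinary code increments and that the periodic NMI handler compares against the value it saw last time. The `Watchdog` class and its method names are hypothetical.

```python
class Watchdog:
    """Periodic-NMI liveness check: fatal only if no progress was made."""

    def __init__(self):
        self.heartbeat = 0   # incremented by ordinary kernel activity
        self.last_seen = -1  # heartbeat value observed at the previous NMI

    def tick(self):
        """Called from normal code paths to signal forward progress."""
        self.heartbeat += 1

    def on_nmi(self) -> bool:
        """Return True if the system looks alive, False if it appears hung."""
        alive = self.heartbeat != self.last_seen
        self.last_seen = self.heartbeat
        return alive
```

In this model the NMI itself is routine; only a `False` result from `on_nmi()` would lead to the fatal path (for example, saving diagnostics and rebooting).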


On Thu, Jan 28, 2021 at 8:31 PM Allen Baum <allen.baum@...> wrote:
If you don't (or can't) resume the interrupted process, it's fatal. No forward progress is fatal, and I would classify watchdog timer interrupts as fatal.
The performance monitoring example is not.

On Thu, Jan 28, 2021 at 5:06 PM Jonathan Behrens <behrensj@...> wrote:


On Thu, Jan 28, 2021 at 7:05 PM Richard Trauben via lists.riscv.org <rtrauben=gmail.com@...> wrote:
> 1) Is anyone using non maskable interrupt for anything other than fatal conditions?
>
> a) if so, why not just increase the range of interrupt priority levels, relocate the maskable interrupt priority handlers to the lower portion, convert non fatal NMI sources into maskable interrupts in the upper portion of the range?

One case where NMIs are used on x86 for non-fatal conditions is for performance monitoring. If an operating system wants to accurately measure its own performance, then ideally it should be able to sample the current instruction pointer even during critical sections where interrupts are disabled.

A second case where this comes up is for watchdog timers. If an operating system wants to detect if it ever becomes deadlocked with interrupts disabled, it can schedule an NMI every few seconds and verify if there has been any forward progress since the last one. In this case, the NMI itself isn't fatal even though failing one of these liveness checks will likely trigger a reboot (but hopefully after saving some diagnostic information to find out what went wrong).


Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Allen Baum
 

If you don't (or can't) resume the interrupted process, it's fatal. No forward progress is fatal, and I would classify watchdog timer interrupts as fatal.
The performance monitoring example is not.



Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Jonathan Behrens <behrensj@...>
 



On Thu, Jan 28, 2021 at 7:05 PM Richard Trauben via lists.riscv.org <rtrauben=gmail.com@...> wrote:
> 1) Is anyone using non maskable interrupt for anything other than fatal conditions?
>
> a) if so, why not just increase the range of interrupt priority levels, relocate the maskable interrupt priority handlers to the lower portion, convert non fatal NMI sources into maskable interrupts in the upper portion of the range?

One case where NMIs are used on x86 for non-fatal conditions is for performance monitoring. If an operating system wants to accurately measure its own performance, then ideally it should be able to sample the current instruction pointer even during critical sections where interrupts are disabled.

A second case where this comes up is for watchdog timers. If an operating system wants to detect if it ever becomes deadlocked with interrupts disabled, it can schedule an NMI every few seconds and verify if there has been any forward progress since the last one. In this case, the NMI itself isn't fatal even though failing one of these liveness checks will likely trigger a reboot (but hopefully after saving some diagnostic information to find out what went wrong).


Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Allen Baum
 

My concern is that if an RNMI occurs during an exception handler, and the RNMI handler encounters an exception, the exception return point, cause, and previous privilege state are lost; that's fatal.
It's also no different from an exception handler encountering an exception, but it should be noted as something to take into account, because an RNMI taken during an interrupt handler doesn't have that problem.








Re: Resumable NMI proposal

Paul Donahue
 

Thank you for pointing out that there are two RNMI vectors: one for NMI and one for exceptions.  As you surmised, I missed that but I think that addresses my concern.


-Paul







Re: Resumable NMI proposal

Richard Trauben
 

1) Is anyone using non-maskable interrupts for anything other than fatal conditions?

a) If so, why not just increase the range of interrupt priority levels, relocate the maskable interrupt priority handlers to the lower portion, and convert non-fatal NMI sources into maskable interrupts in the upper portion of the range?

2) If handling fatal interrupts is rare (it had better be), the probability of two arriving back to back in close proximity, or of actually having a hope of even fielding the subsequent fatal interrupt, is small.

You could reserve, say, the top 8 priority fatal sources to ignore subsequent instances.

3) If worst comes to worst, you might partition the M-mode privilege ring into two rings:
a) an administrative M1 (aka red mode) for handling fatal NMI IRQ ICU events gracefully, and
b) a business-as-usual M2 privilege ring for typical usage.

4) If you are worried about losing NMIs, perhaps a sorted elastic queue of pending NMIs might be an option.

At some point recovery is no longer an option.
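
The idea in 4) can be sketched in software as a priority queue that grows as needed. This is only an illustration: in practice such a queue would live in hardware or firmware, and the class name, method names, and the convention that a lower number means higher priority are all assumptions.

```python
import heapq
import itertools


class PendingNMIQueue:
    """Pending NMIs ordered by priority (lower number = more urgent)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so equal-priority entries come out in arrival order.
        self._seq = itertools.count()

    def post(self, priority: int, source: str) -> None:
        """Record a pending NMI instead of dropping it."""
        heapq.heappush(self._heap, (priority, next(self._seq), source))

    def next_pending(self):
        """Pop the most urgent pending NMI, or return None if none remain."""
        if not self._heap:
            return None
        priority, _, source = heapq.heappop(self._heap)
        return priority, source
```

A handler would drain the queue with `next_pending()` until it returns `None`, so an NMI that arrives while another is being serviced is deferred, not lost.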

On Jan 28, 2021, at 2:34 PM, John Hauser <jh.riscv@...> wrote:

1.
I think some of the questions and discussion concerning this proposal
have arisen because the proposal is a little too terse. It would
be helpful to have more about the specific use cases that the RNMI
hardware is intended to support, and not support, and why, including
about nested NMIs (whether supported or not). There seems to be an
assumption that the author and readers are already in agreement about
these matters, and that's much too optimistic, I think.

Some of the feedback has concerned the intersection of NMIs with
synchronous exceptions, such as what happens if a synchronous exception
occurs while in an NMI handler. The proposal has this sentence:

RNMI also has an associated exception trap handler address, which
is implementation defined.

and then:

If the hart encounters an exception while the `rnmie` bit is
clear, the exception state is written to `mepc` and `mcause`,
`mstatus.mpp` is set to M-mode, and the hart jumps to the RNMI
exception handler address.

But that all went by too quickly for some reviewers, it seems. (It may
also not be adequate in conjuction with nested NMIs.)

Not every reader will immediately intuit the clear distinction you
intend between _interrupt_ and _exception_, especially as even the
RISC-V Privileged Architecture sometimes muddies the two terms.
For example, note that an *interrupt trap* saves the interrupted
instruction address to one of the *Exception* Program Counter CSRs
such as mepc or mnepc, implying that the term _exception_ includes
interrupts.

2.
I urge that we not commit to CSR addresses 0x350-0x353 for the RNMI
CSRs.

As part of the planning for a possible future extension to improve
support for nested hypervisors, a pattern has been tentatively
established for M, S, U, VS, and V2S variants of CSRs, such as mstatus,
sstatus, ustatus, vsstatus, and a future v2sstatus. The M-level
CSR addresses in the range 0x340-0x39F would be among those that can
support the full set of M, S, U, VS, and V2S variants of the same CSR.
But RNMI would never use the VS and V2S slots, and probably not the
U slot either. It would be better to relocate the RNMI CSRs to other,
more constrained addresses, and leave the more flexible locations open
for CSRs that may need them.

To best manage the remaining CSR address space, I propose that the main
responsibility for assigning CSR addresses be given to the new Opcode
and Consistency Review group.

3.
Greg Favor wrote:

- In mnstatus, shouldn't there also be a bit like the mstatus.MPV bit
(for when the H extension is implemented and enabled)?

Krste:

I'll let hypervisor authors address this.

Greg is correct. When the hypervisor extension is implemented, the
existing 2-bit mstatus.MPP is extended with another bit, mstatus.MPV.
The saved operating mode is fully represented by 3 bits, MPV and MPP
together.

I would suggest allocating mnstatus bit 13 for the MPV-like bit.

(I'd expect these two fields in mnstatus to be named either MPV and
MPP, or MNPV and MNPP.)

In anticipation of a possible future extension for nested hypervisors,
you should also keep bit 14 free for another bit (MPV2).

4.
Krste wrote:

rnmie should be settable but not clearable in M-mode to support
nesting NMIs.

I think, when you lay out exactly, step by step, how nested NMIs would
work, you'll discover that's inadequate. (Or, if I'm wrong, then you
have different plans for how it will work than I can guess, so, again,
it would be good to have this in writing as part of the proposal.)

- John Hauser





Re: Resumable NMI proposal

John Hauser
 

1.
I think some of the questions and discussion concerning this proposal
have arisen because the proposal is a little too terse. It would
be helpful to have more about the specific use cases that the RNMI
hardware is intended to support, and not support, and why, including
about nested NMIs (whether supported or not). There seems to be an
assumption that the author and readers are already in agreement about
these matters, and that's much too optimistic, I think.

Some of the feedback has concerned the intersection of NMIs with
synchronous exceptions, such as what happens if a synchronous exception
occurs while in an NMI handler. The proposal has this sentence:

RNMI also has an associated exception trap handler address, which
is implementation defined.

and then:

If the hart encounters an exception while the `rnmie` bit is
clear, the exception state is written to `mepc` and `mcause`,
`mstatus.mpp` is set to M-mode, and the hart jumps to the RNMI
exception handler address.

But that all went by too quickly for some reviewers, it seems. (It may
also not be adequate in conjunction with nested NMIs.)

Not every reader will immediately intuit the clear distinction you
intend between _interrupt_ and _exception_, especially as even the
RISC-V Privileged Architecture sometimes muddies the two terms.
For example, note that an *interrupt trap* saves the interrupted
instruction address to one of the *Exception* Program Counter CSRs
such as mepc or mnepc, implying that the term _exception_ includes
interrupts.

2.
I urge that we not commit to CSR addresses 0x350-0x353 for the RNMI
CSRs.

As part of the planning for a possible future extension to improve
support for nested hypervisors, a pattern has been tentatively
established for M, S, U, VS, and V2S variants of CSRs, such as mstatus,
sstatus, ustatus, vsstatus, and a future v2sstatus. The M-level
CSR addresses in the range 0x340-0x39F would be among those that can
support the full set of M, S, U, VS, and V2S variants of the same CSR.
But RNMI would never use the VS and V2S slots, and probably not the
U slot either. It would be better to relocate the RNMI CSRs to other,
more constrained addresses, and leave the more flexible locations open
for CSRs that may need them.
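
As a rough illustration of why the 0x340-0x39F range is "flexible", the sketch below splits a CSR address using the privileged spec's convention that bits 11:10 give the access type and bits 9:8 give the lowest privilege level allowed to access the CSR. The helper names `csr_fields` and `siblings` are made up for this example, and the V2S slot is not modeled because its encoding is not given in this thread.

```python
PRIV_CODES = (("U", 0b00), ("S", 0b01), ("H/VS", 0b10), ("M", 0b11))


def csr_fields(addr: int) -> dict:
    """Split a 12-bit CSR address into its conventional fields."""
    if not 0 <= addr < 0x1000:
        raise ValueError("CSR addresses are 12 bits")
    access = (addr >> 10) & 0b11  # 0b11 marks a read-only CSR
    priv = (addr >> 8) & 0b11     # lowest privilege level that may access it
    names = {code: name for name, code in PRIV_CODES}
    return {"read_only": access == 0b11, "priv": names[priv], "low_byte": addr & 0xFF}


def siblings(addr: int) -> dict:
    """Addresses that differ from addr only in the privilege field (bits 9:8)."""
    base = addr & ~(0b11 << 8)
    return {name: base | (code << 8) for name, code in PRIV_CODES}


# mstatus (0x300) has same-numbered siblings at 0x100 (sstatus) and 0x200 (vsstatus).
print({k: hex(v) for k, v in siblings(0x300).items()})
# The proposed mnscratch address 0x350 would likewise reserve 0x050, 0x150 and 0x250.
print({k: hex(v) for k, v in siblings(0x350).items()})
```

Under this reading, placing an M-only CSR at 0x350 leaves its U, S, and VS sibling slots unused.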

To best manage the remaining CSR address space, I propose that the main
responsibility for assigning CSR addresses be given to the new Opcode
and Consistency Review group.

3.
Greg Favor wrote:

- In mnstatus, shouldn't there also be a bit like the mstatus.MPV bit
(for when the H extension is implemented and enabled)?

Krste:

I'll let hypervisor authors address this.

Greg is correct. When the hypervisor extension is implemented, the
existing 2-bit mstatus.MPP is extended with another bit, mstatus.MPV.
The saved operating mode is fully represented by 3 bits, MPV and MPP
together.

I would suggest allocating mnstatus bit 13 for the MPV-like bit.

(I'd expect these two fields in mnstatus to be named either MPV and
MPP, or MNPV and MNPP.)

In anticipation of a possible future extension for nested hypervisors,
you should also keep bit 14 free for another bit (MPV2).
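
To make the three-bit saved mode concrete, here is a minimal sketch of how software might decode it. The position of the two-bit field at mnstatus[12:11] comes from the proposal; bit 13 for the MPV-like bit is only the suggestion made above, not an adopted encoding; and the names `MNPP_SHIFT`, `MNPV_BIT`, and `saved_mode` are hypothetical.

```python
# Hypothetical layout: two-bit previous-privilege field in mnstatus[12:11]
# (from the proposal) and an MPV-like bit in mnstatus[13] (suggested above).
MNPP_SHIFT = 11
MNPP_MASK = 0b11
MNPV_BIT = 13

# Privilege encodings used by mstatus.MPP; 0b10 is reserved.
PRIV_NAMES = {0b00: "U", 0b01: "S", 0b11: "M"}


def saved_mode(mnstatus: int) -> str:
    """Name the interrupted operating mode from the MPV-like bit and MPP-like field."""
    mpp = (mnstatus >> MNPP_SHIFT) & MNPP_MASK
    mpv = (mnstatus >> MNPV_BIT) & 1
    if mpp not in PRIV_NAMES:
        raise ValueError("reserved privilege encoding 0b10")
    name = PRIV_NAMES[mpp]
    # Virtualization only applies below M-mode: V=1 with S or U means VS or VU.
    if mpv and name != "M":
        return "V" + name
    return name


print(saved_mode(0b11 << MNPP_SHIFT))                       # M
print(saved_mode((1 << MNPV_BIT) | (0b01 << MNPP_SHIFT)))   # VS
```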

4.
Krste wrote:

rnmie should be settable but not clearable in M-mode to support
nesting NMIs.

I think, when you lay out exactly, step by step, how nested NMIs would
work, you'll discover that's inadequate. (Or, if I'm wrong, then you
have different plans for how it will work than I can guess, so, again,
it would be good to have this in writing as part of the proposal.)

- John Hauser


Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Jonathan Behrens <behrensj@...>
 

Shouldn't exceptions during an NMI just jump to the normal exception handler? That seems like the simplest solution, and would be the behaviour if this extension didn't specify something different.

Jonathan


On Fri, Jan 22, 2021 at 11:50 AM Paul Donahue via lists.riscv.org <pdonahue=ventanamicro.com@...> wrote:
The problem is that, although RNMI has a separate vector, the RNMI vector can handle both NMI and normal exceptions but the cause information is only updated on NMI.

Here's an illustration of my concern:
  1. NMI happens
  2. hardware updates mncause and mnepc and goes to RNMI handler
  3. RNMI handler reads mncause to see what needs to be handled
  4. RNMI handler starts to handle the NMI.  So far so good.
  5. RNMI handler gets an exception
  6. hardware updates mcause and mepc (as explicitly contemplated in the proposal) and goes to RNMI handler
  7. The RNMI code in step 3 reads mncause which will still indicate NMI as the cause.  The handler needs to be somehow told to read mcause.  How does it get that indication?


Thanks,

-Paul


On Fri, Jan 22, 2021 at 6:19 AM <krste@...> wrote:

RNMI has separate exception vector from regular exceptions.

Krste

>>>>> On Tue, 19 Jan 2021 17:51:37 -0800, Allen Baum <allen.baum@...> said:

| Oops - I reread that - Paul is correct; nmcause is not modified. The handler has no guaranteed way of handling an exception.
| It can leave breadcrumbs, but can't guarantee they will be correct, depending on when an exception occurs.

| On Tue, Jan 19, 2021 at 5:46 PM Allen Baum <allen.baum@...> wrote:

|     I am assuming here that NMI is an interrupt, not an exception. Both trap, but an mmie-bit only affects interrupts, not exceptions.
|     So, on entry, the handler knows if this was caused by an NMI or by an exception from the rnmi handler itself by looking at the MSB of nmcause.

|     I am also assuming that:
|     RNMI interrupts are not delegated
|     Accesses made in the RNMI handler are made just as if the hart were in Mmode, so it has to save and restore MPRV if it wants to use it.

|     I'm not exactly sure why MNRET is needed at all - it can use the hidden MMIE bit to select returning to mepc or nmepc, and restoring the priv level
|     from nmstatus.pp rather than mstatus.pp, and inhibit popping the pp and ie stacks.
|     And what happens if you execute mret or sret in the rnmi handler (when mmie=0)? That doesn't sound good.

|     On Tue, Jan 19, 2021 at 2:52 PM Paul Donahue <pdonahue@...> wrote:

|         "If the hart encounters an exception while the `rnmie` bit is clear, the
|         exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
|         set to M-mode, and the hart jumps to the RNMI exception handler
|         address."

|         How does the RNMI handler distinguish between entry due to an NMI and entry due to an exception while rnmie=0?  It needs this information to
|         decide whether the cause is in mcause or mncause.

|         On an exception when rnmie=0, is there a reason that mstatus.mpp is always set to M mode even if the exception occurred in some other mode?

|         Is medeleg ignored on traps when rnmie=0?  The handler address is specified but the handler mode is not.  Since the handler address is an M-mode
|         address (not subject to translation), I assume that the handler mode is always M and that medeleg must be ignored.

|         Thanks,

|         -Paul

|         On Tue, Jan 19, 2021 at 10:04 AM Earl Killian <earl.killian@...> wrote:

|             I would like a clarification on whether this replaces the existing NMI, or are you saying there are two different things, NMI and RNMI? I
|             doubt it, but I wanted to check.

|             I am concerned that the rnmie is hidden and only settable by MNRET. This means that to re-enable NMI for a portion of a NMI handler (e.g.
|             after saving all the appropriate information to a NMI stack) one must write mnepc and then MNRET there, but there is no way to disable it
|             for the real MNRET.

|| On Jan 18, 2021, at 18:39, Krste Asanovic <krste@...> wrote:
||
||
|| Current RISC-V specs only have a non-resumable NMI definition.  The
|| following proposal would add resumable NMI support.  This was one of
|| the features requested for priv 1.12 or RVA/RVM22.
||
|| This is up for discussion, but I think it is small enough to go
|| through fast track process.
||
|| Krste
||
|| :sectnums:
|| :toc: left
||
|| = Resumable NMI support in RISC-V
|| Version 0.2.1-Draft
||
|| == Background and Motivation
||
|| The RISC-V privileged architecture version 1.11 supports only
|| unresumable non-maskable interrupts (UNMIs), where the NMI jumps to a
|| handler in machine mode, overwriting the current `mepc` and `mcause`
|| register values.  If the hart had been executing machine-mode code in
|| a trap handler, the previous values in `mepc` and `mcause` would not
|| be recoverable and so execution is not generally resumable.
||
|| This proposal adds support for resumable non-maskable interrupts
|| (RNMIs) to RISC-V.  The extension adds four new CSRs (`mnepc`,
|| `mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
|| and a new instruction to resume from the RNMI handler.
||
|| == RNMI Interrupt Signals
||
|| The `rnmi` interrupt signals are inputs to
|| the hart.  These interrupts have higher priority than any other
|| interrupt or exception on the hart and cannot be disabled by software.
|| Specifically, they are not disabled by clearing the `mstatus.mie`
|| bit.
||
|| == RNMI Handler Addresses
||
|| The RNMI interrupt trap handler address is implementation-defined.
||
|| RNMI also has an associated exception trap handler address, which is
|| implementation-defined.
||
|| == New RNMI CSRs
||
|| This proposal adds additional M-mode CSRs to enable a resumable
|| non-maskable interrupt (RNMI).
||
|| .NMI additional CSRs
|| [cols="2,2,2,2"]
|| [%autowidth]
|| |===
|| | Number | Privilege | Name        | Description
||
|| | 0x350  | MRW       | `mnscratch` | Resumable Non-maskable scratch register
|| | 0x351  | MRW       | `mnepc`     | Resumable Non-maskable EPC value
|| | 0x352  | MRW       | `mncause`   | Resumable Non-maskable cause value
|| | 0x353  | MRW       | `mnstatus`  | Resumable Non-maskable status
|| |===
||
|| The `mnscratch` CSR holds an XLEN-bit read-write register which
|| enables the NMI trap handler to save and restore the context that was
|| interrupted.
||
|| The `mnepc` CSR is an XLEN-bit read-write register which on entry
|| to the NMI trap handler holds the PC of the instruction that took the
|| interrupt. The lowest bit of `mnepc` is hardwired to zero.
||
|| The `mncause` CSR holds the reason for the NMI, with bit XLEN-1 set to
|| 1, and the NMI cause encoded in the least-significant bits or zero if
|| NMI causes are not supported.
||
|| The `mnstatus` CSR holds a two-bit field which on entry to the trap
|| handler holds the privilege mode of the interrupted context encoded in
|| bits `mnstatus[12:11]` in the same manner as `mstatus.mpp`.  The other
|| bits in `mnstatus` are _reserved_, but software should write zeros and
|| hardware implementations should return zeros.
||
|| == New MNRET instruction
||
|| This new M-mode only instruction uses the values in `mnepc` and
|| `mnstatus` to return to the program counter and privileged mode of the
|| interrupted context respectively.  This instruction also sets the
|| `rnmie` state bit.
||
|| MNRET instruction encoding is the same as MRET except with bit 30 set
|| (i.e., `funct7`=`0111000`).
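
The encoding rule above can be checked with a few lines of arithmetic. This sketch assumes the standard MRET encoding 0x30200073 from the privileged spec and applies the draft's "bit 30 set" rule to it; the helper name `funct7` is just for illustration.

```python
MRET = 0x30200073  # standard MRET encoding (funct7 = 0011000)


def funct7(insn: int) -> int:
    """Extract bits 31:25 of a 32-bit instruction word."""
    return (insn >> 25) & 0x7F


# The draft's rule: identical to MRET except that bit 30 is set.
MNRET = MRET | (1 << 30)

print(f"MRET  = {MRET:#010x}, funct7 = {funct7(MRET):07b}")
print(f"MNRET = {MNRET:#010x}, funct7 = {funct7(MNRET):07b}")
```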
||
|| == RNMI Operation
||
|| When an RNMI interrupt is detected, the interrupted PC is written to
|| the `mnepc` CSR, the type of RNMI to the `mncause` CSR, and the
|| privilege mode of the interrupted context to the `mnstatus` CSR.  An
|| internal microarchitectural state bit `rnmie` is cleared to indicate
|| that the processor is in an RNMI handler and cannot take a new RNMI
|| interrupt.  The internal `rnmie` bit when clear also disables all
|| other interrupts.
||
|| NOTE: These interrupts are called non-maskable because software cannot
|| mask the interrupts, but for correct operation other instances of the
|| same interrupt must be held off until the handler is completed, hence
|| the internal state bit.
||
|| The core then enters machine-mode and jumps to the RNMI trap handler
|| address.
||
|| The RNMI handler can resume original execution using the new MNRET
|| instruction, which restores the PC from `mnepc`, the privilege mode
|| from `mnstatus`, and also sets the internal `rnmie` state bit, which
|| reenables other interrupts.
||
|| If the hart encounters an exception while the `rnmie` bit is clear, the
|| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
|| set to M-mode, and the hart jumps to the RNMI exception handler
|| address.
||
|| NOTE: Traps in the RNMI handler can only be resumed if they occur while
|| the handler was servicing an interrupt that occurred outside of
|| machine-mode.
||
|| == Interaction with debugger
||
|| The debugger can be configured such that an RNMI event drops the
|| system into the debugger.
||
||
||
||
||


Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Paul Donahue
 

The problem is that, although RNMI has a separate vector, the RNMI vector can handle both NMI and normal exceptions but the cause information is only updated on NMI.

Here's an illustration of my concern:
  1. NMI happens
  2. hardware updates mncause and mnepc and goes to RNMI handler
  3. RNMI handler reads mncause to see what needs to be handled
  4. RNMI handler starts to handle the NMI.  So far so good.
  5. RNMI handler gets an exception
  6. hardware updates mcause and mepc (as explicitly contemplated in the proposal) and goes to RNMI handler
  7. The RNMI code in step 3 reads mncause which will still indicate NMI as the cause.  The handler needs to be somehow told to read mcause.  How does it get that indication?
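
The seven steps above can be sketched as a toy model. It is not a simulator of any real hart: the class `Hart`, the vector names, and the `separate_vectors` switch are all invented here, and XLEN = 64 is assumed for the mncause interrupt bit. The switch contrasts the reading in which NMIs and handler exceptions share one entry point with the reading in which the exception handler address is distinct.

```python
NMI_BIT = 1 << 63  # mncause bit XLEN-1, assuming XLEN = 64


class Hart:
    """Toy model of the cause registers described in the draft."""

    def __init__(self):
        self.mncause = 0
        self.mcause = 0
        self.rnmie = True

    def nmi(self, cause: int) -> str:
        # Steps 1-2: hardware records the NMI cause and enters the RNMI handler.
        self.mncause = NMI_BIT | cause
        self.rnmie = False
        return "rnmi_interrupt_vector"

    def exception(self, cause: int, separate_vectors: bool) -> str:
        # Steps 5-6: with rnmie clear, the draft writes mcause, not mncause.
        assert not self.rnmie
        self.mcause = cause
        return "rnmi_exception_vector" if separate_vectors else "rnmi_interrupt_vector"


def handler_guess(hart: Hart, entry: str) -> str:
    """What a handler can infer from its entry point and mncause alone."""
    if entry == "rnmi_exception_vector":
        return "exception"
    return "nmi" if hart.mncause & NMI_BIT else "exception"


for separate in (False, True):
    hart = Hart()
    hart.nmi(cause=1)
    entry = hart.exception(cause=2, separate_vectors=separate)
    print(separate, handler_guess(hart, entry))
```

With a shared entry point the handler still sees the stale NMI indication in mncause (step 7); with a distinct exception entry point the entry address itself tells it to read mcause.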


Thanks,

-Paul




Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Jonathan Behrens <behrensj@...>
 



On Fri, Jan 22, 2021 at 9:16 AM Krste Asanovic via lists.riscv.org <krste=berkeley.edu@...> wrote:

>>>>> On Tue, 19 Jan 2021 10:36:26 -0600, Brian Grayson <brian.grayson@...> said:

| I'll jump in with a few more. :)
| From an architectural point of view, I don't like the fact that an NMI blocks further NMI until it leaves its handler. There may be NMIs that can save
| the state they need, and then unblock further NMIs, i.e., they might be stackable in some cases. This spec precludes that by making rnmie not
| software-writable.

| I also fundamentally dislike the use of the term "non-maskable", because the proposed rnmie bit literally masks non-maskable interrupts. That's an
| impossible thing to put into a spec, as remarked upon in your NOTE.

 rnmie should be settable but not clearable in M-mode to support
nesting NMIs.

Could you go into more detail on how nesting NMIs could work? Shouldn't it only be safe to execute MNRET with rnmie clear, because any NMI that came in between setting mnepc and executing MNRET would irreversibly clobber mnepc?
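
A toy model of the window I mean, assuming rnmie is software-settable as proposed. Everything here (the `Hart` class, the addresses `HANDLER` and `RET_SITE`, the `run` helper) is hypothetical and only tracks pc, mnepc, and rnmie.

```python
HANDLER = 0x8000            # hypothetical RNMI handler address
RET_SITE = HANDLER + 0x40   # hypothetical address of the handler's MNRET


class Hart:
    """Toy model of the RNMI state in the draft; not a real simulator."""

    def __init__(self):
        self.pc = 0
        self.mnepc = 0
        self.rnmie = True

    def take_nmi(self) -> bool:
        if not self.rnmie:
            return False          # NMI held off while in a handler
        self.mnepc = self.pc      # hardware saves the interrupted PC
        self.rnmie = False
        self.pc = HANDLER
        return True

    def mnret(self) -> None:
        self.pc = self.mnepc
        self.rnmie = True


def run(nmi_in_window: bool) -> int:
    hart = Hart()
    hart.pc = 0x1000
    hart.take_nmi()
    saved = hart.mnepc        # handler spills mnepc to its own stack
    hart.rnmie = True         # handler re-enables NMIs to allow nesting
    hart.mnepc = saved        # handler restores mnepc before returning
    hart.pc = RET_SITE        # about to execute MNRET
    if nmi_in_window:
        hart.take_nmi()       # nested NMI overwrites mnepc with RET_SITE
        hart.mnret()          # nested handler returns to RET_SITE
    hart.mnret()              # outer MNRET uses whatever mnepc now holds
    return hart.pc


print(hex(run(False)))  # returns to the interrupted code
print(hex(run(True)))   # returns to the MNRET itself; 0x1000 is lost
```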

Jonathan


Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Krste Asanovic
 

RNMI has separate exception vector from regular exceptions.

Krste

On Tue, 19 Jan 2021 17:51:37 -0800, Allen Baum <allen.baum@...> said:
| Oops - I reread that - Paul is correct; nmcause is not modified. The handler has no guaranteed way of handling an exception.
| IT can leave breadcrumbs, but can't guarantee they will be correct depending on when an exception cocurrs.

| On Tue, Jan 19, 2021 at 5:46 PM Allen Baum <allen.baum@...> wrote:

| I am assuming here that NMI is an interrupt, not an exception. Both trap, but an mmie-bit only affects interrupts, not exceptions.
| So, on entry, the handler knows if this was caused by an NMI or by an exception from the rnmi handler itself by looking at the MSB of nmcause.

| I am also assuming that:
| RNMI interrupts are not delegated
| Accesses made in the RNMI handler are made just as if the hart were in Mmode, so it has to save and restore MPRV if it wants to use it.

| I'm not exactly sure why MNRET is needed at all - it can use the hidden MMIE bit to select returning to mepc or nmepc, and restoring the priv level
| from nmstatus.pp rather than mstatus.pp, and inhibit popping the pp and ie stacks.
| And what happens if you execute mret or sret in the rnmi handler (when mmie=0)? That doesn't sound good..

| On Tue, Jan 19, 2021 at 2:52 PM Paul Donahue <pdonahue@...> wrote:

| "If the hart encounters an exception while the `rnmie` bit is clear, the
| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
| set to M-mode, and the hart jumps to the RNMI exception handler
| address."

| How does the RNMI handler distinguish between entry due to an NMI and entry due to an exception while rnmie=0?  It needs this information to
| decide whether the cause is in mcause or mncause.

| On an exception when rnmie=0, is there a reason that mstatus.mpp is always set to M mode even if the exception occurred in some other mode?

| Is medeleg ignored on traps when rnmie=0?  The handler address is specified but the handler mode is not.  Since the handler address is an M-mode
| address (not subject to translation), I assume that the handler mode is always M and that medeleg must be ignored.

| Thanks,

| -Paul

| On Tue, Jan 19, 2021 at 10:04 AM Earl Killian <earl.killian@...> wrote:

| I would like a clarification on whether this replaces the existing NMI, or are you saying there are two different things, NMI and RNMI? I
| doubt it, but I wanted to check.

| I am concerned that the rnmie is hidden and only settable by MNRET. This means that to re-enable NMI for a portion of a NMI handler (e.g.
| after saving all the appropriate information to a NMI stack) one must write mnepc and then MNRET there, but there is no way to disable it
| for the real MNRET.

|| On Jan 18, 2021, at 18:39, Krste Asanovic <krste@...> wrote:
||
||
|| Current RISC-V specs only have a non-resumable NMI definition.  The
|| following proposal would add resumable NMI support.  This was one of
|| the features requested for priv 1.12 or RVA/RVM22.
||
|| This is up for discussion, but I think it is small enough to go
|| through fast track process.
||
|| Krste
||
|| :sectnums:
|| :toc: left
||
|| = Resumable NMI support in RISC-V
|| Version 0.2.1-Draft
||
|| == Background and Motivation
||
|| The RISC-V privileged architecture version 1.11 supports only
|| unresumable non-maskable interrupts (UNMIs), where the NMI jumps to a
|| handler in machine mode, overwriting the current `mepc` and `mcause`
|| register values.  If the hart had been executing machine-mode code in
|| a trap handler, the previous values in `mepc` and `mcause` would not
|| be recoverable and so execution is not generally resumable.
||
|| This proposal adds support for resumable non-maskable interrupts
|| (RNMIs) to RISC-V.  The extension adds four new CSRs (`mnepc`,
|| `mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
|| and a new instruction to resume from the RNMI handler.
||
|| == RNMI Interrupt Signals
||
|| The `rnmi` interrupt signals are inputs to
|| the hart.  These interrupts have higher priority than any other
|| interrupt or exception on the hart and cannot be disabled by software.
|| Specifically, they are not disabled by clearing the `mstatus.mie`
|| register.
||
|| == RNMI Handler Addresses
||
|| The RNMI interrupt trap handler address is implementation-defined.
||
|| RNMI also has an associated exception trap handler address, which is
|| implementation defined.
||
|| == New RNMI CSRs
||
|| This proposal adds additional M-mode CSRs to enable a resumable
|| non-maskable interrupt (RNMI).
||
|| .NMI additional CSRs
|| [cols="2,2,2,2"]
|| [%autowidth]
|| |===
|| | Number | Privilege | Name        | Description
||
|| | 0x350  | MRW       | `mnscratch` | Resumable Non-maskable scratch register
|| | 0x351  | MRW       | `mnepc`     | Resumable Non-maskable EPC value
|| | 0x352  | MRW       | `mncause`   | Resumable Non-maskable cause value
|| | 0x353  | MRW       | `mnstatus`  | Resumable Non-maskable status
|| |===
||
|| The `mnscratch` CSR holds an XLEN-bit read-write register which
|| enables the NMI trap handler to save and restore the context that was
|| interrupted.
||
|| The `mnepc` CSR is an XLEN-bit read-write register which on entry
|| to the NMI trap handler holds the PC of the instruction that took the
|| interrupt. The lowest bit of `mnepc` is hardwired to zero.
||
|| The `mncause` CSR holds the reason for the NMI, with bit XLEN-1 set to
|| 1, and the NMI cause encoded in the least-significant bits or zero if
|| NMI causes are not supported.
||
|| The `mnstatus` CSR holds a two-bit field which on entry to the trap
|| handler holds the privilege mode of the interrupted context encoded in
|| bits `mnstatus[12:11]` in the same manner as `mstatus.mpp`.  The other
|| bits in `mnstatus` are _reserved_, but software should write zeros and
|| hardware implementations should return zeros.
||
|| == New MNRET instruction
||
|| This new M-mode only instruction uses the values in `mnepc` and
|| `mnstatus` to return to the program counter and privileged mode of the
|| interrupted context respectively.  This instruction also sets the
|| `rnmie` state bit.
||
|| MNRET instruction encoding is same as MRET except with bit 30 set
|| (i.e.,`funct7`=`0111000`).
||
|| == RNMI Operation
||
|| When an RNMI interrupt is detected, the interrupted PC is written to
|| the `mnepc` CSR, the type of RNMI to the `mncause` CSR, and the
|| privilege mode of the interrupted context to the `mnstatus` CSR.  An
|| internal microarchitectural state bit `rnmie` is cleared to indicate
|| that processor is in an RNMI handler and cannot take a new RNMI
|| interrupt.  The internal `rnmie` bit when clear also disables all
|| other interrupts.
||
|| NOTE: These interrupts are called non-maskable because software cannot
|| mask the interrupts, but for correct operation other instances of the
|| same interrupt must be held off until the handler is completed, hence
|| the internal state bit.
||
|| The core then enters machine-mode and jumps to the RNMI trap handler
|| address.
||
|| The RNMI handler can resume original execution using the new MNRET
|| instruction, which restores the PC from `mnepc`, the privilege mode
|| from `mnstatus`, and also sets the internal `rnmie` state bit, which
|| reenables other interrupts.
||
|| If the hart encounters an exception while the `rnmie` bit is clear, the
|| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
|| set to M-mode, and the hart jumps to the RNMI exception handler
|| address.
||
|| NOTE: Traps in the RNMI handler can only be resumed if they occur while
|| the handler was servicing an interrupt that occurred outside of
|| machine-mode.
||
|| == Interaction with debugger
||
|| The debugger can be configured such that an RNMI event drops the
|| system into the debugger.
||
||
||
||
||

|
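
The MNRET encoding quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that starts from the standard MRET encoding (`0x30200073`) and sets bit 30 as the draft describes; the helper name `funct7` is illustrative and not part of the proposal.

```python
# MRET encoding from the RISC-V privileged specification.
MRET = 0x30200073

# Per the draft, MNRET is MRET with bit 30 set.
MNRET = MRET | (1 << 30)


def funct7(insn: int) -> int:
    """Return bits 31:25 of a 32-bit instruction word."""
    return (insn >> 25) & 0x7F


print(f"MNRET  = {MNRET:#010x}")          # 0x70200073
print(f"funct7 = {funct7(MNRET):07b}")    # 0111000
```

The `funct7` value matches the `0111000` given in the draft text.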


Re: Resumable NMI proposal

Krste Asanovic
 

On Tue, 19 Jan 2021 10:04:28 -0800, Earl Killian <earl.killian@...> said:
| I would like a clarification on whether this replaces the existing NMI, or are you saying there are two different things, NMI and RNMI? I doubt it, but I wanted to check.

It enhances the existing NMI by adding state to make it resumable.

| I am concerned that the rnmie is hidden and only settable by MNRET. This means that to re-enable NMI for a portion of a NMI handler (e.g. after saving all the appropriate information to a NMI stack) one must write mnepc and then MNRET there, but there is no way to disable it for the real MNRET.

The rnmie bit should have been visible, but only settable by M-mode software.

Krste


|| On Jan 18, 2021, at 18:39, Krste Asanovic <krste@...> wrote:
||
||
|| Current RISC-V specs only have a non-resumable NMI definition. The
|| following proposal would add resumable NMI support. This was one of
|| the features requested for priv 1.12 or RVA/RVM22.
||
|| This is up for discussion, but I think it is small enough to go
|| through fast track process.
||
|| Krste
||
|| :sectnums:
|| :toc: left
||
|| = Resumable NMI support in RISC-V
|| Version 0.2.1-Draft
||
|| == Background and Motivation
||
|| The RISC-V privileged architecture version 1.11 supports only
|| unresumable non-maskable interrupts (UNMIs), where the NMI jumps to a
|| handler in machine mode, overwriting the current `mepc` and `mcause`
|| register values. If the hart had been executing machine-mode code in
|| a trap handler, the previous values in `mepc` and `mcause` would not
|| be recoverable and so execution is not generally resumable.
||
|| This proposal adds support for resumable non-maskable interrupts
|| (RNMIs) to RISC-V. The extension adds four new CSRs (`mnepc`,
|| `mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
|| and a new instruction to resume from the RNMI handler.
||
|| == RNMI Interrupt Signals
||
|| The `rnmi` interrupt signals are inputs to
|| the hart. These interrupts have higher priority than any other
|| interrupt or exception on the hart and cannot be disabled by software.
|| Specifically, they are not disabled by clearing the `mstatus.mie`
|| bit.
||
|| == RNMI Handler Addresses
||
|| The RNMI interrupt trap handler address is implementation-defined.
||
|| RNMI also has an associated exception trap handler address, which is
|| implementation defined.
||
|| == New RNMI CSRs
||
|| This proposal adds additional M-mode CSRs to enable a resumable
|| non-maskable interrupt (RNMI).
||
|| .NMI additional CSRs
|| [cols="2,2,2,2"]
|| [%autowidth]
|| |===
|| | Number | Privilege | Name | Description
||
|| | 0x350 | MRW | `mnscratch` | Resumable Non-maskable scratch register
|| | 0x351 | MRW | `mnepc` | Resumable Non-maskable EPC value
|| | 0x352 | MRW | `mncause` | Resumable Non-maskable cause value
|| | 0x353 | MRW | `mnstatus` | Resumable Non-maskable status
|| |===
||
|| The `mnscratch` CSR holds an XLEN-bit read-write register which
|| enables the NMI trap handler to save and restore the context that was
|| interrupted.
||
|| The `mnepc` CSR is an XLEN-bit read-write register which on entry
|| to the NMI trap handler holds the PC of the instruction that took the
|| interrupt. The lowest bit of `mnepc` is hardwired to zero.
||
|| The `mncause` CSR holds the reason for the NMI, with bit XLEN-1 set to
|| 1, and the NMI cause encoded in the least-significant bits or zero if
|| NMI causes are not supported.
||
|| The `mnstatus` CSR holds a two-bit field which on entry to the trap
|| handler holds the privilege mode of the interrupted context encoded in
|| bits `mnstatus[12:11]` in the same manner as `mstatus.mpp`. The other
|| bits in `mnstatus` are _reserved_, but software should write zeros and
|| hardware implementations should return zeros.
||
|| == New MNRET instruction
||
|| This new M-mode only instruction uses the values in `mnepc` and
|| `mnstatus` to return to the program counter and privileged mode of the
|| interrupted context respectively. This instruction also sets the
|| `rnmie` state bit.
||
|| The MNRET instruction encoding is the same as MRET except with bit 30 set
|| (i.e., `funct7`=`0111000`).
||
|| == RNMI Operation
||
|| When an RNMI interrupt is detected, the interrupted PC is written to
|| the `mnepc` CSR, the type of RNMI to the `mncause` CSR, and the
|| privilege mode of the interrupted context to the `mnstatus` CSR. An
|| internal microarchitectural state bit `rnmie` is cleared to indicate
|| that the processor is in an RNMI handler and cannot take a new RNMI
|| interrupt. The internal `rnmie` bit when clear also disables all
|| other interrupts.
||
|| NOTE: These interrupts are called non-maskable because software cannot
|| mask the interrupts, but for correct operation other instances of the
|| same interrupt must be held off until the handler is completed, hence
|| the internal state bit.
||
|| The core then enters machine-mode and jumps to the RNMI trap handler
|| address.
||
|| The RNMI handler can resume original execution using the new MNRET
|| instruction, which restores the PC from `mnepc`, the privilege mode
|| from `mnstatus`, and also sets the internal `rnmie` state bit, which
|| reenables other interrupts.
||
|| If the hart encounters an exception while the `rnmie` bit is clear, the
|| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
|| set to M-mode, and the hart jumps to the RNMI exception handler
|| address.
||
|| NOTE: Traps in the RNMI handler can only be resumed if they occur while
|| the handler was servicing an interrupt that occurred outside of
|| machine-mode.
||
|| == Interaction with debugger
||
|| The debugger can be configured such that an RNMI event drops the
|| system into the debugger.
||
||
||
||
||
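
The entry and return sequence in the quoted "RNMI Operation" section can be sketched as a small behavioral model. This is only an illustration of the draft text as written: the `Hart` class, its method names, and the handler address passed in are hypothetical, and the model ignores everything not mentioned in the draft (for example, the H extension).

```python
from dataclasses import dataclass

MPP_SHIFT = 11   # draft: privilege mode is held in mnstatus[12:11]
M_MODE = 3       # machine-mode privilege encoding


@dataclass
class Hart:
    pc: int = 0
    priv: int = M_MODE
    mnepc: int = 0
    mncause: int = 0
    mnstatus: int = 0
    rnmie: bool = True   # internal state bit; True means RNMIs can be taken
    xlen: int = 64

    def take_rnmi(self, cause: int, handler: int) -> bool:
        """Model RNMI entry. Returns False if the RNMI is held off."""
        if not self.rnmie:
            return False                      # already inside an RNMI handler
        self.mnepc = self.pc & ~1             # lowest bit is hardwired to zero
        self.mncause = (1 << (self.xlen - 1)) | cause
        self.mnstatus = self.priv << MPP_SHIFT
        self.rnmie = False                    # blocks further RNMIs and interrupts
        self.priv = M_MODE
        self.pc = handler                     # implementation-defined address
        return True

    def mnret(self) -> None:
        """Model MNRET: restore PC and privilege mode, set rnmie again."""
        self.pc = self.mnepc
        self.priv = (self.mnstatus >> MPP_SHIFT) & 0b11
        self.rnmie = True
```

A second `take_rnmi` call made before `mnret` returns `False` in this model, which is the "held off until the handler is completed" behavior from the draft's NOTE.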


Re: Resumable NMI proposal

Krste Asanovic
 

M-mode or HS-mode can always create an equivalent for S-mode.

I don't believe hardware delegation makes sense for NMIs.

Krste

On Tue, 19 Jan 2021 12:51:14 -0500, Jonathan Behrens <behrensj@...> said:
| I'd expect that S-mode software will also want to be able to receive RNMIs, so it is probably worth thinking through how that would work at the same
| time.

| Jonathan

| On Tue, Jan 19, 2021 at 11:42 AM Brian Grayson via lists.riscv.org <brian.grayson=sifive.com@...> wrote:

| I'll jump in with a few more. :)

| From an architectural point of view, I don't like the fact that an NMI blocks further NMI until it leaves its handler. There may be NMIs that can
| save the state they need, and then unblock further NMIs, i.e., they might be stackable in some cases. This spec precludes that by making rnmie not
| software-writable.

| I also fundamentally dislike the use of the term "non-maskable", because the proposed rnmie bit literally masks non-maskable interrupts. That's an
| impossible thing to put into a spec, as remarked upon in your NOTE.

| From a prior art point of view, the PowerPC Book E architecture (the embedded flavor) handled this differently. It called NMIs "critical
| interrupts", and all critical interrupts are higher priority than non-critical. They can be masked, and further critical interrupts are
| automatically masked when a critical interrupt is taken, but software in the critical interrupt handler can save the state it needs, and then
| re-enable further critical interrupts if that is desired. I believe this is ultimately the desired behavior for us as well, and is IMO better
| architectural terminology and a better mental framework.

| Within non-critical interrupts, some debug events were considered highest priority, some were considered lowest priority, depending on the type of
| debug event. It can be difficult (or even impossible) to assert that all debug events are all higher or all lower than a given exception event.

| See Chapter 7 "Interrupts and Exceptions" of https://www.nxp.com/docs/en/user-guide/BOOK_EUM.pdf for more details, and in particular 7.9.1 that
| talks about the relative priority of all the different types of interrupts in both classes, and where various debug interrupts fell.

| Back to the proposal, I think there's a mistake near the end:

| If the hart encounters an exception while the `rnmie` bit is clear, the
| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
| set to M-mode, and the hart jumps to the RNMI exception handler
| address.  

| If the enable bit is clear, the exception can't be taken, right? Is this supposed to discuss the case of an ordinary exception being observed while
| in RNMI, where the exception will actually be instantly taken once we leave RNMI state, in which case we would jump to the ordinary exception
| handler? Or am I misunderstanding?

| Brian

| On Mon, Jan 18, 2021 at 9:09 PM Greg Favor <gfavor@...> wrote:

| Even though this is hot off the press, I'll jump in with a few small comments:

| - In mnstatus, shouldn't there also be a bit like the mstatus.MPV bit (for when the H extension is implemented and enabled)?

| - The width of the mnstatus CSR is not explicitly defined (e.g. as an XLEN-bit or 32-bit read-write register).

| - What is the relative priority for RNMI versus Debug Halt Request?  Maybe it is the responsibility ultimately for the Debug spec to specify
| this (?), but what should be said over there?

| - Can you briefly comment on the types of systems that show the most need for recoverable NMIs (versus being not ideal but ok with the current
| NMI situation).  Semi-equivalently, would you expect this extension to eventually become a requirement (or optional) in RVA22 and/or RVM22?

| Greg



Re: Resumable NMI proposal

Krste Asanovic
 

On Tue, 19 Jan 2021 10:36:26 -0600, Brian Grayson <brian.grayson@...> said:
| I'll jump in with a few more. :)
| From an architectural point of view, I don't like the fact that an NMI blocks further NMI until it leaves its handler. There may be NMIs that can save
| the state they need, and then unblock further NMIs, i.e., they might be stackable in some cases. This spec precludes that by making rnmie not
| software-writable.

| I also fundamentally dislike the use of the term "non-maskable", because the proposed rnmie bit literally masks non-maskable interrupts. That's an
| impossible thing to put into a spec, as remarked upon in your NOTE.

rnmie should be settable but not clearable in M-mode to support
nesting NMIs.

(This was an editing error on my part: I simplified an earlier spec
but didn't carry over this modification.)

This still makes them non-maskable by software; NMI is the
industry-standard term.
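
The "settable but not clearable" rule described above can be sketched as a write function. This is a minimal model under stated assumptions: the function name `write_rnmie` is hypothetical, and treating a write from below M-mode as having no effect is a simplification (the thread does not say how such an access would behave).

```python
M_MODE = 3  # machine-mode privilege encoding


def write_rnmie(current: bool, value: bool, priv: int) -> bool:
    """Return the rnmie value after a software write attempt."""
    if priv != M_MODE:
        return current          # assumption: non-M-mode writes change nothing
    return current or value     # M-mode may set the bit but never clear it
```

Because a write of zero never clears the bit, software still cannot mask NMIs in this model; it can only re-enable them early to allow nesting.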

| From a prior art point of view, the PowerPC Book E architecture (the embedded flavor) handled this differently. It called NMIs "critical interrupts",
| and all critical interrupts are higher priority than non-critical. They can be masked, and further critical interrupts are automatically masked when a
| critical interrupt is taken, but software in the critical interrupt handler can save the state it needs, and then re-enable further critical interrupts
| if that is desired. I believe this is ultimately the desired behavior for us as well, and is IMO better architectural terminology and a better mental
| framework.

The fast interrupts proposal has preemptible levels intended for embedded
systems.

I think these are qualitatively different from NMIs, which are needed
in application processors too.

| Within non-critical interrupts, some debug events were considered highest priority, some were considered lowest priority, depending on the type of debug
| event. It can be difficult (or even impossible) to assert that all debug events are all higher or all lower than a given exception event.

| See Chapter 7 "Interrupts and Exceptions" of https://www.nxp.com/docs/en/user-guide/BOOK_EUM.pdf for more details, and in particular 7.9.1 that talks
| about the relative priority of all the different types of interrupts in both classes, and where various debug interrupts fell.

| Back to the proposal, I think there's a mistake near the end:

| If the hart encounters an exception while the `rnmie` bit is clear, the
| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
| set to M-mode, and the hart jumps to the RNMI exception handler
| address.  

| If the enable bit is clear, the exception can't be taken, right? Is this supposed to discuss the case of an ordinary exception being observed while in
| RNMI, where the exception will actually be instantly taken once we leave RNMI state, in which case we would jump to the ordinary exception handler? Or
| am I misunderstanding?

Exceptions are not interrupts. NMI exceptions go to a different vector address
than regular exceptions.
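
The distinction can be sketched as follows: an exception raised while rnmie is clear is still taken, but it is routed to the separate RNMI exception vector. This is a rough model of the quoted draft paragraph only. The `Hart` class, the `take_exception` function, and both vector arguments are hypothetical, and the normal path assumes a non-delegated trap with `mtvec` in direct mode.

```python
from dataclasses import dataclass

M_MODE = 3
MPP_SHIFT = 11
MPP_MASK = 0b11 << MPP_SHIFT


@dataclass
class Hart:
    pc: int
    priv: int
    rnmie: bool
    mepc: int = 0
    mcause: int = 0
    mstatus: int = 0


def take_exception(h: Hart, cause: int, mtvec: int, rnmi_exc_vec: int) -> None:
    """Route a synchronous exception based on the rnmie state bit."""
    h.mepc = h.pc
    h.mcause = cause
    if h.rnmie:
        mpp, target = h.priv, mtvec           # ordinary M-mode trap entry
    else:
        mpp, target = M_MODE, rnmi_exc_vec    # draft: mpp is set to M-mode
    h.mstatus = (h.mstatus & ~MPP_MASK) | (mpp << MPP_SHIFT)
    h.priv = M_MODE
    h.pc = target
```

Interrupts, by contrast, are simply held off while rnmie is clear, so they never reach this path in the model.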

Krste

| Brian

| On Mon, Jan 18, 2021 at 9:09 PM Greg Favor <gfavor@...> wrote:

| Even though this is hot off the press, I'll jump in with a few small comments:

| - In mnstatus, shouldn't there also be a bit like the mstatus.MPV bit (for when the H extension is implemented and enabled)?

| - The width of the mnstatus CSR is not explicitly defined (e.g. as an XLEN-bit or 32-bit read-write register).

| - What is the relative priority for RNMI versus Debug Halt Request?  Maybe it is the responsibility ultimately for the Debug spec to specify this
| (?), but what should be said over there?

| - Can you briefly comment on the types of systems that show the most need for recoverable NMIs (versus being not ideal but ok with the current NMI
| situation).  Semi-equivalently, would you expect this extension to eventually become a requirement (or optional) in RVA22 and/or RVM22?

| Greg

