Re: [RISC-V] [tech-fast-int] [RISC-V] [tech-privileged] Resumable NMI proposal

Jonathan Behrens <behrensj@...>

Shouldn't exceptions during an NMI just jump to the normal exception handler? That seems like the simplest solution, and would be the behaviour if this extension didn't specify something different.


On Fri, Jan 22, 2021 at 11:50 AM Paul Donahue via <> wrote:
The problem is that, although RNMI has a separate vector, the RNMI vector can handle both NMI and normal exceptions but the cause information is only updated on NMI.

Here's an illustration of my concern:
  1. NMI happens
  2. hardware updates mncause and mnepc and goes to RNMI handler
  3. RNMI handler reads mncause to see what needs to be handled
  4. RNMI handler starts to handle the NMI.  So far so good.
  5. RNMI handler gets an exception
  6. hardware updates mcause and mepc (as explicitly contemplated in the proposal) and goes to RNMI handler
  7. The RNMI code in step 3 reads mncause which will still indicate NMI as the cause.  The handler needs to be somehow told to read mcause.  How does it get that indication?
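
The seven steps above can be sketched as a toy model. This is only an illustration under the reading given in this message (NMI and in-handler exceptions arrive at the same handler code); the `Hart` class, its method names, and the cause values are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass

XLEN = 64
INTERRUPT_BIT = 1 << (XLEN - 1)  # bit XLEN-1 is set in mncause for an NMI


@dataclass
class Hart:
    """Toy model of the CSRs named in the proposal; not real hardware."""
    mcause: int = 0
    mepc: int = 0
    mncause: int = 0
    mnepc: int = 0
    rnmie: bool = True  # hidden state bit, cleared on RNMI entry

    def take_nmi(self, pc: int, nmi_cause: int) -> None:
        # Steps 1-2: hardware writes mncause/mnepc and clears rnmie.
        self.mncause = INTERRUPT_BIT | nmi_cause
        self.mnepc = pc
        self.rnmie = False

    def take_exception(self, pc: int, cause: int) -> None:
        # Step 6: with rnmie clear, the exception state goes to mcause/mepc;
        # mncause and mnepc are left untouched.
        assert not self.rnmie, "sketch only models exceptions inside the handler"
        self.mcause = cause
        self.mepc = pc


def handler_sees_nmi(hart: Hart) -> bool:
    # Steps 3 and 7: the handler's only test is the top bit of mncause.
    return bool(hart.mncause & INTERRUPT_BIT)


hart = Hart()
hart.take_nmi(pc=0x8000_0000, nmi_cause=1)      # arbitrary example values
first_entry = handler_sees_nmi(hart)            # genuine NMI entry
hart.take_exception(pc=0x8000_1000, cause=2)    # 2 = illegal instruction
second_entry = handler_sees_nmi(hart)           # entry caused by the exception
```

In this model both entries look identical to the handler, because `mncause` still holds the NMI cause after the exception has been recorded in `mcause`.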



On Fri, Jan 22, 2021 at 6:19 AM <krste@...> wrote:

RNMI has a separate exception vector from regular exceptions.


>>>>> On Tue, 19 Jan 2021 17:51:37 -0800, Allen Baum <allen.baum@...> said:

| Oops - I reread that - Paul is correct; mncause is not modified. The handler has no guaranteed way of handling an exception.
| It can leave breadcrumbs, but can't guarantee they will be correct, depending on when an exception occurs.

| On Tue, Jan 19, 2021 at 5:46 PM Allen Baum <allen.baum@...> wrote:

|     I am assuming here that NMI is an interrupt, not an exception. Both trap, but the rnmie bit only affects interrupts, not exceptions.
|     So, on entry, the handler knows if this was caused by an NMI or by an exception from the RNMI handler itself by looking at the MSB of mncause.

|     I am also assuming that:
|     RNMI interrupts are not delegated
|     Accesses made in the RNMI handler are made just as if the hart were in M-mode, so it has to save and restore MPRV if it wants to use it.

|     I'm not exactly sure why MNRET is needed at all - it can use the hidden rnmie bit to select returning to mepc or mnepc, and restoring the priv level
|     from mnstatus.pp rather than mstatus.pp, and inhibit popping the pp and ie stacks.
|     And what happens if you execute mret or sret in the RNMI handler (when rnmie=0)? That doesn't sound good.

|     On Tue, Jan 19, 2021 at 2:52 PM Paul Donahue <pdonahue@...> wrote:

|         "If the hart encounters an exception while the `rnmie` bit is clear, the
|         exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
|         set to M-mode, and the hart jumps to the RNMI exception handler
|         address."

|         How does the RNMI handler distinguish between entry due to an NMI and entry due to an exception while rnmie=0?  It needs this information to
|         decide whether the cause is in mcause or mncause.

|         On an exception when rnmie=0, is there a reason that mstatus.mpp is always set to M mode even if the exception occurred in some other mode?

|         Is medeleg ignored on traps when rnmie=0?  The handler address is specified but the handler mode is not.  Since the handler address is an M-mode
|         address (not subject to translation), I assume that the handler mode is always M and that medeleg must be ignored.

|         Thanks,

|         -Paul

|         On Tue, Jan 19, 2021 at 10:04 AM Earl Killian <earl.killian@...> wrote:

|             I would like a clarification on whether this replaces the existing NMI, or are you saying there are two different things, NMI and RNMI? I
|             doubt it, but I wanted to check.

|             I am concerned that the rnmie bit is hidden and only settable by MNRET. This means that to re-enable NMI for a portion of an NMI handler (e.g.
|             after saving all the appropriate information to an NMI stack) one must write mnepc and then MNRET there, but there is no way to disable it
|             for the real MNRET.

|| On Jan 18, 2021, at 18:39, Krste Asanovic <krste@...> wrote:
|| Current RISC-V specs only have a non-resumable NMI definition.  The
|| following proposal would add resumable NMI support.  This was one of
|| the features requested for priv 1.12 or RVA/RVM22.
|| This is up for discussion, but I think it is small enough to go
|| through fast track process.
|| Krste
|| :sectnums:
|| :toc: left
|| = Resumable NMI support in RISC-V
|| Version 0.2.1-Draft
|| == Background and Motivation
|| The RISC-V privileged architecture version 1.11 supports only
|| unresumable non-maskable interrupts (UNMIs), where the NMI jumps to a
|| handler in machine mode, overwriting the current `mepc` and `mcause`
|| register values.  If the hart had been executing machine-mode code in
|| a trap handler, the previous values in `mepc` and `mcause` would not
|| be recoverable and so execution is not generally resumable.
|| This proposal adds support for resumable non-maskable interrupts
|| (RNMIs) to RISC-V.  The extension adds four new CSRs (`mnepc`,
|| `mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
|| and a new instruction to resume from the RNMI handler.
|| == RNMI Interrupt Signals
|| The `rnmi` interrupt signals are inputs to
|| the hart.  These interrupts have higher priority than any other
|| interrupt or exception on the hart and cannot be disabled by software.
|| Specifically, they are not disabled by clearing the `mstatus.mie`
|| bit.
|| == RNMI Handler Addresses
|| The RNMI interrupt trap handler address is implementation-defined.
|| RNMI also has an associated exception trap handler address, which is
|| implementation-defined.
|| == New RNMI CSRs
|| This proposal adds additional M-mode CSRs to enable a resumable
|| non-maskable interrupt (RNMI).
|| .NMI additional CSRs
|| [cols="2,2,2,2"]
|| [%autowidth]
|| |===
|| | Number | Privilege | Name        | Description
|| | 0x350  | MRW       | `mnscratch` | Resumable Non-maskable scratch register
|| | 0x351  | MRW       | `mnepc`     | Resumable Non-maskable EPC value
|| | 0x352  | MRW       | `mncause`   | Resumable Non-maskable cause value
|| | 0x353  | MRW       | `mnstatus`  | Resumable Non-maskable status
|| |===
|| The `mnscratch` CSR is an XLEN-bit read-write register which
|| enables the NMI trap handler to save and restore the context that was
|| interrupted.
|| The `mnepc` CSR is an XLEN-bit read-write register which on entry
|| to the NMI trap handler holds the PC of the instruction that took the
|| interrupt. The lowest bit of `mnepc` is hardwired to zero.
|| The `mncause` CSR holds the reason for the NMI, with bit XLEN-1 set to
|| 1, and the NMI cause encoded in the least-significant bits or zero if
|| NMI causes are not supported.
|| The `mnstatus` CSR holds a two-bit field which on entry to the trap
|| handler holds the privilege mode of the interrupted context encoded in
|| bits `mnstatus[12:11]` in the same manner as `mstatus.mpp`.  The other
|| bits in `mnstatus` are _reserved_, but software should write zeros and
|| hardware implementations should return zeros.
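
As a minimal sketch of the `mnstatus` layout described above, the two-bit field in bits 12:11 can be decoded the same way as `mstatus.mpp` (0 = U, 1 = S, 3 = M). The constant and function names here are hypothetical; only the bit positions and the privilege encoding come from the text and the privileged spec.

```python
MNSTATUS_PP_SHIFT = 11
MNSTATUS_PP_MASK = 0b11 << MNSTATUS_PP_SHIFT  # bits 12:11

# Privilege encoding shared with mstatus.mpp; 0b10 is reserved.
PRIV_NAMES = {0b00: "U", 0b01: "S", 0b11: "M"}


def interrupted_privilege(mnstatus: int) -> str:
    """Return the privilege mode recorded in mnstatus[12:11]."""
    pp = (mnstatus & MNSTATUS_PP_MASK) >> MNSTATUS_PP_SHIFT
    return PRIV_NAMES.get(pp, "reserved")


def mnstatus_for(priv: int) -> int:
    """Build an mnstatus value for a privilege level; other bits stay zero."""
    return (priv & 0b11) << MNSTATUS_PP_SHIFT
```

For example, `interrupted_privilege(mnstatus_for(1))` yields `"S"` for an NMI that interrupted supervisor-mode code.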
|| == New MNRET instruction
|| This new M-mode only instruction uses the values in `mnepc` and
|| `mnstatus` to return to the program counter and privilege mode of the
|| interrupted context respectively.  This instruction also sets the
|| `rnmie` state bit.
|| The MNRET instruction encoding is the same as MRET except with bit 30 set
|| (i.e., `funct7`=`0111000`).
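
The encoding rule above can be checked with a little arithmetic. This sketch assumes the standard MRET encoding `0x30200073` from the privileged spec; the helper name `funct7` is just for illustration.

```python
MRET = 0x30200073  # standard MRET encoding (funct7 = 0011000)


def funct7(insn: int) -> int:
    """Extract bits 31:25 of a 32-bit instruction word."""
    return (insn >> 25) & 0x7F


# MNRET as described in the draft: MRET with bit 30 set.
MNRET = MRET | (1 << 30)

print(f"MNRET = {MNRET:#010x}, funct7 = {funct7(MNRET):07b}")
```

Setting bit 30 changes only the `funct7` field, from `0011000` to `0111000`, and leaves the rest of the instruction word identical to MRET.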
|| == RNMI Operation
|| When an RNMI interrupt is detected, the interrupted PC is written to
|| the `mnepc` CSR, the type of RNMI to the `mncause` CSR, and the
|| privilege mode of the interrupted context to the `mnstatus` CSR.  An
|| internal microarchitectural state bit `rnmie` is cleared to indicate
|| that the processor is in an RNMI handler and cannot take a new RNMI
|| interrupt.  The internal `rnmie` bit when clear also disables all
|| other interrupts.
|| NOTE: These interrupts are called non-maskable because software cannot
|| mask the interrupts, but for correct operation other instances of the
|| same interrupt must be held off until the handler is completed, hence
|| the internal state bit.
|| The core then enters machine-mode and jumps to the RNMI trap handler
|| address.
|| The RNMI handler can resume original execution using the new MNRET
|| instruction, which restores the PC from `mnepc`, the privilege mode
|| from `mnstatus`, and also sets the internal `rnmie` state bit, which
|| reenables other interrupts.
|| If the hart encounters an exception while the `rnmie` bit is clear, the
|| exception state is written to `mepc` and `mcause`, `mstatus.mpp` is
|| set to M-mode, and the hart jumps to the RNMI exception handler
|| address.
|| NOTE: Traps in the RNMI handler can only be resumed if they occur while
|| the handler is servicing an interrupt that occurred outside of
|| machine-mode.
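
The entry and return sequence in this section can be sketched as a small state model. This is a rough illustration of the draft text only: the `Hart` class, the two vector addresses, and the function names are hypothetical, and the handler addresses are implementation-defined in the proposal.

```python
from dataclasses import dataclass

XLEN = 64
M_MODE = 3
RNMI_INT_VECTOR = 0x1000  # hypothetical; implementation-defined in the draft
RNMI_EXC_VECTOR = 0x2000  # hypothetical; implementation-defined in the draft


@dataclass
class Hart:
    pc: int = 0
    priv: int = 0        # 0 = U, 1 = S, 3 = M
    rnmie: bool = True   # internal state bit
    mnepc: int = 0
    mncause: int = 0
    mnstatus: int = 0
    mepc: int = 0
    mcause: int = 0
    mstatus_mpp: int = 0


def rnmi_entry(h: Hart, nmi_cause: int) -> bool:
    """Take an RNMI if rnmie is set; return whether it was taken."""
    if not h.rnmie:
        return False                      # held off while in the handler
    h.mnepc = h.pc & ~1                   # lowest bit of mnepc is zero
    h.mncause = (1 << (XLEN - 1)) | nmi_cause
    h.mnstatus = h.priv << 11             # privilege mode in bits 12:11
    h.rnmie = False
    h.priv = M_MODE
    h.pc = RNMI_INT_VECTOR
    return True


def exception_in_handler(h: Hart, cause: int) -> None:
    """Exception while rnmie is clear: state goes to mepc/mcause."""
    assert not h.rnmie
    h.mepc = h.pc
    h.mcause = cause
    h.mstatus_mpp = M_MODE
    h.pc = RNMI_EXC_VECTOR


def mnret(h: Hart) -> None:
    """Restore pc and privilege from mnepc/mnstatus and set rnmie."""
    h.pc = h.mnepc
    h.priv = (h.mnstatus >> 11) & 0b11
    h.rnmie = True


h = Hart(pc=0x8000_0004, priv=1)          # interrupted in S-mode
taken = rnmi_entry(h, nmi_cause=0)
in_handler = (h.pc, h.priv, h.rnmie)
nested = rnmi_entry(h, nmi_cause=0)       # second RNMI is held off
mnret(h)
```

After `mnret`, the model is back at the interrupted PC and privilege mode with `rnmie` set again, which is the resumable behaviour the section describes.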
|| == Interaction with debugger
|| The debugger can be configured such that an RNMI event drops the
|| system into the debugger.
