Re: Preferred manner of supporting bus errors in RISC-V


Greg Chadwick
 

Hello,

Thanks for raising this Arjan, it's been a low-priority item on my TODO list to
open a discussion on bus errors for a while now (I work on Ibex amongst other
things at lowRISC).

I think RISC-V should allow implementations to choose whether they want
precise or imprecise bus errors, which I think is the case now. However, as you
point out, the specification is pretty silent on the matter. Some wording around
what the possibilities might be and ensuring the specification doesn't prevent
certain options from working without good reason seems prudent.

In particular we have the issue of the mcause exception code for bus errors that
you raise. I believe codes 1, 5 and 7 are meant to be PMP faults only. Ibex is
non-conforming at the moment due to its use of the same code for both PMP and
bus errors. I think SweRV may do the same. Look at the EH1 source here:
https://github.com/chipsalliance/Cores-SweRV/blob/7332edc0adaa7e9a0c842d169154429e8d987786/design/lsu/lsu_lsc_ctl.sv#L211
When generating its exception packet, it combines access and bus errors together
and only alters the type according to whether the access is misaligned. The
Andes/Gowin N25 also looks to use the PMP mcause codes for precise bus errors
(see page 87 of
https://www.gowinsemi.com/upload/database_doc/586/document_ja/5de4c10ca33c9.pdf).

I don't really mind if we introduce a new code here or broaden the definition of
'access fault' to include non-PMP errors like bus errors. It could even be left
implementation-defined, though I'd prefer a specification-defined bus error
mcause.
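
To illustrate the ambiguity, here is a minimal C sketch of what a trap handler
can learn from mcause on a core that reuses the access fault codes for bus
errors. It assumes RV32 (interrupt flag in bit 31); the names classify_mcause
and fault_class_t are hypothetical and not taken from Ibex or any other core.

```c
#include <stdint.h>

/* Access fault exception codes from the RISC-V privileged specification. */
enum {
    CAUSE_INSN_ACCESS_FAULT  = 1,
    CAUSE_LOAD_ACCESS_FAULT  = 5,
    CAUSE_STORE_ACCESS_FAULT = 7
};

/* Assumption: RV32, so the interrupt flag is bit 31 of mcause. */
#define MCAUSE_INTERRUPT (UINT32_C(1) << 31)

/* Hypothetical classification a trap handler might want to make. */
typedef enum {
    FAULT_NOT_ACCESS,       /* an interrupt, or some other exception */
    FAULT_PMP_OR_BUS_ERROR  /* cannot be narrowed down from mcause alone */
} fault_class_t;

fault_class_t classify_mcause(uint32_t mcause)
{
    if (mcause & MCAUSE_INTERRUPT)
        return FAULT_NOT_ACCESS;

    switch (mcause) {
    case CAUSE_INSN_ACCESS_FAULT:
    case CAUSE_LOAD_ACCESS_FAULT:
    case CAUSE_STORE_ACCESS_FAULT:
        /* Same code for a PMP violation and a bus error response. */
        return FAULT_PMP_OR_BUS_ERROR;
    default:
        return FAULT_NOT_ACCESS;
    }
}
```

With a dedicated bus error code (or a defined way to tell the two apart), the
second enumerator could be split in two; without one, software has to fall
back on implementation-specific state.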

I did also have some concerns around how precise bus errors interact with
interrupts. In particular, if you have an outstanding memory access (that may or
may not see a bus error) and receive an interrupt, is it permissible to
effectively ignore the interrupt until the potential bus error is resolved?
Again, I think the specification gives implementations room to do different
things here, as it's up to the implementation how an interrupt becomes pending
(see some extensive discussion here:
https://github.com/riscv/riscv-isa-manual/issues/544). Some extra wording
somewhere to make it clear this is a possibility could be useful.
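
As a rough illustration of the behaviour I mean, here is a toy C model of a
core that holds off a pending interrupt while a memory access is still
outstanding. It is a sketch under my own assumptions, not the logic of Ibex or
any real pipeline; core_state_t, action_t and next_action are hypothetical
names.

```c
#include <stdbool.h>

typedef struct {
    bool irq_pending;      /* an interrupt request is visible to the core */
    bool mem_outstanding;  /* a load/store is still awaiting its response */
} core_state_t;

typedef enum {
    ACT_NONE,           /* nothing to do, continue normal execution */
    ACT_WAIT,           /* hold off: bus response not yet known */
    ACT_TAKE_BUS_FAULT, /* take the precise bus error exception */
    ACT_TAKE_IRQ        /* take the interrupt */
} action_t;

/* One decision step. resp_valid/resp_error model the bus response inputs. */
action_t next_action(core_state_t *s, bool resp_valid, bool resp_error)
{
    if (s->mem_outstanding) {
        if (!resp_valid)
            return ACT_WAIT;          /* interrupt is effectively ignored */

        s->mem_outstanding = false;   /* the access has now resolved */
        if (resp_error)
            return ACT_TAKE_BUS_FAULT; /* interrupt stays pending */
    }

    return s->irq_pending ? ACT_TAKE_IRQ : ACT_NONE;
}
```

In this model an interrupt that arrives during the access is only taken once
the response is known, and a bus error wins over it; the interrupt is then
taken on a later step.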

Cheers,

Greg Chadwick

On Wed, Feb 3, 2021 at 11:35 AM Arjan Bink <Arjan.Bink@...> wrote:

Hi all,

 

We want to add support for ‘bus errors’ in our RISC-V design (e.g. signaled via AXI bresp/rresp signals). I studied a couple of different RISC-V architectures and I do not see a common approach for dealing with this.

 

Some examples:

 

  • SiFive uses a ‘bus error unit’ that converts bus errors into regular interrupts
  • Ibex implements precise bus errors and causes exceptions using RISC-V defined mcause exception codes (i.e. instruction access fault (exception code 1), load access fault (exception code 5), store/AMO access fault (exception code 7))
  • SweRV-EL2 maps imprecise bus errors onto custom NMIs (and they also have precise bus errors).

 

The RISC-V Privileged specification hardly mentions this topic, but has the following quotes that might be related:

 

“Non-maskable interrupts (NMIs) are only used for hardware error conditions”

“Precise PMA traps might not always be possible, for example, when probing a legacy bus architecture that uses access failures as part of the discovery mechanism. In this case, error responses from slave devices will be reported as imprecise bus-error interrupts.”

 

In our design we will have a PMP (so exception codes 1, 5, 7 are used to report precise PMP exceptions), precise instruction bus errors, and imprecise data bus errors. What is the intended manner of dealing with these precise instruction bus errors and imprecise data bus errors? Should we cause NMIs for them? Should we map them to a regular exception (non-interrupt) with mcause exception codes 1, 5, 7? That would be confusing, as software could then not distinguish them from the PMP errors, and codes 5 and 7 would be used for both precise PMP exceptions and imprecise data bus exceptions. Usage of an external ‘bus error unit’ does not seem appropriate as it could easily cause an interrupt on a speculative (and never actually executed) instruction fetch.

 

So, is there any common or recommended manner of dealing with bus errors?

 

Best regards,

Arjan
