Re: [PATCH] Add direct memory access synchronize extension


Anup Patel
 

-----Original Message-----
From: Nick Kossifidis <mick@...>
Sent: 05 June 2021 19:32
To: Atish Patra <Atish.Patra@...>
Cc: mick@...; Anup Patel <Anup.Patel@...>; tech-
unixplatformspec@...
Subject: Re: [RISC-V] [tech-unixplatformspec] [PATCH] Add direct memory
access synchronize extension

On 2021-06-04 23:01, Atish Patra wrote:

The firmware code will still be executed while the priv mode is S-mode
right ?

Wouldn't that violate the priv spec ?
M-mode can share a code region with S-mode using PMP/ePMP and let S-mode
map that region as executable in its address space. With the current PMP,
M-mode can define a region as RX, for example, and both M-mode and S/U-mode
will have RX permissions there; with ePMP, M-mode can share a code region
with S/U-mode that is RX for M-mode and just X for S/U-mode. I'm obviously
talking about small code snippets without any dependencies or references to
external symbols, etc. A function that flushes the cache, for example, can
be written in such a way.
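
As a rough illustration of the plain-PMP case, here is a minimal sketch that
computes the pmpcfg byte and pmpaddr value for an RX region. The bit layout
follows the privileged spec; the helper names, the choice of NAPOT addressing
and any example addresses are assumptions made for illustration, and the
actual CSR writes (M-mode only) are left out.

```c
#include <assert.h>
#include <stdint.h>

/* pmpcfg bit layout as defined in the RISC-V privileged spec. */
#define PMP_R       0x01u  /* read permission */
#define PMP_W       0x02u  /* write permission */
#define PMP_X       0x04u  /* execute permission */
#define PMP_A_NAPOT 0x18u  /* A field (bits 4:3) = 3: naturally aligned power-of-two */
#define PMP_L       0x80u  /* lock bit: when set, the entry is enforced on M-mode too */

/* Hypothetical helper: pmpcfg byte for a NAPOT region with the given permissions. */
static inline uint8_t pmp_cfg_napot(uint8_t perms, int locked)
{
    assert((perms & ~(PMP_R | PMP_W | PMP_X)) == 0);
    return (uint8_t)(perms | PMP_A_NAPOT | (locked ? PMP_L : 0u));
}

/*
 * Hypothetical helper: pmpaddr value for a NAPOT region.
 * pmpaddr holds address bits [XLEN+1:2]; the number of trailing one bits
 * encodes the region size (k trailing ones means 2^(k+3) bytes).
 */
static inline uint64_t pmp_addr_napot(uint64_t base, uint64_t size)
{
    assert(size >= 8 && (size & (size - 1)) == 0); /* power of two, at least 8 bytes */
    assert((base & (size - 1)) == 0);              /* base must be aligned to size */
    return (base | (size / 2 - 1)) >> 2;
}
```

For example, pmp_cfg_napot(PMP_R | PMP_X, 0) describes an unlocked RX entry,
and pmp_addr_napot(base, 0x1000) covers a single 4 KiB page of shared code.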
It's not that simple. Providing shared executable code pages from M-mode
to S-mode means:
1) We will have to define an ABI for entry/exit of functions in these shared
code pages.
2) We will have to define a format for the function table offsets which
M-mode can export to S-mode in the shared code pages themselves.


I'm not very passionate about this; after all, an ecall isn't that
expensive and a DMA sync is not an operation that happens very frequently,
but maybe it's a good opportunity to talk about this approach.

That's what I am thinking. The only additional cost is just an "ecall
and mret".

IMO, there will be a noticeable difference in performance in a vDSO-like
interface where S-mode is trying to read something that M-mode
provides. Thus, the functions in the base function list are likely
candidates [1]. However, the OS makes those SBI calls once during boot.
Thus, it wouldn't benefit that much.
I was thinking that, as part of the extension, we can have an SBI call that
returns the address/length of the shared code region (in physical memory) and
the offset of each function within that region. The OS will make the SBI call
upon registering that SBI extension and will then use the provided function
pointers to execute code directly from the shared region. In a scenario with
a high rate of syncs (lots of packets per second) there will be a noticeable
performance difference between a function call and an SBI call; on the other
hand, in such scenarios I'd expect the coherent API to be used instead of the
non-coherent one.
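
To make the idea a bit more concrete, here is a minimal sketch of how the OS
side might validate an offset returned by such a call before turning it into
a callable address. Nothing here is part of any SBI spec: the struct, its
field names and shared_fn_resolve() are hypothetical, and mapping the region
as executable in the OS address space is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the shared region, as the SBI call might return it. */
struct shared_code_region {
    uint64_t base; /* physical address of the shared code region */
    uint64_t len;  /* length of the region in bytes */
};

/*
 * Hypothetical helper: resolve the physical address of one function from its
 * offset within the region. Returns 0 on success, or -1 if the offset lies
 * outside the region or is not 2-byte aligned (RISC-V instructions are at
 * least 2-byte aligned).
 */
static inline int shared_fn_resolve(const struct shared_code_region *r,
                                    uint64_t offset, uint64_t *addr)
{
    assert(r != NULL && addr != NULL);

    if (offset >= r->len || (offset & 1u))
        return -1;

    *addr = r->base + offset;
    return 0;
}
```

The bounds check matters because the OS would be jumping to whatever address
it derives from the firmware-provided table.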
Let's have a simple SBI DMA sync extension in the SBI v0.4 spec.

The shared code pages between M-mode and S-mode will have their own
challenges, and we will have to define more stuff in the SBI spec to support
this (see above).

It seems the CMO extension might freeze sooner than we think (others can
comment on this). If the CMO extension is frozen by year end then we can
trap-n-emulate CMO instructions instead of using an SBI DMA sync extension.
If it does not freeze by year end then we will have to go ahead with the
SBI DMA sync extension as a stop-gap solution.
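
For the trap-n-emulate option, the first step in an illegal-instruction
handler would be recognizing the trapped instruction. The sketch below
assumes the encodings come out as in the Zicbom extension as eventually
ratified (MISC-MEM opcode, funct3 = 2, rd = 0, imm12 selecting the
operation), which was not settled at the time of writing; decode_cbo() and
the types are hypothetical, and the emulation itself is platform specific
and not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operation selected by imm12 (assumed Zicbom encoding). */
enum cbo_op { CBO_INVAL = 0, CBO_CLEAN = 1, CBO_FLUSH = 2 };

struct cbo_insn {
    enum cbo_op op; /* which cache-block operation was requested */
    unsigned rs1;   /* register number holding the block address */
};

/* Returns 1 and fills *out if insn is cbo.inval/cbo.clean/cbo.flush, else 0. */
static inline int decode_cbo(uint32_t insn, struct cbo_insn *out)
{
    assert(out != NULL);

    uint32_t opcode = insn & 0x7Fu;
    uint32_t rd     = (insn >> 7) & 0x1Fu;
    uint32_t funct3 = (insn >> 12) & 0x7u;
    uint32_t rs1    = (insn >> 15) & 0x1Fu;
    uint32_t imm12  = insn >> 20;

    /* MISC-MEM opcode, funct3 = 2, rd = x0, imm12 in {0, 1, 2}. */
    if (opcode != 0x0Fu || funct3 != 0x2u || rd != 0u || imm12 > 2u)
        return 0;

    out->op = (enum cbo_op)imm12;
    out->rs1 = rs1;
    return 1;
}
```

A handler would read the trapped instruction, call decode_cbo(), perform the
platform's cache operation on the address held in rs1, and then resume after
the instruction.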

Regards,
Anup
