Re: MTIME update frequency

Darius Rad

On Wed, Nov 17, 2021 at 03:50:17PM -0600, Vedvyas Shanbhogue wrote:
On Wed, Nov 17, 2021 at 04:36:10PM -0500, Darius Rad wrote:
On Wed, Nov 17, 2021 at 04:37:21AM +0000, Anup Patel wrote:
Before we go ahead and change the MTIME resolution requirement in the platform spec, I would like to highlight the following points (from past discussions) which led to mandating a fixed MTIME resolution in the platform spec:

1. The Linux kernel creates a clock source on top of the time CSR (a mirror of MMIO MTIME) with the timebase-frequency discovered from the DT. Generic time management in the Linux kernel requires nanosecond granularity, so each value read from the clock source is converted to nanoseconds using a mult and shift (i.e. nanoseconds = (time_csr * mult) >> shift). In other words, the Linux kernel always uses integer operations to convert the X resolution of the time CSR to 1ns resolution, and this conversion will have some round-off error. We could have mandated a fixed 1ns resolution (just like ARM SBSA), but for RISC-V we also need to honour the architectural requirement that all time CSRs be synchronized within 1 tick (i.e. one resolution period), and for multi-socket (or multi-die) systems it becomes challenging to synchronize multiple MTIME counters within 1ns resolution. Considering these facts, it made sense to have a fixed 10ns resolution for MTIME, while the update frequency could be lower than 100MHz. (@Greg Favor<mailto:gfavor@...>, please add if I missed anything)
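To make the mult/shift conversion and its round-off concrete, here is a minimal sketch. It is not the kernel's actual code: the helper names, the shift value of 24, and the 3 MHz example frequency are assumptions chosen only for illustration.

```python
NSEC_PER_SEC = 1_000_000_000
SHIFT = 24  # assumed shift; real clock sources pick their own value


def compute_mult(freq_hz: int, shift: int = SHIFT) -> int:
    """Choose mult so that (ticks * mult) >> shift approximates ticks * 1e9 / freq_hz."""
    return (NSEC_PER_SEC << shift) // freq_hz


def ticks_to_ns(ticks: int, mult: int, shift: int = SHIFT) -> int:
    """Integer-only conversion: nanoseconds = (time_csr * mult) >> shift."""
    return (ticks * mult) >> shift


# A 100 MHz timebase (10 ns per tick) converts exactly with this shift.
exact_ns = ticks_to_ns(100_000_000, compute_mult(100_000_000))

# A 3 MHz timebase (333.33... ns per tick) cannot be represented exactly,
# so one second's worth of ticks comes out slightly short of 1e9 ns.
rounded_ns = ticks_to_ns(3_000_000, compute_mult(3_000_000))

print(exact_ns, rounded_ns)
```

With these assumed values, a timebase whose period is a whole number of nanoseconds converts without error, while a timebase with a fractional period loses a small amount to truncation.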
Considering the requirement that all time CSRs be synchronized to within 1
tick, setting a fixed resolution indirectly makes synchronization much more
difficult for implementations that have a lower update frequency. For such
implementations, since each update is more than 1 tick, it would be
necessary to ensure that all time CSRs always have exactly the same value,
which is considerably more difficult than within 1 tick.
So an implementation that supports a 100MHz clock would need to update mtime by 10 on each tick to meet the 1ns granularity. In the current spec, an implementation that supports a 10MHz clock would need to update mtime by 10 on each tick to meet the 10ns resolution. I am not sure incrementing by 1 vs. incrementing by 10 makes it much harder, as it was already required for a system that implements a 10MHz clock.
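The arithmetic behind those two cases can be sketched as follows. The function name and its parameters are hypothetical, and the sketch assumes the update frequency evenly divides the number of resolution ticks per second.

```python
NSEC_PER_SEC = 1_000_000_000


def increment_per_update(update_hz: int, resolution_ns: int) -> int:
    """Ticks added to mtime on each update for a given clock and tick resolution."""
    ticks_per_second = NSEC_PER_SEC // resolution_ns
    if ticks_per_second % update_hz != 0:
        # Assumption of this sketch: only whole-number increments are modelled.
        raise ValueError("update frequency does not evenly divide the tick rate")
    return ticks_per_second // update_hz


print(increment_per_update(100_000_000, 1))   # 100 MHz clock, 1 ns resolution
print(increment_per_update(10_000_000, 10))   # 10 MHz clock, 10 ns resolution
print(increment_per_update(100_000_000, 10))  # 100 MHz clock, 10 ns resolution
```

The first two calls both yield an increment of 10, which is the comparison being made; the third shows the only case among these where mtime advances one tick at a time.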
The specification says "synchronized to within one tick".

If the time CSR is incremented one tick at a time, then, for every update,
as long as one update propagates to all harts before the next update, this
requirement is met. Some harts may show time T and some may show T+1, but
those are within one tick, which is acceptable.

Now suppose there is a system that increments 10 ticks per update. If the
time value in any hart lags, where some show T and some show T+10, this is
not within one tick, which is not acceptable.

Thus, for systems that update one tick at a time, they have a full tick to
propagate updates to all harts. For systems that update multiple ticks at
a time, they have *no* time to propagate updates to all harts; all updates
must be instantaneous on *all* harts.
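A small model of this argument, as a sketch only: a time source is updated every `period` steps by `increment` ticks, and a lagging hart sees each update `delay` steps late, where the delay is shorter than the update period. All names and the discrete-step model are assumptions made for illustration, not anything taken from the specification.

```python
def max_skew(increment: int, period: int, delay: int, steps: int) -> int:
    """Largest difference, in ticks, between a prompt hart and a lagging hart."""

    def source(t: int) -> int:
        # Value of the time source at step t; it advances once per period.
        return (t // period) * increment if t >= 0 else 0

    # The prompt hart sees source(t); the lagging hart sees source(t - delay).
    return max(source(t) - source(t - delay) for t in range(steps))


one_at_a_time = max_skew(increment=1, period=10, delay=3, steps=100)
ten_at_a_time = max_skew(increment=10, period=10, delay=3, steps=100)
instantaneous = max_skew(increment=10, period=10, delay=0, steps=100)

print(one_at_a_time, ten_at_a_time, instantaneous)
```

In this model the skew equals the increment whenever the propagation delay is nonzero, so an increment of 1 stays within one tick, while an increment of 10 stays within one tick only when the delay is zero.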

CSRs need to have a consistent time at the observation point. The fastest way in most systems to "observe" the value in CSRs is through the cache or through memory. So the difference between the two CSRs should not be observable to the extent that the following test fails:

Hart-X (sends a message to Hart-Y):
    Read time
    Write timestamp message to memory

Hart-Y (receives the message):
    Read timestamp message from memory (gets the data from cache or memory)
    Read time
    Compare timestamp to time (the timestamp should not be in the future)

If this test passes, then the fact that the CSR in Hart-Y is 2 ticks behind or ahead of Hart-X is not observable.
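The test described above can be modelled deterministically, as a sketch under simplifying assumptions: each hart's time CSR is a shared reference time plus a fixed per-hart offset, and the message takes a fixed number of ticks to travel through cache or memory. The function and parameter names are hypothetical.

```python
def timestamp_in_future(offset_x: int, offset_y: int, latency: int,
                        send_time: int = 1000) -> bool:
    """Return True if Hart-Y reads a timestamp that is later than its own time."""
    # Hart-X reads its time CSR and writes that value as the timestamp.
    timestamp = send_time + offset_x
    # Hart-Y reads the message `latency` ticks later, then reads its own time CSR.
    receive_time = send_time + latency + offset_y
    return timestamp > receive_time


# Hart-X is 2 ticks ahead of Hart-Y, but the message takes 5 ticks to arrive:
hidden = timestamp_in_future(offset_x=2, offset_y=0, latency=5)

# The same 2-tick skew with a 1-tick message latency becomes visible:
visible = timestamp_in_future(offset_x=2, offset_y=0, latency=1)

print(hidden, visible)
```

In this model the skew is observable through the message only when it exceeds the communication latency between the two harts.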
That is perhaps a useful test, but not the only way to observe the clocks,
nor is it how the accuracy is specified in the architecture. The
architecture does not say the accuracy is with respect to what is
observable in another hart. The time could, for example, be observed by
something external to both harts and with higher precision.

Setting the granularity to 1 ns provides a path for RISC-V implementations
that want fine resolution to achieve that goal without penalizing
implementations that do not.

