Re: MTIME update frequency

Ved Shanbhogue

On Fri, Nov 19, 2021 at 10:36:18AM -0500, Darius Rad wrote:
On Fri, Nov 19, 2021 at 08:59:32AM -0600, Vedvyas Shanbhogue wrote:
On Wed, Nov 17, 2021 at 04:37:12PM -0600, Vedvyas Shanbhogue via wrote:
On Wed, Nov 17, 2021 at 05:30:05PM -0500, Darius Rad wrote:
On Wed, Nov 17, 2021 at 03:50:17PM -0600, Vedvyas Shanbhogue wrote:
On Wed, Nov 17, 2021 at 04:36:10PM -0500, Darius Rad wrote:
On Wed, Nov 17, 2021 at 04:37:21AM +0000, Anup Patel wrote:
Before we go ahead and change the MTIME resolution requirement in the platform spec, I would like to highlight the following points (from past discussions) which led to mandating a fixed MTIME resolution in the platform spec:

1. The Linux kernel creates a clock source on top of the time CSR (a mirror of MMIO MTIME) with the timebase-frequency discovered from DT. The generic time management in the Linux kernel requires nanosecond granularity, so each value read from the clock source is converted to nanoseconds using a mult and a shift (i.e. nanoseconds = (time_csr * mult) >> shift). In other words, the Linux kernel always uses integer operations to convert the X resolution of the time CSR to 1ns resolution, and this conversion will have some round-off error. We could have mandated a fixed 1ns resolution (just like ARM SBSA), but for RISC-V we also need to honour the architectural requirement that all time CSRs be synchronized to within 1 tick (i.e. one resolution period), and for multi-socket (or multi-die) systems it becomes challenging to synchronize multiple MTIME counters to within 1ns resolution. Considering these facts, it made sense to have a fixed 10ns resolution for MTIME while allowing the update frequency to be lower than 100MHz. (@Greg Favor<mailto:gfavor@...>, please add if I missed anything)
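
For illustration, the mult/shift conversion described above can be sketched as follows. This is a simplified model and not the kernel's actual code: the function names, the fixed shift of 24, and the 3 MHz example frequency are assumptions chosen only to show where round-off can appear.

```python
NSEC_PER_SEC = 1_000_000_000


def calc_mult(freq_hz: int, shift: int) -> int:
    """Fixed-point scale factor: nanoseconds per tick times 2**shift, rounded."""
    return ((NSEC_PER_SEC << shift) + freq_hz // 2) // freq_hz


def ticks_to_ns(ticks: int, mult: int, shift: int) -> int:
    """Integer-only conversion: ns = (ticks * mult) >> shift."""
    return (ticks * mult) >> shift


SHIFT = 24  # assumed value, for illustration only

# 100 MHz timebase (10 ns per tick): the scale factor is exact.
mult_100mhz = calc_mult(100_000_000, SHIFT)
ns_exact_case = ticks_to_ns(12_345, mult_100mhz, SHIFT)

# Hypothetical 3 MHz timebase (333.33... ns per tick): mult is rounded,
# so one second's worth of ticks does not convert to exactly 1e9 ns.
mult_3mhz = calc_mult(3_000_000, SHIFT)
ns_rounded_case = ticks_to_ns(3_000_000, mult_3mhz, SHIFT)
roundoff_ns = NSEC_PER_SEC - ns_rounded_case
```

With a timebase whose period is a whole number of nanoseconds, the conversion in this sketch is exact; otherwise the rounded multiplier introduces a small error that grows with the tick count.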
Considering the requirement that all time CSRs be synchronized to within 1
tick, setting a fixed resolution indirectly makes synchronization much more
difficult for implementations that have a lower update frequency. For such
implementations, since each update is more than 1 tick, it would be
necessary to ensure that all time CSRs always have exactly the same value,
which is considerably more difficult than within 1 tick.
So an implementation that supports a 100 MHz clock would need to update mtime by 10 on each tick to meet the 1ns granularity. In the current spec, an implementation that supports a 10 MHz clock would need to update mtime by 10 on each tick to meet the 10ns resolution. I am not sure incrementing by 1 vs. incrementing by 10 makes it much harder, as that was already required for a system that implements a 10 MHz clock.
The specification says "synchronized to within one tick".

If the time CSR is incremented one tick at a time, then, for every update,
as long as one update propagates to all harts before the next update, this
requirement is met. Some harts may show time T and some may show T+1, but
those are within one tick, which is acceptable.

Now suppose there is a system that increments 10 ticks per update. If the
time value in any hart lags, where some show T and some show T+10, this is
not within one tick, which is not acceptable.

Thus, for systems that update one tick at a time, they have a full tick to
propagate updates to all harts. For systems that update multiple ticks at
a time, they have *no* time to propagate updates to all harts; all updates
must be instantaneous on *all* harts.
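
The argument above can be checked with a toy model. This is only a sketch under assumed conditions: each hart is modeled as seeing the shared counter with a fixed propagation delay that is shorter than the update period, and the function names and delay values are hypothetical.

```python
def observed_time(now_ns: int, period_ns: int, increment: int, delay_ns: int) -> int:
    """Counter value a hart sees at now_ns when updates reach it delay_ns late."""
    if now_ns < delay_ns:
        return 0  # no update has reached this hart yet
    return ((now_ns - delay_ns) // period_ns) * increment


def max_skew(period_ns: int, increment: int, delays_ns: list[int], horizon_ns: int) -> int:
    """Largest difference, in ticks, between any two harts over the horizon."""
    worst = 0
    for now in range(horizon_ns):
        views = [observed_time(now, period_ns, increment, d) for d in delays_ns]
        worst = max(worst, max(views) - min(views))
    return worst


delays = [0, 4]  # assumed: the second hart sees each update 4 ns late

# 10 ns resolution, updated every 10 ns, one tick per update.
skew_one_tick_updates = max_skew(10, 1, delays, 1000)

# 10 ns resolution, updated every 100 ns, ten ticks per update.
skew_ten_tick_updates = max_skew(100, 10, delays, 1000)

# Ten ticks per update stays within one tick only if there is no lag at all.
skew_ten_tick_no_lag = max_skew(100, 10, [0, 0], 1000)
```

In this model the same 4 ns lag leaves the harts within one tick when the counter advances one tick per update, but ten ticks apart when it advances ten ticks per update.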
I agree. But that requirement is there already for a system that implements a 10 MHz clock and has to support 10ns resolution. If 10ns vs. 1ns is really a sticking point, could we have two values supported by the platform specification, similar to the range of update frequencies supported? We can leave it to the end user whether they want to put these two classes of systems in one migration pool.
Did that address your concern? Should we update to supporting a 1 ns resolution, or at least not disallow a 1 ns resolution?
No, it does not address my concern.

As you said, that requirement was already there, and my concern applies to
the requirement as is, although increasing the resolution will exacerbate
the problem.

When the specification mandates a specific update resolution, multihart
implementations will have to choose between (1) updating one tick at a
time, or (2) very stringent synchronization between all harts. I suspect
implementations will be forced to choose (1), even if they otherwise would
have preferred to have a different update frequency or clock. Or they will
simply ignore the synchronization requirement in the ISA specification,
which I think is a real possibility.

The current requirement effectively forces implementations to have a 100
MHz clock. If the resolution is changed to 1 ns, that becomes a 1 GHz
clock that implementations are forced to have. Either way, this violates
the Platform Policy that says platforms should not mandate performance, but
certainly the higher clock rate is more of a burden.
The current specification says that the MTIME values seen by two harts must not be apart by more than 1. It does not say that MTIME must always increment in units of 1. I do not think the specification mandates incrementing by one on each clock tick. Presently it says the tick (the update frequency) can be 100 MHz or 10 MHz or somewhere in between. If the update frequency is 10 MHz, then the MTIME increment per clock must be 10. If the update frequency is 100 MHz, then the MTIME increment per clock is 1. So is your concern that an adder that adds 10 to MTIME per clock tick is hard?
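
As a worked example of the arithmetic in this thread, the increment per update follows from the resolution and the update frequency. This is a minimal sketch; the helper name is hypothetical, and it assumes the update period is a whole number of ticks.

```python
NSEC_PER_SEC = 1_000_000_000


def increment_per_update(resolution_ns: int, update_hz: int) -> int:
    """Ticks that must be added to MTIME on each update of the counter."""
    if NSEC_PER_SEC % resolution_ns:
        raise ValueError("resolution must divide one second evenly")
    ticks_per_sec = NSEC_PER_SEC // resolution_ns
    if ticks_per_sec % update_hz:
        raise ValueError("update period is not a whole number of ticks")
    return ticks_per_sec // update_hz


# The four combinations discussed in the thread.
inc_10ns_100mhz = increment_per_update(10, 100_000_000)
inc_10ns_10mhz = increment_per_update(10, 10_000_000)
inc_1ns_100mhz = increment_per_update(1, 100_000_000)
inc_1ns_10mhz = increment_per_update(1, 10_000_000)
```

Only the first combination advances one tick at a time; the others advance 10, 10, and 100 ticks per update respectively.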

