Re: MTIME update frequency

Ved Shanbhogue

On Wed, Nov 17, 2021 at 04:37:21AM +0000, Anup Patel wrote:
Before we go ahead and change the MTIME resolution requirement in the platform spec, I would like to highlight following points (from past discussions) which led to mandating a fixed MTIME resolution in the platform spec:

1. The Linux kernel creates a clock source on-top-of time CSR (mirror of MMIO MTIME) with timebase-frequency discovered from DT. The generic time management in Linux kernel requires nanoseconds granularity so each value read from clock source is converted to nanoseconds using a mult and shift (i.e. nanoseconds = (time_csr * multi) >> shift)). In other words, Linux kernel always uses integer operation to convert X resolution of time CSR to 1ns resolution and this conversion will have some round-off errors. We could have mandated a fixed 1ns resolution (just like ARM SBSA) but for RISC-V we also need to honour the architectural requirement of all time CSRs to be synchronized within 1 tick (i.e. one resolution period) and for multi-sockets (or multi-die) systems it becomes challenging to synchronize multiple MTIME counters within 1ns resolution. Considering this facts, it made sense to have fixed 10ns resolution for MTIME but the update frequency could be lower than 100MHz. (@Greg Favor<mailto:gfavor@...>, please add if I missed anything)
So synchronizing time between hart, dies or multiple sockets is an engineering problem. The architecture should not restrict the implementation to achieve the 1ns resolution. Synchronizing such counters even at much higher frequencies has been acheived in several implementations.

I find Lamport's paper is a good reference on this topic of time, clocks and ordering of events in a distributed system. The goal of such synchronization would be for the the system to acheive the property that - if an event a occurs before event b then a should happen at an earlier time than b. If that needs bounding to 1 tick or bounding to the latency of a fastest hart-to-hart transfer should be an engineering problem.

Once the resolution is determined I am not sure the upper bound on update frequency needs to be bounded.Of course its not very useful for an implementation have an update frequency higher than the resolution and implementations could pick the increments per update as the ratio between 1G and the implemented clock (instead of ration between 100M and implemented clock as in current draft).

2. It is common in enterprise clouds to migrate a Guest/VM across different hosts. Considering the diversity in RISC-V world, we have to support migration of Guest/VM from host A to host B where these hosts can be from different vendors. Now if host A and host B have different MTIME resolution then Guest/VM migration will not work and this problem also applies to ARM world. This is another reason why ARM SBSA mandates a fixed timer resolution. Although, ARM world standardized timer frequency quite late in the game but RISC-V world can avoid these problems by standardizing MTIME resolution quite early. Alternatively, there is also SBI para-virt call possible to detect change in MTIME resolution but this would mean additional code (along with barriers) in the path of reading time CSR (as an example, look at KVM para-virt clock used in x86 world).
3. Requiring a fixed MTIME resolution in the platform spec would mean that some of the existing platforms won’t comply but the platform spec is supposed to be forward looking. If we don’t standardize a fixed MTIME resolution now then it is likely that we will end-up standardizing it in future platform spec.

Based on above points, I still think mandating fixed MTIME resolution is desirable for OS-A platforms and this has nothing to do with how timebase-frequency is discovered (i.e. DT/ACPI).
If we do want to fix the resolution then the 10ns is too coarse. I suggest we make it at least 1ns to address the requirements of systems many of us are building. With 10ns resolution it will not be competitive against ARM systems that as you noted have fixed resolution to 1ns or x86 where TSC operates at P0 frequency.


Join to automatically receive all group messages.