Watchdog Spec Questions
Hi,
I have some questions related to the Watchdog spec found here: https://github.com/riscv-non-isa/riscv-watchdog/blob/main/riscv-watchdog.adoc
1. The spec goes to great lengths to describe the watchdog tick frequency in terms of MTIME frequency and a bit position within MTIME serving as a divider. However:
"The choice of MTIME resolution and MTIME bit position for watchdog tick is platform specific and these parameters should be discoverable by software via platform-specific means. It is recommended that these parameters are chosen so as to provide a watchdog tick resolution between 0.1 sec and 1 sec, ensuring a maximum timeout period (WTOCNT=0x3FF) greater than 100 seconds."
If the effective watchdog tick frequency is platform specific then it is my opinion that the only thing that should be specified as parameters for the watchdog block is its tick frequency. Why complicate it with an assumption of the backing clock when all that matters is the effective watchdog frequency. Guidance is already provided w.r.t. expectations. I suggest we simplify the spec. Existing implementations can provide their own determinism of the frequency by exposing some bit and mtime frequency, but that does not need to be true for all implementations.
2. We have the following statements w.r.t. WTOCNT:
"The 10-bit WTOCNT value initializes a 10-bit timeout counter."
"If timeout counter is now zero then
if S1WTO=1 then set S2WTO
else set S1WTO and re-initialize the timeout counter with WTOCNT"
There's a 'timeout counter' term being used. Is the intention that this timeout counter value actually be hidden? i.e. there's no way to read the timeout counter itself? If so, why is this timeout counter not a part of the register set?
Thanks.
-Aaron
Apologies for the delay in response, I am traveling currently.
From: <tech-unixplatformspec@...> on behalf of Aaron Durbin <adurbin@...>
Date: Friday, 8 October 2021 at 8:04 AM
To: "tech-unixplatformspec@..." <tech-unixplatformspec@...>
Subject: [RISC-V] [tech-unixplatformspec] Watchdog Spec Questions
Hi,
I have some questions related to the Watchdog spec found here: https://github.com/riscv-non-isa/riscv-watchdog/blob/main/riscv-watchdog.adoc
1. The spec goes to great lengths to describe the watchdog tick frequency in terms of MTIME frequency and a bit position within MTIME serving as a divider. However:
"The choice of MTIME resolution and MTIME bit position for watchdog tick is platform specific and these parameters should be discoverable by software via platform-specific means. It is recommended that these parameters are chosen so as to provide a watchdog tick resolution between 0.1 sec and 1 sec, ensuring a maximum timeout period (WTOCNT=0x3FF) greater than 100 seconds."
If the effective watchdog tick frequency is platform specific then it is my opinion that the only thing that should be specified as parameters for the watchdog block is its tick frequency. Why complicate it with an assumption of the backing clock when all that matters is the effective watchdog frequency. Guidance is already provided w.r.t. expectations. I suggest we simplify the spec. Existing implementations can provide their own determinism of the frequency by exposing some bit and mtime frequency, but that does not need to be true for all implementations.
[Anup] The MTIME frequency is already available to software (via DT or ACPI) so software only needs to know the MTIME bit position used for watchdog tick. This means we can either provide MTIME bit position OR effective watchdog tick frequency in DT or ACPI. We choose MTIME bit position to be available in DT or ACPI because:
- It describes the watchdog HW better compared to having software derive MTIME bit position from effective watchdog tick frequency.
- Knowing the MTIME bit position also helps software do adjustments in WTOCNT value because the first watchdog tick is mostly a partial tick since MTIME is a free running counter independent of watchdog timeout counter.
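The arithmetic behind these two points can be sketched as follows. This is only an illustration under stated assumptions: it assumes one watchdog tick per 0-to-1 transition of the selected MTIME bit (so the tick period is 2^(bit+1) MTIME counts), which the thread does not spell out, and the function names and the 10 MHz / bit 20 values are hypothetical.

```python
import math

WTOCNT_MAX = 0x3FF  # 10-bit WTOCNT field, per the quoted spec text


def tick_period_s(mtime_hz: int, bit: int) -> float:
    """Seconds between watchdog ticks.

    Assumption: a tick is generated on each 0->1 transition of MTIME[bit],
    which happens once every 2**(bit + 1) MTIME counts.
    """
    return (1 << (bit + 1)) / mtime_hz


def wtocnt_for(timeout_s: float, mtime_hz: int, bit: int) -> int:
    """Smallest WTOCNT that guarantees at least timeout_s per stage."""
    ticks = math.ceil(timeout_s / tick_period_s(mtime_hz, bit))
    # MTIME runs freely, so the first tick after programming WTOCNT may
    # arrive almost immediately; add one tick to cover that partial tick.
    ticks += 1
    if ticks > WTOCNT_MAX:
        raise ValueError("timeout exceeds the 10-bit WTOCNT range")
    return ticks


# Hypothetical platform: 10 MHz MTIME, watchdog tick taken from bit 20.
period = tick_period_s(10_000_000, 20)
print(f"tick period: {period:.4f} s, max timeout: {WTOCNT_MAX * period:.1f} s")
print("WTOCNT for a 30 s timeout:", wtocnt_for(30, 10_000_000, 20))
```

With these hypothetical values the tick period is about 0.21 s and the maximum timeout is about 214 s, which falls inside the range the spec text recommends. Software that is given only the effective tick frequency could do the same calculation with the period taken directly as 1/frequency.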
2. We have the following statements w.r.t. WTOCNT:
"The 10-bit WTOCNT value initializes a 10-bit timeout counter."
"If timeout counter is now zero then
if S1WTO=1 then set S2WTO
else set S1WTO and re-initialize the timeout counter with WTOCNT"
There's a 'timeout counter' term being used. Is the intention that this timeout counter value actually be hidden? i.e. there's no way to read the timeout counter itself? If so, why is this timeout counter not a part of the register set?
[Anup] We did not see any use-case for exposing “timeout counter” as separate registers from Linux watchdog framework perspective. If it is useful then we can certainly add read-only “timeout counter” register.
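The two-stage behaviour quoted above can be modelled in a few lines. This is a sketch, not spec text: the class name `WatchdogModel`, the choice to stop counting once S2WTO is set, and the absence of any refresh or flag-clearing path are assumptions made for illustration. The `counter` attribute is the internal "timeout counter" that the question is about.

```python
from dataclasses import dataclass, field


@dataclass
class WatchdogModel:
    wtocnt: int                       # 10-bit reload value
    counter: int = field(init=False)  # internal timeout counter
    s1wto: bool = False               # first-stage timeout flag
    s2wto: bool = False               # second-stage timeout flag

    def __post_init__(self) -> None:
        # "The 10-bit WTOCNT value initializes a 10-bit timeout counter."
        self.counter = self.wtocnt & 0x3FF

    def tick(self) -> None:
        """Apply one watchdog tick."""
        if self.s2wto:
            return  # assumption: nothing further happens after stage 2
        if self.counter > 0:
            self.counter -= 1
        if self.counter == 0:
            if self.s1wto:
                self.s2wto = True
            else:
                self.s1wto = True
                self.counter = self.wtocnt & 0x3FF


wd = WatchdogModel(wtocnt=3)
for _ in range(3):
    wd.tick()
print(wd.s1wto, wd.s2wto, wd.counter)  # first stage has fired, counter reloaded
for _ in range(3):
    wd.tick()
print(wd.s1wto, wd.s2wto)              # second stage has fired
```

In this model a read-only view of `counter` would let software watch the value decrease between ticks, which is the debugging use mentioned later in the thread.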
Regards,
Anup
Thanks.
-Aaron
On Thu, Oct 7, 2021 at 7:34 PM Aaron Durbin <adurbin@...> wrote:
> If the effective watchdog tick frequency is platform specific then it is my opinion that the only thing that should be specified as parameters for the watchdog block is its tick frequency. Why complicate it with an assumption of the backing clock when all that matters is the effective watchdog frequency. Guidance is already provided w.r.t. expectations. I suggest we simplify the spec. Existing implementations can provide their own determinism of the frequency by exposing some bit and mtime frequency, but that does not need to be true for all implementations.
I agree that all that needs to be discoverable is the watchdog tick period or frequency. Further, if a platform spec requires a specific frequency, then even that doesn't need to be discoverable.
> There's a 'timeout counter' term being used. Is the intention that this timeout counter value actually be hidden? i.e. there's no way to read the timeout counter itself? If so, why is this timeout counter not a part of the register set?
There is no specific intention to hide the timeout counter. There just hasn't appeared to be a justifiable need yet to support reading of it.
Greg
> There is no specific intention to hide the timeout counter. There just hasn't appeared to be a justifiable need yet to support reading of it.
I think it's informative in that one can read the current timeout counter value and observe it tick. That has been useful in the past from my experience.
On Thu, Oct 14, 2021 at 5:36 AM Aaron Durbin <adurbin@...> wrote:
> > There is no specific intention to hide the timeout counter. There just hasn't appeared to be a justifiable need yet to support reading of it.
> I think it's informative in that one can read the current timeout counter value and observe it tick. That has been useful in the past from my experience.
This raises the question of how to represent or virtualize the current timeout counter value (with, of course, trap and emulate by a hypervisor) - as well as now needing to do this. I've heard some people argue for a clean "one-way" (write-only) interface by an OS to a watchdog for this reason.
Greg
I don't think doing trap-and-emulate to handle reads of the watchdog timer would be much of a problem for a hypervisor. That functionality would be needed for emulating almost any other common device. To be somewhat flippant, if someone is thinking of emulating this watchdog timer and it is anywhere close to the most complex device they're emulating, then something is probably going seriously wrong.
> On Thu, Oct 14, 2021 at 10:14 AM Jonathan Behrens <behrensj@...> wrote:
> > I don't think doing trap-and-emulate to handle reads of the watchdog
> > timer would be much of a problem for a hypervisor. That
> > functionality would be needed for emulating almost any other common
> > device. To be somewhat flippant, if someone is thinking of emulating
> > this watchdog timer and it is anywhere close to the most complex
> > device they're emulating, then something is probably going seriously
> > wrong.
> I agree that M-mode software could emulate the idea of a counter and its continual changes in value, but that's just added software complexity that doesn't seem warranted.
> Also, T&E to M-mode raises concerns about a dependency on M-mode still being healthy while S-mode is not. In some cases, M-mode will still be healthy; in other cases M and S modes have become unhealthy and the watchdog functionality becomes disabled for the very cases that it is intended to be functional for.
I am missing why M-mode needs to trap and emulate a virtual watchdog device access from a guest OS. That should be a hypervisor function.
A hypervisor that passes through a virtual watchdog device to its guest has to trap and emulate the virtual watchdog device anyway - including having a virtual watchdog timer that is emulated by the hypervisor.
regards
ved
> On Thu, Oct 14, 2021 at 12:32 PM Vedvyas Shanbhogue <ved@...> wrote:
> > I am missing why M-mode needs to trap and emulate a virtual watchdog
> > device access from a guest OS. That should be a hypervisor function.
> I was referring to T&E of a host watchdog (from S/HS-mode to M-mode).
Thanks. Sorry, I missed that your response was orthogonal and was not a response to Jonathan's comment that it's not a problem for the hypervisor to emulate.
I agree with you that I don't see under what circumstance M-mode would want to trap and emulate a host watchdog so M-mode complexity to T&E should be orthogonal to this topic.
> > A hypervisor that passes through a virtual watchdog device to its guest
> > has to trap and emulate the virtual watchdog device anyway - including
> > having a virtual watchdog timer that is emulated by the hypervisor.
> That isn't so clear as being the case in some or all hypervisors (i.e. that the watchdog device will be faithfully emulated). Just like all the power management stuff that a guest OS does is not simply 1-for-1 emulated literally by a hypervisor.
I agree. It's an emulation and is only as good as the code written for emulation. It's not a hard device to emulate, however, considering everything else a hypervisor may need to emulate, as Jonathan said earlier.
regards
ved
> On 10/14/21 2:47 PM, Greg Favor wrote:
> > On Thu, Oct 14, 2021 at 12:32 PM Vedvyas Shanbhogue <ved@...> wrote:
> >
> > I am missing why M-mode needs to trap and emulate a virtual watchdog
> > device access from a guest OS. That should be a hypervisor function.
> >
> > I was referring to T&E of a host watchdog (from S/HS-mode to M-mode).
> >
> Thanks. Sorry, I missed that your response was orthogonal and was not a
> response to Jonathan's comment that it's not a problem for the hypervisor to
> emulate.
>
> I agree with you that I don't see under what circumstance M-mode would want
> to trap and emulate a host watchdog so M-mode complexity to T&E should be
> orthogonal to this topic.
>
> > A hypervisor that passes through a virtual watchdog device to its guest
> > has to trap and emulate the virtual watchdog device anyway - including
> > having a virtual watchdog timer that is emulated by the hypervisor.
> >
> > That isn't so clear as being the case in some or all hypervisors (i.e.
> > that the watchdog device will be faithfully emulated). Just like all
> > the power management stuff that a guest OS does is not simply 1-for-1
> > emulated literally by a hypervisor.
> >
> I agree. It's an emulation and is only as good as the code written for
> emulation. It's not a hard device to emulate, however, considering everything
> else a hypervisor may need to emulate, as Jonathan said earlier.
Will the watchdog timer have hardware support for scaling and offsetting
the virtual watchdog timer? Guest timekeeping is quite complex when one
considers a guest may be paused at any time and migrated to other hosts
which have different clock frequencies.
Thanks,
drew
> regards
> ved
Is it intended/required that S1WTO and S2WTO be literal interrupts? In particular, it might be desirable for S2WTO to actually be an NMI or reset to recover a system that is not healthy enough to handle the S1WTO.
> On Thu, Oct 28, 2021 at 12:20 PM Phil McCoy via lists.riscv.org <pnm=computer.org@...> wrote:
> > Is it intended/required that S1WTO and S2WTO be literal interrupts? In particular, it might be desirable for S2WTO to actually be an NMI or reset to recover a system that is not healthy enough to handle the S1WTO.
> I agree. The text needs to describe the two timeout signals in more generic terms as simply signals indicating the occurrence of a timeout. It is up to a platform or implementation to decide what it does with those signals. (Typically the first-stage timeout would be a hart interrupt request directed to the OS to give it a chance to recover or gracefully react to the timeout condition, and the second-stage timeout would go somewhere else (whether as an "interrupt request", NMI, or whatever), e.g. to M-mode on a hart, or a platform microcontroller, or a hardware system block, or a BMC, or ...).
The watchdog spec does not provide a means to configure an MSI destination for the S1WTO. Is that planned/discussed?
regards
ved
> Is it intended/required that S1WTO and S2WTO be literal interrupts? In particular, it might be desirable for S2WTO to actually be an NMI or reset to recover a system that is not healthy enough to handle the S1WTO.
I agree. The text needs to describe the two timeout signals in more generic terms as simply signals indicating the occurrence of a timeout. It is up to a platform or implementation to decide what it does with those signals. (Typically the first-stage timeout would be a hart interrupt request directed to the OS to give it a chance to recover or gracefully react to the timeout condition, and the second-stage timeout would go somewhere else (whether as an "interrupt request", NMI, or whatever), e.g. to M-mode on a hart, or a platform microcontroller, or a hardware system block, or a BMC, or ...).
Greg
Even the first-stage timeout would probably be more useful to the OS if it was a "non-maskable interrupt" or otherwise able to arrive even with sstatus.SIE bit unset.
If the system has locked up to the point that the watchdog timer is expiring, that probably means that normal timer interrupts aren't arriving at the OS. That could be caused either by stimecmp being unset or misconfigured (unlikely), or by the OS having interrupts disabled while it is blocked or stuck in an infinite loop.
On Thu, Oct 14, 2021 at 03:00:30PM -0500, Vedvyas Shanbhogue wrote:
> On 10/14/21 2:47 PM, Greg Favor wrote:
> > On Thu, Oct 14, 2021 at 12:32 PM Vedvyas Shanbhogue <ved@...
> > <mailto:ved@...>> wrote:
> >
> > I am missing why M-mode needs to trap and emulate a virtual watchdog
> > device access from a guest OS. That should be a hypervisor function.
> >
> >
> > I was referring to T&E of a host watchdog (from S/HS-mode to M-mode).
> >
> Thanks. Sorry, I missed that your response was orthogonal and was not a
> response to Jonathan's comment that it's not a problem for the hypervisor to
> emulate.
>
> I agree with you that I don't see under what circumstance M-mode would want
> to trap and emulate a host watchdog so M-mode complexity to T&E should be
> orthogonal to this topic.
>
>
> > A hypervisor that passes through a virtual watchdog device to its guest
> > has to trap and emulate the virtual watchdog device anyway - including
> > having a virtual watchdog timer that is emulated by the hypervisor.
> >
> >
> > That isn't so clear as being the case in some or all hypervisors (i.e.
> > that the watchdog device will be faithfully emulated). Just like all
> > the power management stuff that a guest OS does is not simply 1-for-1
> > emulated literally by a hypervisor.
> >
> I agree. It's an emulation and is only as good as the code written for
> emulation. It's not a hard device to emulate, however, considering everything
> else a hypervisor may need to emulate, as Jonathan said earlier.
Will the watchdog timer have hardware support for scaling and offsetting
the virtual watchdog timer? Guest timekeeping is quite complex when one
considers a guest may be paused at any time and migrated to other hosts
which have different clock frequencies.
[Anup] Apologies for missing this question.
[Anup] The watchdog timer does not define any scaling and offsetting
support for virtualization because current expectation is that hypervisors
will emulate Guest/VM watchdog totally in software.
[Anup] As suggested in this email thread, it's better to expose only
effective watchdog tick frequency via DT/ACPI. This will further simplify
Guest/VM migration because hypervisors can emulate a fixed watchdog
tick frequency on both source and destination hosts irrespective of the
MTIME frequency on these hosts.
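One way a hypervisor might keep a fixed guest tick frequency across hosts is sketched here. Everything in it is hypothetical (the class name `EmulatedTickSource`, the choice to accumulate guest run time in nanoseconds, and the example frequencies); the thread only states that emulation is expected to be done entirely in software.

```python
NS_PER_S = 1_000_000_000


class EmulatedTickSource:
    """Converts host MTIME progress into guest watchdog ticks at a fixed rate."""

    def __init__(self, guest_tick_hz: int) -> None:
        self.guest_tick_hz = guest_tick_hz  # tick frequency advertised to the guest
        self.run_ns = 0                     # guest run time accumulated so far
        self.delivered = 0                  # ticks already handed to the guest watchdog

    def advance(self, host_delta_counts: int, host_mtime_hz: int) -> int:
        """Account for host MTIME counts that elapsed while the guest ran.

        Returns the number of new guest watchdog ticks that are now due.
        Time spent paused is simply never passed in, so it is not counted.
        """
        self.run_ns += host_delta_counts * NS_PER_S // host_mtime_hz
        total = self.run_ns * self.guest_tick_hz // NS_PER_S
        new_ticks = total - self.delivered
        self.delivered = total
        return new_ticks


src = EmulatedTickSource(guest_tick_hz=4)      # guest sees a 0.25 s tick
print(src.advance(10_000_000, 10_000_000))     # 1 s on a 10 MHz host
# After migration, the same state continues on a 25 MHz host:
print(src.advance(12_500_000, 25_000_000))     # 0.5 s on the new host
```

Because the guest-visible state is only `run_ns` and `delivered`, it can be carried along in a migration without reference to either host's MTIME frequency.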
Regards,
Anup
Thanks,
drew
>
> regards
> ved