Re: [PATCH v1] System Peripherals - watchdog timer

Greg Favor

On Thu, Jul 8, 2021 at 8:24 PM Abner Chang <renba.chang@...> wrote:
If the watchdog timer remains un-refreshed for a second period,
+then a second-stage watchdog timeout occurs.
Does this mean the second-stage watchdog would time out 1 second after the first-stage watchdog timer has gone un-refreshed?

What the sentence is actually referring to is the following:

The first watchdog timeout is based on a first programmable timeout period, and the second watchdog timeout is based on a second, separately programmable timeout period. In other words, "a second period" means another period, not a period of one second.
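To make the two-period behavior concrete, here is a minimal sketch of the timing model described above. It is only an illustration: the class name, field names, tick units, and the assumption that a refresh restarts both stages are all hypothetical and are not taken from the patch or from any register definition.

```python
from dataclasses import dataclass


@dataclass
class TwoStageWatchdog:
    """Toy timing model of a two-stage watchdog (hypothetical, not a spec interface)."""
    first_period: int    # ticks until the first-stage timeout (assumed unit)
    second_period: int   # further ticks until the second-stage timeout
    elapsed: int = 0     # ticks since the last refresh

    def refresh(self) -> None:
        """Software services the watchdog; assumed here to restart both stages."""
        self.elapsed = 0

    def tick(self, ticks: int = 1) -> str:
        """Advance time by `ticks` and report the resulting watchdog state."""
        self.elapsed += ticks
        if self.elapsed >= self.first_period + self.second_period:
            return "second-stage timeout"
        if self.elapsed >= self.first_period:
            return "first-stage timeout"
        return "running"


if __name__ == "__main__":
    # The two periods are programmed independently of each other.
    wdt = TwoStageWatchdog(first_period=5, second_period=3)
    print(wdt.tick(4))   # still inside the first period
    print(wdt.tick(1))   # first period has expired
    print(wdt.tick(3))   # second period has expired as well
    wdt.refresh()
    print(wdt.tick(1))   # refreshed, counting from zero again
```

The point of the sketch is only that the second-stage timeout occurs `second_period` after the first-stage timeout, and that this period is set separately from `first_period`.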

On server platforms, firmware usually refreshes the watchdog timer in the background, even while the OS is running. When the watchdog interrupt is triggered, the firmware takes over, sends the log to the BMC, and asks the BMC to either shut down or reset the system, depending on the scenario. We would rather not have the OS or hypervisor in the middle delaying system recovery, for example when the OS or processor has somehow halted and the temperature in the box is rising rapidly.
Is 1 second too long for crisis recovery? Should the first-stage watchdog timer be configurable as either an M-mode or S-mode interrupt, so that firmware can handle the crisis immediately?

Note that we're spec'ing behavior similar to what other specs, like SBSA, define. Also note that SBSA, for example, has both non-secure and secure watchdogs. The watchdog above is equivalent to a non-secure watchdog, which is controlled by non-secure software, i.e. the OS or hypervisor. I think what you're thinking of corresponds to a secure watchdog timer, which is controlled by secure software (analogous to M-mode software).
