Re: Watchdog timer per hart?

Aaron Durbin

On Wed, Mar 2, 2022 at 12:35 AM James Robinson <jrobinson@...> wrote:
Hi Greg,

Thanks for your response. I'm not sure if I'm missing a connection between having a supervisor-level watchdog timer and having a timer per hart, but I wasn't particularly imagining a distinction between machine-mode and supervisor-mode watchdog timers. I'll re-pose the question I was thinking about:

Suppose I have a system containing 16 harts. Should I have a separate WDCSR memory mapped register and associated counter for each of the 16 harts, with each counter directing an interrupt to its associated hart if it is not reset before the timeout expires? Or should I have one WDCSR memory mapped register and associated counter for the whole system, with the interrupt directed to one specific hart, and that hart being responsible for responding to a lack of timer update?

If one is operating the machine with 16 harts without any sharding or partitioning, I don't see why one would need a watchdog per hart. System watchdogs, or TCO timers in another architecture's parlance, are for system use. Now, a core would normally have its own watchdog for instruction-retirement forward-progress purposes, but that's a completely different use case from the intention of a system-level watchdog.
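
To make the single-watchdog model concrete, here is a rough sketch in C of how I picture it behaving: one counter for the whole system, any hart allowed to restart it, and the timeout interrupt routed to one designated hart. Everything here is an assumption for illustration only (the struct, its field names, and the `wdt_*` functions are hypothetical); it is a software model, not the WDCSR register layout and not taken from any specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of ONE system-level watchdog shared by all
 * harts. Field and function names are illustrative assumptions, not a
 * register definition from any RISC-V document. */
struct sys_watchdog {
    uint32_t timeout_ticks; /* reload value programmed by software */
    uint32_t remaining;     /* ticks left before the timeout fires */
    unsigned irq_hart;      /* the single hart that receives the interrupt */
    bool     expired;       /* latched once the counter reaches zero */
};

static void wdt_init(struct sys_watchdog *w, uint32_t timeout_ticks,
                     unsigned irq_hart)
{
    assert(timeout_ticks > 0); /* a zero timeout would expire immediately */
    w->timeout_ticks = timeout_ticks;
    w->remaining = timeout_ticks;
    w->irq_hart = irq_hart;
    w->expired = false;
}

/* Any hart may restart the counter; there is no per-hart state. */
static void wdt_pet(struct sys_watchdog *w)
{
    if (!w->expired)
        w->remaining = w->timeout_ticks;
}

/* Advance the counter by one tick. Returns the id of the hart to interrupt
 * on the tick where the timeout expires, or -1 otherwise. */
static int wdt_tick(struct sys_watchdog *w)
{
    if (w->expired)
        return -1;
    if (--w->remaining == 0) {
        w->expired = true;
        return (int)w->irq_hart;
    }
    return -1;
}
```

With a 3-tick timeout directed at hart 0, calling `wdt_pet()` after two ticks restarts the count, and the third consecutive tick without a pet returns 0, the designated hart. A per-hart design would amount to 16 independent instances of this, which only seems worthwhile if the harts are partitioned into separately managed domains.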

As for Greg's question about putting that in OS-A SEE or a Platform itself, I'm open to suggestions. However, my initial thinking is that it would be deferred to a Platform, since OS-A SEE is about targeting SW expectations for the kernel. Kernels are really good at runtime binding of drivers based on the presence of hardware, so I'm not overly inclined to mandate such things. That said, I'd be open to hearing other opinions.

