OS-A platform stoptime requirement


Beeman Strong
 

Hi there,

In the OS-A platform spec I see the following requirement:

• dcsr.stopcount and dcsr.stoptime must be supported and the reset value of each must be 1
◦ Rationale: The architecture has strict requirements on minstret which may be perturbed by an external debugger in a way that’s visible to software. The default should allow code that’s sensitive to these requirements to be debugged.

The rationale justifies the requirement for stopcount=1, but I don't see any rationale for stoptime=1.

The debug spec refers to stoptime=1 stopping "timers", which I interpret to mean the mtime counter.  This timer is expected to be synchronized across harts in a system ("The real-time clocks of all harts in a single user application should be synchronized to within one tick of the real-time clock.")  In a system with multiple harts, where a subset of harts may be halted at a given time, this stoptime=1 requirement risks violating this ISA requirement and confusing software by causing wall-clock time to get out of sync.

Can we remove "and dcsr.stoptime" from this platform requirement?

thanks,
beeman


andrew@...
 



On Mon, Dec 20, 2021 at 12:11 PM Beeman Strong <beeman@...> wrote:
[...]
Can we remove "and dcsr.stoptime" from this platform requirement?

FWIW, although I appreciate the motivation behind this requirement, I also support removing it.  For the case that mtime is centrally implemented, this requirement is quite onerous to implement.  For the case that mtime is decentralized, this requirement is easy to satisfy, but is differently problematic, as noted above ("risks violating this ISA requirement").  I dislike disadvantaging the centralized-mtime implementations for a feature we've already admitted is problematic at the ISA level.


Greg Favor
 

I think there's a little bit of confusion going on.  The 'stoptime' bit is defined as "Don’t increment any hart-local timers while in Debug Mode."  I take this to clearly not be referring to MTIME, but to the local time CSR.

I fully agree that expecting a debug action on a core to have to reach out to wherever in the system MTIME may be is inappropriate.  It would also affect other still-active harts, which is probably very inappropriate (i.e. debugging just one hart shouldn't inherently affect operation of all harts).

Whereas stopping the local time CSR for the duration of being in Debug mode would be easy to implement, i.e. in_debug_mode inhibits the time CSR from advancing.  Presumably, once the hart exits Debug mode, the time CSR effectively immediately catches back up with the current time value that has been broadcast to it from MTIME.
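
A minimal sketch of that in C, as if modeling the RTL (names are hypothetical; a real design would do this in hardware):

    /* Model of a hart-local time CSR gated by Debug Mode. */
    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint64_t time_csr;  /* hart-local time CSR flops */
        bool     in_debug;  /* hart is halted in Debug Mode */
        bool     stoptime;  /* dcsr.stoptime */
    } hart_t;

    /* Invoked whenever a new mtime value reaches the hart. */
    void on_mtime_broadcast(hart_t *h, uint64_t mtime_value)
    {
        if (h->in_debug && h->stoptime)
            return;                /* inhibit: time appears frozen */
        h->time_csr = mtime_value; /* normal update */
    }

    /* On DRET the inhibit goes away, and the next broadcast value
     * catches the time CSR back up with no explicit resync step. */
    void on_dret(hart_t *h) { h->in_debug = false; }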

Greg




andrew@...
 



On Mon, Dec 20, 2021 at 3:42 PM Greg Favor <gfavor@...> wrote:
I think there's a little bit of confusion going on.  The 'stoptime' bit is defined as "Don’t increment any hart-local timers while in Debug Mode."  I take this to clearly not be referring to MTIME, but to the local time CSR.

I fully agree that expecting a debug action on a core to have to reach out to wherever in the system MTIME may be is inappropriate.  It would also affect other still-active harts, which is probably very inappropriate (i.e. debugging just one hart shouldn't inherently affect operation of all harts).

Oops, it has been a while since I've read this spec.  I withdraw my comment, if it's indeed the case that shared implementations of mtime need not be affected by stoptime.




Beeman Strong
 

Thanks, I definitely misunderstood the intent.  So the expectation is that, in Debug Mode, reads of mtime will see time continue to progress, but reads of the time CSR will see a frozen value.  Reads of the time CSR by software running on other harts outside debug mode should not be impacted, and will see a value synchronized with mtime.

I suppose I can imagine usages where keeping the time CSR frozen has value to a debugger, but it does add complexity and latency in requiring a resync with mtime on debug mode exit.  Does the value really rise to the level of being a platform requirement?  Is there some important debug functionality that breaks if we keep it simple and let the time CSR keep running in debug mode?



Greg Favor
 

I'm cc'ing Paul Donahue (vice-chair of the Debug TG).  He was involved in distilling, out of the enormous amount of optionality in the Debug spec, what would be suitable to require in OS-A platforms.  So he can comment on this debug-related OS-A platform requirement, and in particular the stoptime requirement (Paul, see the thread above):
       dcsr.stopcount and dcsr.stoptime must be supported and the reset value of each must be 1

Btw, I don't see resync of time with mtime as more than a relatively trivial exercise on debug mode exit.  Outside of debug mode, mtime is broadcast to all harts and each hart's time CSR updates with the latest time value that it receives.  In debug mode, if stoptime=1, then the time flops are simply inhibited from updating with newly received mtime values.  Then when debug mode is exited and the inhibit goes away, the time flops naturally go back to being updated with the latest received mtime values.

Greg



Allen Baum
 

What you describe sounds very implementation-dependent; I had always imagined that mtime would not be broadcast, but that an mtime count-enable bit would be, to keep the local copy synched.
That has its own issues of course (synching at reset, whenever mtime is written, and whenever stoptime is released) - though they all use the same mechanism, and can reuse whatever is used for reading mtime.



Greg Favor
 

On Tue, Dec 21, 2021 at 12:22 AM Allen Baum <allen.baum@...> wrote:
What you describe sounds very implementation-dependent; I had always imagined that mtime would not be broadcast, but that an mtime count-enable bit would be, to keep the local copy synched.
That has its own issues of course (synching at reset, whenever mtime is written, and whenever stoptime is released) - though they all use the same mechanism, and can reuse whatever is used for reading mtime.

And also re-sync'ing when coming out of deeper power management sleep states.

Software reads mtime via memory-mapped register reads; hardware reads of the time CSR are satisfied either by trap-and-emulate (using those same memory-mapped reads) or by a "hardware broadcast" of mtime to time.  Obviously only the latter time CSR implementation has to deal with resync issues.  While some systems may be able to avoid any need for occasional time resync, many systems will need it for one or more reasons.
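
For the trap-and-emulate case, a rough sketch of the handler side in C (the MTIME address assumes a CLINT-style layout and the decode constant is for 'rdtime rd'; both are assumptions about a particular platform):

    /* Sketch: M-mode emulation of 'rdtime rd' using memory-mapped MTIME. */
    #include <stdint.h>

    #define MTIME_ADDR 0x0200BFF8UL   /* CLINT-style; platform-specific */

    /* Called from the illegal-instruction trap path; regs[] is the
     * saved integer register file, *mepc the trapping PC. */
    int emulate_rdtime(uint32_t insn, uint64_t regs[32], uint64_t *mepc)
    {
        /* rdtime rd == csrrs rd, time (0xC01), x0 */
        if ((insn & 0xFFFFF07FU) != 0xC0102073U)
            return -1;                 /* not rdtime: not handled here */
        uint32_t rd = (insn >> 7) & 0x1F;
        if (rd != 0)
            regs[rd] = *(volatile uint64_t *)MTIME_ADDR;
        *mepc += 4;                    /* step past the emulated insn */
        return 0;
    }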

I've seen many people (including ARM time-distribution IP) do a hybrid between just sending an "increment" pulse and broadcasting a full 64-bit value - that supports periodic full resync while using just a small number of wires to communicate the increments.  (One can potentially squeeze this down to two wires, although designs I've seen don't go that far.)
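
Roughly, the receiving end of such a hybrid scheme, sketched in C (names illustrative):

    /* Receiver side of a hybrid time-distribution scheme: count
     * increment pulses normally; a periodic full 64-bit broadcast
     * corrects any accumulated drift. */
    #include <stdint.h>

    typedef struct { uint64_t local_time; } time_rx_t;

    void on_increment_pulse(time_rx_t *rx)        /* a wire or two */
    {
        rx->local_time++;
    }

    void on_full_value_sync(time_rx_t *rx, uint64_t mtime)
    {
        rx->local_time = mtime;  /* hard resync; also covers reset and
                                    wakeup from deep sleep states */
    }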

Greg


Ved Shanbhogue
 

So there is an assumption here that somehow time is broadcast, and not the clock.  For an implementation that does clock broadcast, this requirement means having a shadow time that counts while the software-visible time is frozen.  All that complexity may be totally justified, but it is not obvious why.  Besides, if time does not stop for a shared implementation of mtime, then this does not seem like something fundamentally required for debug.

Regards
Ved




Greg Favor
 

On Tue, Dec 21, 2021 at 5:08 AM Vedvyas Shanbhogue <ved@...> wrote:
So there is an assumption here that somehow time is broadcast, and not the clock.

Architecturally there is just the memory-mapped MTIME register and the time CSRs - where time is a (delayed) copy of MTIME.  How MTIME values get into the time CSRs is implementation-specific (with many ways to do this).  To avoid ambiguity, what are you referring to by "clock"?
 


Ved Shanbhogue
 

Hi Greg,

I agree architecturally there is just memory-mapped MTIME.  We can leave it at that.  What I meant by clock was where each hart has its own memory-mapped MTIME and is thereby clocked by a reference clock that is broadcast.  So there is no MTIME bus snaking around the chip feeding the time CSR.  There are many ways to do this, so it is not material.

So there is real value to stopping time for debug, and the expectation is that there will be a "synchronization"/"catch back up" action on MRET from debug mode?

regards
ved



Greg Favor
 

On Tue, Dec 21, 2021 at 9:18 AM Vedvyas Shanbhogue <ved@...> wrote:
I agree architecturally there is just memory-mapped MTIME.  We can leave it at that.  What I meant by clock was where each hart has its own memory-mapped MTIME and is thereby clocked by a reference clock that is broadcast.

Just fwiw, most multi-hart RV implementations have a shared MTIME register.  (CLINT, for example, reflects this.)
 
So there is real value to stopping time for debug, and the expectation is that there will be a "synchronization"/"catch back up" action on MRET from debug mode?

I'll resend the original question to both chairs of the Debug TG so they can comment on why it can be desirable to stop the local time CSR while in Debug mode.

Greg


Greg Favor
 

I'm cc'ing Tim Newsome and Paul Donahue (chairs of the Debug TG).  

Tim or Paul can comment on the debug value of sometimes being able to stop the local hart's time CSR from advancing while in Debug mode (using dcsr.stoptime).

Also, Paul was involved in distilling, out of the enormous amount of optionality in the Debug spec, what would be suitable to require in OS-A platforms.  So he can comment on the following debug-related OS-A platform requirement, and in particular the stoptime requirement:
       dcsr.stopcount and dcsr.stoptime must be supported and the reset value of each must be 1

Greg



Greg Favor
 

On Tue, Dec 21, 2021 at 9:18 AM Vedvyas Shanbhogue <ved@...> wrote:
So there is real value to stopping time for debug, and the expectation is that there will be a "synchronization"/"catch back up" action on MRET from debug mode?

Btw, that would be a DRET to exit Debug mode (not an MRET).

Greg
 


Paul Donahue
 

I agree that the stoptime=1 requirement should be removed.  I can think of a case where stoptime might be useful but it would be a contrived example that's not useful in the real world.

I should have proposed removing this from the platform spec when I made this comment about discouraging stoptime=1:


Thanks,

-Paul


 


Tim Newsome
 

stoptime is nice on single-hart systems, but not really practical in multi-hart systems where one hart can be running while another is halted.  The main benefit is that it allows you to single-step through code with a timer interrupt enabled, without going into that timer interrupt every time you single-step.
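
To put illustrative numbers on that: suppose mtimecmp is programmed for a 10 ms tick on a single-hart system where stoptime freezes the timer.  A single step that takes the user a second of wall-clock time spans roughly 100 tick periods, so with stoptime=0 a timer interrupt is pending on every resume and each step lands in the handler rather than on the next line of the code being debugged.  With stoptime=1, no ticks elapse while the hart is halted, and stepping proceeds normally.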

Tim



Ved Shanbhogue
 

On Tue, Dec 28, 2021 at 10:04:41AM -0800, Tim Newsome wrote:
stoptime is nice on single-hart systems, but not really practical in multi-hart systems where one hart can be running while another is halted.  The main benefit is that it allows you to single-step through code with a timer interrupt enabled, without going into that timer interrupt every time you single-step.

Are there downsides to the debugger inhibiting the timer interrupt by setting STIE to 0?  This seems like it would provide a similar benefit even for a multi-hart system...
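
Roughly, the debugger-side logic, sketched in C (csr_read/csr_write stand in for Debug Module abstract register accesses, and single_step for the run-control primitive; all hypothetical):

    /* Sketch of masking the S-mode timer interrupt around a step.
     * STIE is bit 5 of the sie CSR (0x104). */
    #include <stdint.h>

    #define CSR_SIE  0x104
    #define SIE_STIE (1ULL << 5)

    extern uint64_t csr_read(int hart, int csr);           /* stand-in */
    extern void     csr_write(int hart, int csr, uint64_t v);
    extern void     single_step(int hart);                 /* stand-in */

    void step_with_timer_masked(int hart)
    {
        uint64_t saved = csr_read(hart, CSR_SIE);
        csr_write(hart, CSR_SIE, saved & ~SIE_STIE);  /* mask STIE */
        single_step(hart);
        csr_write(hart, CSR_SIE, saved);              /* restore */
    }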

regards
ved


Tim Newsome
 

On Tue, Dec 28, 2021 at 4:19 PM Ved Shanbhogue <ved@...> wrote:
[...]

Are there downsides to the debugger inhibiting the timer interrupt by setting STIE to 0?  This seems like it would provide a similar benefit even for a multi-hart system...

It works fine, but it's not as nice as the system itself slowing down to your debugging speed. (Although slowing the system down will generally be imperfect in any case, because systems have other peripherals that will not stop generating interrupts/counting time/whatever.)

Tim 


Beeman Strong
 

Back to the original topic, it seems there is broad agreement that stoptime=1 shouldn't be a requirement for the OS-A (or server) platform spec.  Is there a formal mechanism by which an issue should be filed to get that changed?




atishp@...
 



On Mon, Jan 3, 2022 at 9:43 AM Beeman Strong <beeman@...> wrote:
Back to the original topic, it seems there is broad agreement that stoptime=1 shouldn't be a requirement for the OS-A (or server) platform spec.  Is there a formal mechanism by which an issue should be filed to get that changed?

Send a patch :)
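
Something along these lines, presumably (the file path is a guess, and the exact wording is up to the spec editors):

    --- a/riscv-platform-spec.adoc
    +++ b/riscv-platform-spec.adoc
    @@
    -* dcsr.stopcount and dcsr.stoptime must be supported and the reset
    -  value of each must be 1
    +* dcsr.stopcount must be supported and its reset value must be 1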

