sys/ztimer doc: List prerequisites for successful use of ztimer_now#17614
sys/ztimer doc: List prerequisites for successful use of ztimer_now#17614chrysn merged 1 commit intoRIOT-OS:masterfrom
Conversation
kfessel
left a comment
There was a problem hiding this comment.
I like this write up i shows how to handle ztimer_now.
Even if another mechanism is introduced that keeps the clock running the documentation needs to clarify how to do that.
|
https://gist.github.com/kfessel/9beb45ab4efe2be5e4317757306d3308 those are two sets of functions one forever set |
|
https://gist.github.com/kfessel/9beb45ab4efe2be5e4317757306d3308
may solve many ztimer now cases
How is that different from ztimer64? (Which, like this, keeps the
underlying timer always active and thus has the same justification to
use compared ztimer_now values).
|
|
The user owns it (and thus does not force the system to have 64Bit) and the user is also able to stop it (it is the owner) e.g. an app needs to track time does not know how long (maybe the timer wraps) but when the app is stopped its ztime_forever can also be stopped and the second set simply handles that timeout case maybe aquire and release is a better solution (and maybe its portable to ztimer64 (make it releasable) |
|
Yes, when acquire/release semantics are available, ztimer64 might be adjustable to them as well, making it also check the boxes your gist does. Please hold a bit on merging this, I'd like to make amendments from having applied these rules in practice (nothing seriously changing, just additional sufficient criteria one may want to apply, and it might sense to review them as a whole too). |
| * In contrast, the clock is not guaranteed to be active if a timer is | ||
| * removed and then a second one is started (even if the thread does not | ||
| * block between these events), or when an expiring timer wakes up a thread | ||
| * that then sets the second timer. |
There was a problem hiding this comment.
Isn't this the case only when PM is used?
|
In practice, currently, yes, but unless we declare these requirements generally, it becomes very hard to reason about which change around PM might suddenly become a breaking change for someone who doesn't expect it.
ZTimer as an abstraction layer is IMO better served by not leaking implementation details through the API.
|
|
This rises the question will we be able to mark modules that should be PM safe or for which we can guarantee that they do the acquire and release correctly? |
|
This does not yet take into account clocks that might be continuously running, which for some RTC/msec might be a desirable default behavior. |
Yes, but in that case you have to ensure that you application is only executed on boards with I think it's good habit that users of the ztimer API don't have to think if the ztimer clock could stall oder not. This is the main reason why I came up with I shot myself into the foot several times with stalled |
Yes, we have to do some bookkeeping here and should implement necessary |
|
would be a good habit to setup a timer as long as you need access to now (that was part of the reason why i wrote #17229 (it is much like now but allways needs a timer)) but at the same time it is much heavier than aquire() and release(). |
Yeah, but in the end users will need a running timer to proof that they have blocked the clock from stalling, don't they? If yes: In my opinion, that feels a little bit like a workaround and not like a proper solution. Those blocking If we need to touch code that makes use of (Just to make things clear: I don't want to be harsh. I just want to find a good solution that fixes obvious problems in terms of good power management :-)) |
you would not check if your timer runs long enough but set it how long you need it:
and there is another blocker for that, you have to setup a timer before you can ask for now it is very uncommon -> much adaptation work in external packages even though: i think doing it by setting up a timer is a more proper solution (avoids timer-leak and therefor may safe power) than that user-counter workaround. I think your solution is simpler to use and more portable (especially it might also be usable for ztimer64 with all its absolute interfaces ) |
|
Thank you for your detailed reply! (And sry @chrysn for abusing your PR for bike shedding about a related problem!)
Good point - that may happen. I don't know if a higher level API (like this one - for example) could make those problems at least more obvious. But it will be very hard to avoid problems like this with C. C doesn't bring all these nice guarantees Rust is offering ... This mentioned high level API could of course make use of your approach with
In those cases, I would stick to
Time measurement is the only use-case I considered for this acquire/release pattern. That may could become a separate helper module which can handle acquire/release - or setting long timers - in the background. We shouldn't expose this implementation detail.
What bugs me at this solution: When I just want to measure a time difference between two events, the API asks me what the longest expected time period will be. That could be a tricky question. Furthermore, a
Yes right. That are the costs :/ |
If you do not know how long but you know it will be within 32 bits (~1h10min for usec) -> if not you need to mange overflows -> you need a timer anyway. |
|
Ad "This does not yet take into account clocks that might be continuously running": I don't think that should be on the list, or if only with caveats of its own. Most users of ztimer are not the top-level application and thus don't influence whether that is the case or not. And even code residing in the top-level application may be part of a structured application -- there, a part implicitly asserting that the clocks are running would depend on a completely unrelated part. It would certainly be cool if code that makes that assertion can make that in a minimal-overhead fashion (like, a Ad timers leaking: There's already nothing stopping anyone from creating a ztimer_periodic they forget about. Acquire/release semantics are IMO familiar enough to users that they'd manage (with the error likelihood inherent to these things in C). And finally, leaking an acquisition might be exactly what a module wants: If a module needs a never-stopping possibly-wrapping 32bit seconds timer, then a Ad ztimer private: I don't know whether that's a good idea, but being an embedded OS I think we can well afford some not-trivial-to-use functions, if they are well documented.
It is a tricky question, but ultimately one that users better consider rather than not. Once considered, the user will know whether overflows are relevant or not. (Also, in PM'd systems, it is good to have it answered: If we're counting when an event expected within milliseconds arrives, but it never does, for how long are we prepared to keep spending energy?). On the overall "acquire vs. set a long timer", I have no strong preferences. Acquire/release has some nice properties in terms of optimization (being O(1) and the possibility to make it a no-op on systems where the clock never sleeps), but the users that can legitimately use it without odd errors will be very few (for they need external assertions on their differences never wrapping, and if they do work with wrapping they could just as well go for ztimer64 or @kfessel's variation of it that is user-acquired). Anyhow, I doubt that the precise method of acquisition is relevant to this PR (even though it does give valuable input). On the overall status, I've just added two more points in a follow-up commit, asking for another round of review:
|
sys/include/ztimer.h
Outdated
| * | ||
| * If the clock was active, then the difference between the second value and | ||
| * the first is then the elapsed time in the clock's unit, **modulo the | ||
| * clock's wrap-around time**. |
There was a problem hiding this comment.
This is always 2**32. edit
There was a problem hiding this comment.
if you do not jump the 0 or ff..ff then it should be mod 2^32 I think
There was a problem hiding this comment.
I'm not using the precise value due to the presence of the ZTIMER_NOW64 module; if you think that can be disregarded here I'm happy to add that in.
There was a problem hiding this comment.
On second thought (and re-reading the text), I'm putting 2^32 in there with a note on ZTIMER_NOW64 (so that in particular if the latter gets deprecated / removed, the text automatically becomes better usable).
|
LGTM. I really think we should go with "ZTIMER_MSEC is always running". But I guess that's another topic.
I think this refers to the dance with checking a mutex within interrupt? I don't get that yet. |
|
BTW, the extreme cases for "bad" timer configurations are 16bit counters being extended to 32bit in microseconds, or in 250khz (on avr). ztimer is fine until it misses a whole period (one half period can be missed). That would be a maximum of ~65ms resp. ~260ms that ztimer's tick ISR can be late before ztimer loses a tick. In theory. |
comment-only changes should do this automatically. |
|
That would be a maximum of ~65ms resp. ~260ms that ztimer's tick ISR
can be late before ztimer loses a tick. In theory.
Yes. The "1 minute" mention is on the other side, it's not "what is an
upper bound for blocking that a system component introduces" (for which
65ms would already be bad practice) but "what upper bound may a generic
system component assume for the whole application blocking". (A specific
system component like the extended timers on a fast clock might easily
have tighter requirements).
|
It is purely there because unlike ztimer_is_set / ztimer_remove it is constant-time (and thus more suitable for an interrupt handler); granted, it is contrived, so I can switch that over to the more regular ztimer_remove.
(answering anyway because while it makes sense to you now the question is good) It's really doing both things at the same time: The mutex check (which is indeed equivalent to the ztimer_remove check) ensures that the timer is active. To also satisfy the "difference is guaranteed to not have wrapped" one would usually need the extra intermediate step of having interrupts on (which can't be enabled in arbitrary code just like that) -- otherwise the timer might still be in the queue, but the now-value wrapped anyway due to critical sections which ztimer generally tolerates. When the timer was much shorter than the 2**32 ticks, it is safe to assume that the timer fires early enough that even a long critical section won't cause a difference wraparound. Would you prefer if the second example were also expressed in terms of timer removal or is_set? May I squash this? |
|
Pinging @kfessel (ACK holding after the extensions) or @kaspar030 (would you ACK?) |
kfessel
left a comment
There was a problem hiding this comment.
two nitpicks but my ACK hold for the content
| * that then sets the second timer. | ||
| * | ||
| * If the clock was active, then the difference between the second value and | ||
| * the first is then the elapsed time in the clock's unit, **modulo 2³² |
There was a problem hiding this comment.
for all 2³² and 2⁶⁴ i would prefer another notation like 2^32 (2**32 may be missrendered ) ** indicating bold or italic
There was a problem hiding this comment.
We have ² and ³ around already, both in documentation and in strings, including use for exponentiation. The ** easily misrenders, and the ^ triggers LaTeX traumata, so I've left this one out. (doxygen does have math support to do this properly, but I don't think it's worth the time it'd need to set it up for the cases where exponentiation is around).
There was a problem hiding this comment.
2⁶⁴³² << why does github do this
There was a problem hiding this comment.
2**32 is used in l169 same file
i think marking the numbers as code would help against misrendering
There was a problem hiding this comment.
i am ok with any of them
(doc.riot-os.org has meta data utf-8 and superscripts of numbers are certainly not the last chars somone would implement )
There was a problem hiding this comment.
2⁶⁴³²<< why does github do this
I don't see where github mixes the 64 and 32, what do you mean?
There was a problem hiding this comment.
Funny -- I see very subtle differences as it renders here, but they might easily be explained by a font designer trying to badly do oldstyle figures.
What I see in your screenshot more looks like what'd happen when the font was originally cut for latin-1 (which had superscript for digits 1, 2, and 3) and when later extended to he superscript block of Unicode.
Do they render fine in the documentation for you? (deep link into the build artifact)
There was a problem hiding this comment.
yes doc renders fine
these are the fonts github uses for code ui-monospace,SFMono-Regular,SF Mono,Menlo,Consolas,Liberation Mono,monospace;
Liberation Mono is the choosen one on this system
ui-monospace -> sytems monospace font (working in safari) see https://caniuse.com/extended-system-fonts
SFMono -> apples default monospace
Menlo -> apples old default monospace
Consolas -> windows default since 8
There was a problem hiding this comment.
By the way, the gucharmap tool on Linux offers convenient inspection of fall-back font selection -- on my system, it shows superscript 1, 2 and 3 with Bitstream Vera Sans (on right-clicking the glyph's cell), but the others in DejaVu Sans. (Sadly, I know of no tool that allows one to input a list of font names like in CSS and then tells you where there is a Zwiebelfisch (where I contest the articles assessment that the term does not apply here)).
kaspar030
left a comment
There was a problem hiding this comment.
ACK from my side, once Karl's latest comments are addressed.
Closes: RIOT-OS#17298 Co-authored-by: Karl Fessel <[email protected]>
85c87a7 to
632ca35
Compare

Contribution description
ztimer_now() is hard to use correctly, and even though it set out to allow implementation flexibility, expectations are already creeping up based on properties ZTimer happens to have now.
This puts a damper on any expectations one might have on ztimer_now() values. It documents design goals (as I understand them), and may easily be stricter than the current implementation is -- but that is necessary to keep implementation flexibility that is already being exercised in PRs like #17607.
(It may also help motivate the creation of a high-level stopwatch mechanism as sketched in #17607, because yes that's a scary wall of text any user would have to go through).
Open questions
If a timer is set with a timeout way less than 2^32 (say, 2^31, as interrupts need to happen at least every 2^31 clock cycles anyway), it's really also OK to use now values with laxer constraints (eg. it'd be OK to compare to a now value obtained in the timer's interrupt then).
I think we can still relax all of this if it turns out that these are worth mentioning; the current text is phrased to give strictly sufficient criteria, whereas insufficient measures are only given by example.
Testing procedure
Compare built documentation to your mental model and/or the implementation of ZTimer.
Issues/PRs references
Closes: #17298