-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[perf-counters] Fix pause/resume #1643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
test/perf_counters_test.cc
Outdated
| } | ||
|
|
||
| static void CheckInstrCount(Results const& e) { | ||
| CHECK_COUNTER_VALUE(e, double, "INSTRUCTIONS", GT, kIters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with pause and resume, shouldn't this be GT 2*kIters? i don't know how the INSTRUCTIONS perf counters work so i could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the point is that the second loop - the one within the pause/resume - is not counted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah ok.. but then if it is counted > kIters is still going to be true. so i think you need a LT check too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - the current test checks against the original regression, but I see what you mean. Ideally we'd check that the value in the "with pause resume" is < 2 * "value in the without pause resume". Is there a recommended way to do that, other than capturing the value in the "without pause resume" in some global and relying on execution order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would LT, 1.5*kIters not be enough at least for an upper bound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each iteration executes a bit of instructions, and it depends on opt level how many (and architecture)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't rely on execution order. we can guarantee that both have run when RunOutputTests completes, so maybe we can store the number of instructions for both BMs and compare them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
LebedevRI
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is rather fragile, but seems fine in general.
Thank you.
Using `state.PauseTiming() / state.ResumeTiming()` was broken. Thanks [@virajbshah] for the the repro testcase.
sequences would cause a non-0 counter value
|
thanks! |
Previously, `counters` was updated in `PauseTiming()` with `counters[name] += Counter(measurement, kAvgIteration)`. The first `counters[name]` call inserts a counter with no flags. There is no `operator+=` for `Counter`, so the insertion is done by converting the `Counter` to a `double`, then constructing a `Counter` to insert from the `double`, which drops the flags. Pre-insert the `Counter` with the correct flags, then only update `Counter::value`. Introduced in 1c64a36 ([perf-counters] Fix pause/resume (google#1643)).
* State: Initialize counters with kAvgIteration in constructor Previously, `counters` was updated in `PauseTiming()` with `counters[name] += Counter(measurement, kAvgIteration)`. The first `counters[name]` call inserts a counter with no flags. There is no `operator+=` for `Counter`, so the insertion is done by converting the `Counter` to a `double`, then constructing a `Counter` to insert from the `double`, which drops the flags. Pre-insert the `Counter` with the correct flags, then only update `Counter::value`. Introduced in 1c64a36 ([perf-counters] Fix pause/resume (#1643)). * perf_counters_test.cc: Don't divide by iterations Perf counters are now divided by iterations, so dividing again in the test is wrong. * State: Fix shadowed param error * benchmark.cc: Fix clang-tidy error --------- Co-authored-by: dominic <[email protected]>
Using
state.PauseTiming() / state.ResumeTiming()was broken.Thanks @virajbshah for the the repro testcase.