Description
The problem
I've seen quite a few times recently that the benchmark measurements CodSpeed reports on PRs are erroneous.
This is a problem when doing perf work, as you can't tell if what you're doing is good or not.
e.g. PR oxc-project/oxc#4214 initially showed 0 speed-up, but then the benchmarks re-ran after the PR below it in the stack was merged, and suddenly it showed a 6% perf improvement. https://codspeed.io/oxc-project/oxc/branches/07-12-perf_semantic_reduce_lookups
That's wrong. The PR gives 0 perf improvement.
The reason was that in the last run, CodSpeed compared against the commit 2 back (3016f03), rather than 1 back. So the 6% result shown included the perf boost of oxc-project/oxc#4213, which is the commit that preceded it.
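To illustrate how the wrong baseline produces a phantom speed-up, here is a small sketch. The timings are made-up numbers chosen only to mirror the 0% vs 6% figures above; they are not real measurements from either PR, and `improvement_pct` is a hypothetical helper, not CodSpeed's formula.

```python
def improvement_pct(baseline: float, current: float) -> float:
    """Percentage reduction in time of `current` relative to `baseline`."""
    return (baseline - current) / baseline * 100.0


# Hypothetical timings (arbitrary units), for illustration only.
two_back = 100.0  # commit before the preceding PR (e.g. 3016f03)
one_back = 94.0   # after the preceding PR landed (its perf boost)
this_pr = 94.0    # this PR changes nothing measurable

# Correct comparison: against the direct parent commit.
vs_parent = improvement_pct(one_back, this_pr)

# Erroneous comparison: against the commit 2 back, which folds the
# preceding PR's gain into this PR's result.
vs_grandparent = improvement_pct(two_back, this_pr)

print(f"vs 1 back: {vs_parent:.1f}%")
print(f"vs 2 back: {vs_grandparent:.1f}%")
```

With these numbers the comparison against the parent reports no change, while the comparison against the grandparent attributes the earlier PR's whole gain to this one.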
Why?
I am not sure why this has started happening recently. Could be:
- Changes at CodSpeed's end.
- Caused by our switch to using Graphite merge queue.
Solutions
- Raise with CodSpeed.
- If they can't fix, investigate if we can handle it somehow at our end.
Because we intercept and store bench results and upload them to CodSpeed at our end, we could potentially get our GitHub action to check that benchmarks for the previous commit have completed and been uploaded to CodSpeed already, before submitting results for the current commit. If not, wait until they are.
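A rough sketch of what that wait could look like, assuming we have some way to ask whether results for a given commit have already been uploaded. That check is passed in as `is_uploaded`, which is a hypothetical callback (it could look at wherever we store the intercepted results); no CodSpeed or GitHub API is assumed here. The function name, timeout, and poll interval are also placeholders.

```python
import time
from typing import Callable


def wait_for_parent_results(
    parent_sha: str,
    is_uploaded: Callable[[str], bool],  # hypothetical check, supplied by us
    timeout_s: float = 600.0,
    poll_interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until results for `parent_sha` are uploaded, or the timeout expires.

    Returns True if the parent's results were seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_uploaded(parent_sha):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The upload step for the current commit would only run once this returns `True`. What to do on timeout (upload anyway, or fail the job) is an open question.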