-
Notifications
You must be signed in to change notification settings - Fork 38.6k
ci: Add Travis check to make sure unit test coverage reports stay deterministic #16320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: Add Travis check to make sure unit test coverage reports stay deterministic #16320
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
I think there is a value to non-deterministic tests, as long as their seed is known (e.g printed to stdout at the beginning of the test, or written to a file on failure, ...) |
|
I also think that non-deterministic tests can be valuable; as an example, a non-deterministic functional test ( So if this PR's intention is to move towards eliminating non-determinism as a tool in our testing, then I tend to concept NACK. |
|
It seems like I didn't properly communicate the goal of this PR :-) Sorry about that. The goal of this PR is to make sure line coverage-reports are deterministic. This PR addresses #14343: "coverage reports non-deterministic" (opened by you @MarcoFalke :-)) The problem is described in MarcoFalke's issue:
More specifically this PR makes sure line coverage is stable between unit test runs - from the OP of this PR:
I've now updated the PR title to better reflect the goal of this PR. @MarcoFalke @sdaftuar Just to make sure we share the same goal: we all want line coverage unit tests reports to be deterministic, right? :-) |
|
I'm not familiar with the line coverage reports at all and don't really have a view there. If we merged this PR, then in the future if someone wanted to write a unit test that was non-deterministic, what should they do if this linter fails? |
|
@sdaftuar If line coverage fluctuation in a test is intentional then one would simply add a suppression to This PR is only meant to guard against accidental introduction of new tests which turns the line coverage number into a random process. More concretely the situation I want to guard against is the following: line coverage is reported as 80,5% on first run, 80,6% on second run, 80,4% on third run, etc :-) Some context:
|
|
@practicalswift Thanks for clarifying that. I tend to think that test writers face too many hurdles and knowledge requirements about the repo (and figuring out how to deal with linters that provide unclear benefits when applied as a hard-and-fast rule, like this one would). It's demoralizing to include a test for something and then have some unrelated script break for spurious reasons that requires learning and investigating something new in order to make travis pass. So I think I'm still a Concept NACK. I think our concerns about line-coverage, which reflect a goal about the whole project's test suite, should not provide too much inconvenience for contributors working on a smaller piece. |
|
@MarcoFalke As the person who filed issue #14343 ("coverage reports non-deterministic") which this PR tries to address: can you clarify your position? Personally I kind of took it for granted that we want to know how suggested PR:s changes unit test line coverage. Assumptions are dangerous :-) If the consensus opinion is that we are okay with the unit test line coverage sometimes being a random process (80,5% on first run, 80,6% on second run, 80,4% on third run, etc.) then this PR is clearly not needed and I'll close :-) |
I'm more concerned about the potential for reduced input from high-quality, long term contributors due to the overhead of new scripts, lints, tools etc than I am about occasionally seeing a small variation in coverage output. So I'm in agreement with @sdaftuar here. |
|
I think it is nice to know whether a new test has stable line coverage, but it non-stable coverage shouldn't be blocking a merge. Especially, since it might not be trivial to make them stable (c.f. #14343 (comment)) I guess I wouldn't mind if some of the linter were put behind the |
|
@MarcoFalke Using |
Add Travis check to make sure unit test coverage reports stay deterministic.
Rationale:
A necessary condition for meaningful line coverage measuring is that the test suite
is deterministic in the sense that the set of lines executed at least once is identical between test suite runs.
This PR addresses issue #14343 (MarcoFalke): "coverage reports non-deterministic":
Example output in case of line coverage deterministic unit tests:
Example output in case of line coverage non-deterministic unit tests:
This Travis check uses
test_deterministic_coverage.shwhich was introduced in #15296.To make sure
test_deterministic_coverage.shwon't trigger any line coverage non-determinism alarms with the current test suite I've performed 8 000 test runs (against 98958c8) which all gave identical line coverage results.Note to reviewers: Which would be the most appropriate Travis job to put this on? I'm currently using
x86_64 Linux [GOAL: install] [bionic] [uses qt5 dev package instead of depends Qt to speed up build and avoid timeout], but I'm afraid the total run-time of that job is a bit on the high end with this check included. Would it be preferable to add a new job instead of adding to an existing job? Please advice :-)