-
Notifications
You must be signed in to change notification settings - Fork 38.6k
test: Run bench sanity checks in parallel with functional tests #33142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33142. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
fa1e1bd to
fa04ef4
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
fa4a481 to
fabfdba
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
fa829da to
faf7f97
Compare
|
I get an error compiling this without |
faf7f97 to
fa9207c
Compare
thx, fixed |
fa9207c to
faddf2f
Compare
fadbedc to
fa88e47
Compare
fa88e47 to
faa172d
Compare
faa172d to
fa74bc1
Compare
fa74bc1 to
fac86b6
Compare
|
@maflcko, @purpleKarrot, can you help reviewers understand the relation between this and #33483? |
It is basically the same thing, just written in different languages. I don't really care about the language it is written in. Though, I don't think anyone is going to review the other pull, which looks unmaintained? #33483 (comment) |
fac86b6 to
fafb0b4
Compare
|
Given that this pull request has been maintained for almost half a year, with the goal of fixing all the annoying QA issues listed in the pull description, my recommendation would be to go ahead and review this one, merge it, and possibly revert it in #33483, if and when that is ready. The review, testing, and revert should be not too hard. For testing notes (since someone asked privately): I think compiling once master, and once this pull request (possibly with heavy sanitizers and/or debugging enabled) should be enough. Then, check that the new approach is faster (albeit speed is just one of the QA issues to fix, and probably the least important):
|
janb84
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fafb0b4
Given that this pull request has been maintained for almost half a year, with the goal of fixing all the annoying QA issues listed in the pull description, my recommendation would be to go ahead and review this one, merge it, and possibly revert it in #33483, if and when that is ready. The review, testing, and revert should be not too hard.
I agree with this approach.
Ran test with sanitizers enabled, as per comment:
Master: 42 seconds.
This PR: 13 seconds.
Note: The direct consequence of this PR is that the functional test are a bit slower due to the addition of the bench sanity checks 280 vs 452. For CI this has no impact because both ran anyway. I do not see this as a downsite, it's just a logical consequence of this PR.
Also the output is more verbose, with this change it is visible which sub test has ran and the progress on these subtests. This is something positive imho, (This feature is also outlined in the pr description)
The sanity checks are skipped if not compiled (as per code):
1/280 - tool_bench_sanity_check.py skipped (bitcoin_bench has not been compiled)
|
Before (9s): git checkout master && \
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_BENCH=ON && \
cmake --build build -j$(nproc) && \
time ctest --test-dir build --tests-regex bench_sanity_checkresults in: Test project /Users/lorinc/IdeaProjects/bitcoin/build
Start 7: bench_sanity_check
1/1 Test #7: bench_sanity_check ............... Passed 9.11 sec
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 9.11 secAfter (3s): git checkout fafb0b495235a32b7622f2d9be07ebfbb5344a05 && \
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_BENCH=ON && \
cmake --build build -j$(nproc) && \
time ./build/test/functional/test_runner.py tool_bench_sanity_check.py -j$(nproc)results in: Concept ACK - I will review the code a bit later. |
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change makes sense, I think we should fix a few references to the benchmarking binary (and keep the alphabetical order of the benchmarks), otherwise LGTM.
diff --git a/test/functional/test_framework/test_framework.py b/test/functional/test_framework/test_framework.py
index 31660cad9c..ecc9ffa231 100755
--- a/test/functional/test_framework/test_framework.py
+++ b/test/functional/test_framework/test_framework.py
@@ -944,9 +944,9 @@ class BitcoinTestFramework(metaclass=BitcoinTestMetaClass):
raise SkipTest("bitcoin-chainstate has not been compiled")
def skip_if_no_bitcoin_bench(self):
- """Skip the running test if bitcoin_bench has not been compiled."""
+ """Skip the running test if bench_bitcoin has not been compiled."""
if not self.is_bench_compiled():
- raise SkipTest("bitcoin_bench has not been compiled")
+ raise SkipTest("bench_bitcoin has not been compiled")
def skip_if_no_cli(self):
"""Skip the running test if bitcoin-cli has not been compiled."""
@@ -982,7 +982,7 @@ class BitcoinTestFramework(metaclass=BitcoinTestMetaClass):
raise SkipTest("This test is not compatible with Valgrind.")
def is_bench_compiled(self):
- """Checks whether bitcoin_bench was compiled."""
+ """Checks whether bench_bitcoin was compiled."""
return self.config["components"].getboolean("BUILD_BENCH")
def is_cli_compiled(self):
diff --git a/test/functional/test_framework/util.py b/test/functional/test_framework/util.py
index d53c59297f..0c190ee936 100644
--- a/test/functional/test_framework/util.py
+++ b/test/functional/test_framework/util.py
@@ -262,7 +262,7 @@ class Binaries:
return self._argv("rpc", self.paths.bitcoincli) + ["-nonamed"]
def bench_argv(self):
- "Return argv array that should be used to invoke bitcoin_bench"
+ "Return argv array that should be used to invoke bench_bitcoin"
return self._argv("bench", self.paths.bitcoin_bench)
def tx_argv(self):
diff --git a/test/functional/test_runner.py b/test/functional/test_runner.py
index 28ef0d010f..a4af557846 100755
--- a/test/functional/test_runner.py
+++ b/test/functional/test_runner.py
@@ -528,10 +528,10 @@ def main():
# Remove it, and expand it for each bench in the list
test_list.remove(TOOL_BENCH_SANITY_CHECK)
bench_cmd = Binaries(get_binary_paths(config), bin_dir=None).bench_argv() + ["-list"]
- bench_list = subprocess.check_output(bench_cmd).decode("ascii").splitlines()
+ bench_list = subprocess.check_output(bench_cmd, text=True).splitlines()
bench_list = [f"{TOOL_BENCH_SANITY_CHECK} --bench={b}" for b in bench_list]
# Start with special scripts (variable, unknown runtime)
- test_list.extendleft(bench_list)
+ test_list.extendleft(reversed(bench_list))
if args.filter:
test_list = deque(filter(re.compile(args.filter).search, test_list))
diff --git a/test/functional/tool_bench_sanity_check.py b/test/functional/tool_bench_sanity_check.py
index ad70fcb9be..d29587861c 100755
--- a/test/functional/tool_bench_sanity_check.py
+++ b/test/functional/tool_bench_sanity_check.py
@@ -4,6 +4,7 @@
# file COPYING or https://opensource.org/license/mit/.
"""Special script to run each bench sanity check
"""
+import shlex
import subprocess
from test_framework.test_framework import BitcoinTestFramework
@@ -31,9 +32,9 @@ class BenchSanityCheck(BitcoinTestFramework):
f"-filter={self.options.bench}",
"-sanity-check",
]
- self.log.info(f"Starting to run: {cmd}")
+ self.log.info(f"Starting: {shlex.join(cmd)}")
subprocess.run(cmd, check=True)
- self.log.info(f"Command passed: {cmd}")
+ self.log.info("Success!")
if __name__ == "__main__":This teaches the test framework about the bench executable, which is required for the next commit.
fafb0b4 to
fa65bc0
Compare
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested ACK fa65bc0
janb84
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tACK fa65bc0
Changes since last ACK:
mainly related to using bench_bitcoin instead of bitcoin_bench
keeping the ordering the same by reversing an (unwanted) reversed list.
|
Also verified the speedup: master: src/core/bitcoin on master [$?⇕] via △ v4.1.2 via 🐍 v3.13.9 via ❄️ impure (nix-shell-env)
❯ ctest --test-dir ./build --tests-regex bench_sanity_check
Test project /home/will/src/core/bitcoin/build
Start 7: bench_sanity_check
1/1 Test #7: bench_sanity_check ............... Passed 18.20 sec
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 18.21 secPR #33142: src/core/bitcoin on pr-33142 [$?] via △ v4.1.2 via 🐍 v3.13.9 via ❄️ impure (nix-shell-env) took 25s
❯ build/test/functional/test_runner.py -j 128 tool_bench_sanity_check.py
Temporary test directory at /tmp/test_runner_₿_🏃_20260105_135859
<snip>
ALL | ✓ Passed | 272 s (accumulated)
Runtime: 8 s |
willcl-ark
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa65bc0
This test alone can take 2 minutes or longer in CI, which is reduced here to ~20 seconds when parallelised as here in this changeset. Agree that we can revert if/when we move to a more cmake-native solution in the future.
|
ACK fa65bc0 Merging this, but this is not saying that this direction is set in stone. This PR gives us some immediate wins that are useful to all developers with a fairly small amount of code changed. However, rewriting everything to utilize ctest is still on the table, but that does seem like a much larger task that will take significantly longer to implement and review. |
The ctest target
bench_sanity_checkhas many issues:75/153 Test #9: bench_sanity_check ...................***Failed 770.84 sec out of memoryBoth issues can lead to CI timeouts and leave CPU unused during testing.
Fix all issues by running it as part of the functional tests instead. This is similar to the rpcauth tests (#32881) and util tests [bitcoin-tx, and bitcoin-util] (#32697).