Add bench.sh script to automate benchmarking DataFusion against itself #6131
Merged
alamb merged 2 commits into apache:main on Apr 30, 2023
Contributor (Author)
Some interesting results already -- I ran a quick experiment to see how much 'lto' (link-time optimization) helps. The answer is "quite a bit".
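For anyone who wants to repeat a similar experiment, one option is Cargo's environment override for the release profile, which toggles LTO without editing `Cargo.toml`. The sketch below only prints the two build commands. The helper name `lto_build_cmd` is hypothetical and is not part of bench.sh, and the thread does not show which setup was actually used for the experiment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bench.sh): print the cargo command that
# builds the release profile with LTO switched on or off.
# CARGO_PROFILE_RELEASE_LTO is Cargo's environment override for the
# `lto` key of `[profile.release]`.
lto_build_cmd() {
  local lto="$1"   # expected: "true" or "false"
  case "$lto" in
    true|false) ;;
    *) echo "usage: lto_build_cmd true|false" >&2; return 1 ;;
  esac
  echo "CARGO_PROFILE_RELEASE_LTO=${lto} cargo build --release"
}

# Print both variants so the two builds can be run and timed separately.
lto_build_cmd true
lto_build_cmd false
```

Running the benchmarks once against each build gives a rough picture of what LTO contributes.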
alamb commented Apr 28, 2023
benchmarks/README.md
Outdated
## `tpch` Benchmark derived from TPC-H

These benchmarks are derived from the [TPC-H][1] benchmark. And we use this repo as the source of tpch-gen and answers:
Contributor (Author)
Next, I hope / plan to review the other benchmarks and consolidate them, along with their data generation and runner scripts, into the bench.sh framework.
yjshen approved these changes Apr 30, 2023
# Gather baseline data for tpch benchmark
./benchmarks/bench.sh run tpch

# Switch to the branch the branch name is mybranch and gather data
Member
👍 I was curious before about what's the magic for comparing branches
Contributor (Author)
Thanks for the review @yjshen -- I am trying to reduce the amount of magic involved.
I am going to merge this in and we can continue to iterate (next I would like to increase the number of different tests supported).
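The README excerpt under review describes gathering data on a baseline and then on a feature branch. A minimal sketch of that loop follows. Only `./benchmarks/bench.sh run tpch` and the branch name `mybranch` come from the excerpt. The baseline name `main`, the `DRY_RUN` switch, and the function names are assumptions made for illustration. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the "gather data on one branch, then on another" workflow.
# `./benchmarks/bench.sh run tpch` is taken from the README excerpt;
# DRY_RUN, run_step, and bench_branches are illustrative names only.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = actually execute them

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bench_branches() {
  local branch
  for branch in "$@"; do
    run_step git checkout "$branch"           # switch to the branch under test
    run_step ./benchmarks/bench.sh run tpch   # gather tpch data on that branch
  done
}

# Assumed baseline branch `main`, followed by the branch from the excerpt.
bench_branches main mybranch
```

Setting `DRY_RUN=0` would execute the commands from the repository root instead of printing them.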
Which issue does this PR close?
Closes #6127
Rationale for this change
TL;DR: make it easier to run the benchmarks included with DataFusion using a standard set of scenarios.
See #6127
What changes are included in this PR?
This script currently supports two benchmarks as shown in the usage instructions.
Are these changes tested?
I tested them manually on an x86 Mac and an x86 Linux machine.
Are there any user-facing changes?
No, these are just development scripts.