ARROW-10031: [CI][Java] Support Java benchmark in Archery#8210
ARROW-10031: [CI][Java] Support Java benchmark in Archery#8210kiszk wants to merge 14 commits intoapache:masterfrom
Conversation
d719b0d to
0e47ab6
Compare
|
@kiszk Thank you for doing this. ARROW_ENABLE_UNSAFE_MEMORY_ACCESS = true or through system properties: arrow.enable_unsafe_memory_access = true |
|
@liyafan82 Thank you for your comment. I will set these two properties as default for Java benchmarking |
|
At my end, I can generate the following JSON file by @liyafan82 any comments regarding format and parameters are appreciated. |
@kiszk Thanks for your effort. Generally, it looks great! Some minor comments:
|
|
For 1., I will rename the title for cpp and Java. For 2, you are right as sorted here. |
3ebd9f6 to
8cdd1ea
Compare
|
@liyafan82 @fsaintjacques @kszucs Would it be possible to review this? |
|
The following commands should work: Here is an example |
java/performance/pom.xml
Outdated
There was a problem hiding this comment.
@liyafan82 Here is a change from #8245 . Since I cannot find to add -rf json or not to add it in <arguments>, the current implementation always generates jmh-result.json and change a file name if it will be overridden.
I would like to update here if there is a selection method to add -rf json or not to add it.
|
The Java changes look good to me. However, I am not farmiliar with the archery code. So @fsaintjacques @kszucs could you please take a look? |
|
Thanks for working on this! At the moment we can only try this out locally since we no longer maintain the buildbot/ursabot setup. I have limited time to review this, so I'd say that if the benchmarking using archery works locally then is should be good to go. |
|
I see. Thank you for sharing the latest status. |
|
Leaving this for @bkietz , |
bkietz
left a comment
There was a problem hiding this comment.
In general, I'd recommend refactoring for less repetition of code between the classes for handling java and C++ builds
There was a problem hiding this comment.
Instead, please convert to "items_per_second" or "bytes_per_second" in JavaMicrobenchmarkHarnessObservation.unit
dev/archery/archery/benchmark/jmh.py
Outdated
There was a problem hiding this comment.
This comment doesn't apply to java benchmarks; seems copy pasted from GoogleBenchmarkCommand?
dev/archery/archery/benchmark/jmh.py
Outdated
There was a problem hiding this comment.
Please put docstrings on the first line of function definitions
There was a problem hiding this comment.
Also, please make it clear that you're using the strings "Benchmarks:" and "[INFO]" to delimit lines containing benchmark names
dev/archery/archery/benchmark/jmh.py
Outdated
There was a problem hiding this comment.
| benchmarks = False |
dev/archery/archery/benchmark/jmh.py
Outdated
There was a problem hiding this comment.
It's odd that this class inherits Maven instead of Command, especially since it has a member self.maven
| class JavaMicrobenchmarkHarnessCommand(Maven): | |
| class JavaMicrobenchmarkHarnessCommand(Command): |
dev/archery/archery/cli.py
Outdated
There was a problem hiding this comment.
This division seems unnecessary. Please just keep a single benchmark_list function (moving the branch on language into the with block)
dev/archery/archery/lang/java.py
Outdated
There was a problem hiding this comment.
I'd recommend doing the cast-to-list here; if an iterator were passed for build_extras then it'd be depleted after the first access to build_definitions:
| self.build_extras = build_extras | |
| self.benchmark_extras = benchmark_extras | |
| @property | |
| def build_definitions(self): | |
| extras = list(self.build_extras) if self.build_extras else [] | |
| return extras | |
| self.build_extras = list(build_extras) if build_extras else [] | |
| self.benchmark_extras = list(benchmark_extras) if benchmark_extras else [] | |
| @property | |
| def build_definitions(self): | |
| return self.build_extras |
I see that CppConfiguration has this same flaw; it's not necessary for you to address it there too
dev/archery/archery/cli.py
Outdated
There was a problem hiding this comment.
This division also seems unnecessary; CppBenchmarkRunner and JavaBenchmarkRunner seem similar enough to share more code here
dev/archery/archery/cli.py
Outdated
There was a problem hiding this comment.
Again, why separate the languages' benchmarks? It seems especially valuable to have a clearly unified strategy for comparison of benchmarks
There was a problem hiding this comment.
This separation comes from @click.pass_context. To use different options between two languages (e.g. java-home or cxx-flags), my understanding was that we need different methods with the @click.pass_context.
Let me try to have one method with @click.pass_context by lazily parse arguments for each language.
There was a problem hiding this comment.
| directory. If so, it creates a JavaenchmarkRunner with this existing | |
| directory. If so, it creates a JavaBenchmarkRunner with this existing |
Again, seems like this code didn't need to be repeated
|
@emkornfield @bkietz Thank you for kind ping. Yes, I need an update since I was swamped last month. I will update this PR this week. |
d4608a9 to
356c300
Compare
|
CC @dianaclarke who I think has also been working on continuous benchmarking. |
|
Here is an example. |
|
@emkornfield @bkietz @dianaclarke @liyafan82 I am very sorry for being late to address review comments. Now, I addressed all of the comments and rebased with the master. Would it be possible to review this PR again? |
|
@ursabot --help |
|
Supported benchmark command examples:
To run all benchmarks: To filter benchmarks by language: To filter Python and R benchmarks by name: To filter C++ benchmarks by archery --suite-filter and --benchmark-filter: For other |
|
@ursabot please benchmark lang=C++ |
|
Benchmark runs are scheduled for baseline = 5b08205 and contender = 5d70561. Results will be available as each benchmark for each run completes: |
|
I wouldn't be too pedantic here, we can address any issues later. Nice addition, thanks @kiszk! I submitted a conbench build to check that the archery benchmark command remained compatible. |
|
Should this be merged? |
|
Benchmark runs are scheduled for baseline = 5b08205 and contender = 5d70561. Results will be available as each benchmark for each run completes. |
|
@bkietz @dianaclarke @liyafan82 Would you have additional comments? |
I have no more comments. Hope this PR will be merged soon. |
kou
left a comment
There was a problem hiding this comment.
+1
If wee need more refactoring, we can work on it as a follow-up task.
|
Thanks very much for your comments |
This PR supports Java benchmark in Ursabot. The implementation is based on [this suggestion](https://mail-archives.apache.org/mod_mbox/arrow-dev/202008.mbox/%3cCABNn7+q35j7QWsHJBX8omdewKT+F1p_M7r1_F6szs4dqc+Luyg@mail.gmail.com%3e) Here are work items. - [x] Support `--language=[cpp|java]` option in `diff` - [x] Enable to build java binding - [x] Enable to run Java benchmarks - [x] Allows us to filter/select benchmarks - [x] Enable to collect results - [x] Apply the same changes to `run` and `list` Closes apache#8210 from kiszk/ARROW-10031 Authored-by: Kazuaki Ishizaki <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
This PR supports Java benchmark in Ursabot. The implementation is based on this suggestion
Here are work items.
--language=[cpp|java]option indiffrunandlist