Skip to content

ARROW-10031: [CI][Java] Support Java benchmark in Archery#8210

Closed
kiszk wants to merge 14 commits intoapache:masterfrom
kiszk:ARROW-10031
Closed

ARROW-10031: [CI][Java] Support Java benchmark in Archery#8210
kiszk wants to merge 14 commits intoapache:masterfrom
kiszk:ARROW-10031

Conversation

@kiszk
Copy link
Copy Markdown
Member

@kiszk kiszk commented Sep 17, 2020

This PR supports Java benchmark in Ursabot. The implementation is based on this suggestion

Here are work items.

  • Support --language=[cpp|java] option in diff
  • Enable to build java binding
  • Enable to run Java benchmarks
  • Allows us to filter/select benchmarks
  • Enable to collect results
  • Apply the same changes to run and list

@github-actions
Copy link
Copy Markdown

@kiszk kiszk force-pushed the ARROW-10031 branch 3 times, most recently from d719b0d to 0e47ab6 Compare September 22, 2020 00:36
@liyafan82
Copy link
Copy Markdown
Contributor

@kiszk Thank you for doing this.
Please note that when running the benchmarks, some flags should be configured properly.
They can be set through environmental variables:

ARROW_ENABLE_UNSAFE_MEMORY_ACCESS = true
ARROW_ENABLE_NULL_CHECK_FOR_GET = false

or through system properties:

arrow.enable_unsafe_memory_access = true
arrow.enable_null_check_for_get = false

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Sep 23, 2020

@liyafan82 Thank you for your comment. I will set these two properties as default for Java benchmarking
.

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Nov 11, 2020

At my end, I can generate the following JSON file by archery benchmark diff --language=java ...

@liyafan82 any comments regarding format and parameters are appreciated.

                                                  benchmark      baseline     contender  change %                                                                                                                                                                                                           counters
0      org.apache.arrow.vector.IntBenchmarks.setIntDirectly  22.500 us/op  11.209 us/op   -50.182  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 4, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}
2  org.apache.arrow.vector.IntBenchmarks.setWithValueHolder  19.031 us/op   6.627 us/op   -65.179  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 4, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}
1       org.apache.arrow.vector.IntBenchmarks.setWithWriter  32.626 us/op  10.246 us/op   -68.594  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 4, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}

@liyafan82
Copy link
Copy Markdown
Contributor

At my end, I can generate the following JSON file by archery benchmark diff --language=java ...

@liyafan82 any comments regarding format and parameters are appreciated.

                                                  benchmark      baseline     contender  change %                                                                                                                                                                                                           counters
0      org.apache.arrow.vector.IntBenchmarks.setIntDirectly  22.500 us/op  11.209 us/op   -50.182  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 4, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}
2  org.apache.arrow.vector.IntBenchmarks.setWithValueHolder  19.031 us/op   6.627 us/op   -65.179  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 4, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}
1       org.apache.arrow.vector.IntBenchmarks.setWithWriter  32.626 us/op  10.246 us/op   -68.594  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 4, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}

@kiszk Thanks for your effort. Generally, it looks great! Some minor comments:

  1. It is clearer to rename title 'counters' to 'configuration'?
  2. I am curious how are the benchmarks sorted, by 'change %'?

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Nov 11, 2020

For 1., I will rename the title for cpp and Java.

For 2, you are right as sorted here.

@kiszk kiszk force-pushed the ARROW-10031 branch 3 times, most recently from 3ebd9f6 to 8cdd1ea Compare November 13, 2020 01:32
@kiszk kiszk marked this pull request as ready for review November 13, 2020 04:59
@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Nov 13, 2020

@liyafan82 @fsaintjacques @kszucs Would it be possible to review this?

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Nov 13, 2020

The following commands should work:

archery benchmark list --langauge=java
archery benchmark run --langauge=java
archery benchmark diff --langauge=java

Here is an example

archery benchmark run  ARROW-10031  --output=out-java.json --language=java  --benchmark-filter=IntBench --java-options=-mx1g --build-extras=-Dfoo.bar=1 --benchmark-extras=-Dbar.foo=2

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liyafan82 Here is a change from #8245 . Since I cannot find to add -rf json or not to add it in <arguments>, the current implementation always generates jmh-result.json and change a file name if it will be overridden.

I would like to update here if there is a selection method to add -rf json or not to add it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good. Thanks.

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Dec 20, 2020

ping @liyafan82 @fsaintjacques @kszucs

@liyafan82
Copy link
Copy Markdown
Contributor

The Java changes look good to me. However, I am not farmiliar with the archery code. So @fsaintjacques @kszucs could you please take a look?

@kszucs kszucs changed the title ARROW-10031: [CI][Java] Support Java benchmark in Ursabot ARROW-10031: [CI][Java] Support Java benchmark in Archery Dec 21, 2020
@kszucs
Copy link
Copy Markdown
Member

kszucs commented Dec 21, 2020

Thanks for working on this! At the moment we can only try this out locally since we no longer maintain the buildbot/ursabot setup. I have limited time to review this, so I'd say that if the benchmarking using archery works locally then is should be good to go.

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Jan 11, 2021

I see. Thank you for sharing the latest status.

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Jan 11, 2021

@bkietz @xhochy May I ask to review this if possible?

@xhochy
Copy link
Copy Markdown
Member

xhochy commented Jan 11, 2021

Leaving this for @bkietz , archery is not my area of expertise.

@bkietz bkietz self-requested a review January 11, 2021 15:58
Copy link
Copy Markdown
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I'd recommend refactoring for less repetition of code between the classes for handling java and C++ builds

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, please convert to "items_per_second" or "bytes_per_second" in JavaMicrobenchmarkHarnessObservation.unit

Comment on lines 36 to 37
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't apply to java benchmarks; seems copy pasted from GoogleBenchmarkCommand?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put docstrings on the first line of function definitions

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please make it clear that you're using the strings "Benchmarks:" and "[INFO]" to delimit lines containing benchmark names

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
benchmarks = False

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's odd that this class inherits Maven instead of Command, especially since it has a member self.maven

Suggested change
class JavaMicrobenchmarkHarnessCommand(Maven):
class JavaMicrobenchmarkHarnessCommand(Command):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This division seems unnecessary. Please just keep a single benchmark_list function (moving the branch on language into the with block)

Comment on lines 46 to 52
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend doing the cast-to-list here; if an iterator were passed for build_extras then it'd be depleted after the first access to build_definitions:

Suggested change
self.build_extras = build_extras
self.benchmark_extras = benchmark_extras
@property
def build_definitions(self):
extras = list(self.build_extras) if self.build_extras else []
return extras
self.build_extras = list(build_extras) if build_extras else []
self.benchmark_extras = list(benchmark_extras) if benchmark_extras else []
@property
def build_definitions(self):
return self.build_extras

I see that CppConfiguration has this same flaw; it's not necessary for you to address it there too

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This division also seems unnecessary; CppBenchmarkRunner and JavaBenchmarkRunner seem similar enough to share more code here

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, why separate the languages' benchmarks? It seems especially valuable to have a clearly unified strategy for comparison of benchmarks

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This separation comes from @click.pass_context. To use different options between two languages (e.g. java-home or cxx-flags), my understanding was that we need different methods with the @click.pass_context.

Let me try to have one method with @click.pass_context by lazily parse arguments for each language.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
directory. If so, it creates a JavaenchmarkRunner with this existing
directory. If so, it creates a JavaBenchmarkRunner with this existing

Again, seems like this code didn't need to be repeated

@emkornfield
Copy link
Copy Markdown
Contributor

@kiszk @bkietz are there more revisions needed here? (it looks like this at least needs a rebase?)

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Feb 1, 2021

@emkornfield @bkietz Thank you for kind ping. Yes, I need an update since I was swamped last month. I will update this PR this week.

@emkornfield
Copy link
Copy Markdown
Contributor

CC @dianaclarke who I think has also been working on continuous benchmarking.

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Apr 12, 2021

Here is an example.

% archery benchmark diff --language=java --benchmark-filter="setWith"  HEAD HEAD~1
...
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (2)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                benchmark            baseline           contender  change %                                                                                                                                                                                                     configurations
 org.apache.arrow.vector.IntBenchmarks.setWithValueHolder  165.925K items/sec  165.898K items/sec    -0.016  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 1, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}
      org.apache.arrow.vector.IntBenchmarks.setWithWriter  222.780K items/sec  221.768K items/sec    -0.454  {'mode': 'avgt', 'threads': 1, 'warmups': 5, 'warmupTime': '10 s', 'measurements': 1, 'measurementTime': '10 s', 'jvmArgs': ['-Darrow.enable_null_check_for_get=false -Darrow.enable_unsafe_memory_access=true']}

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Apr 12, 2021

@emkornfield @bkietz @dianaclarke @liyafan82 I am very sorry for being late to address review comments.

Now, I addressed all of the comments and rebased with the master. Would it be possible to review this PR again?

@kszucs
Copy link
Copy Markdown
Member

kszucs commented Apr 14, 2021

@ursabot --help

@ursabot
Copy link
Copy Markdown

ursabot commented Apr 14, 2021

Supported benchmark command examples:

@ursabot benchmark help

To run all benchmarks:
@ursabot please benchmark

To filter benchmarks by language:
@ursabot please benchmark lang=Python
@ursabot please benchmark lang=C++
@ursabot please benchmark lang=R

To filter Python and R benchmarks by name:
@ursabot please benchmark name=file-write
@ursabot please benchmark name=file-write lang=Python
@ursabot please benchmark name=file-.*

To filter C++ benchmarks by archery --suite-filter and --benchmark-filter:
@ursabot please benchmark command=cpp-micro --suite-filter=arrow-compute-vector-selection-benchmark --benchmark-filter=TakeStringRandomIndicesWithNulls/262144/2 --iterations=3

For other command=cpp-micro options, please see https://github.com/ursacomputing/benchmarks/blob/main/benchmarks/cpp_micro_benchmarks.py

@kszucs
Copy link
Copy Markdown
Member

kszucs commented Apr 14, 2021

@ursabot please benchmark lang=C++

@kszucs
Copy link
Copy Markdown
Member

kszucs commented Apr 14, 2021

I wouldn't be too pedantic here, we can address any issues later. Nice addition, thanks @kiszk!

I submitted a conbench build to check that the archery benchmark command remained compatible.

@emkornfield
Copy link
Copy Markdown
Contributor

emkornfield commented Apr 25, 2021

Should this be merged?

@ursabot
Copy link
Copy Markdown

ursabot commented Apr 25, 2021

Benchmark runs are scheduled for baseline = 5b08205 and contender = 5d70561. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-large-us-east-2
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️2.08% ⬆️2.14%] ursa-thinkcentre-m75q

@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented Apr 29, 2021

@bkietz @dianaclarke @liyafan82 Would you have additional comments?

@liyafan82
Copy link
Copy Markdown
Contributor

@bkietz @dianaclarke @liyafan82 Would you have additional comments?

I have no more comments. Hope this PR will be merged soon.

Copy link
Copy Markdown
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

If wee need more refactoring, we can work on it as a follow-up task.

@kou kou closed this in 2ece340 Apr 30, 2021
@kiszk
Copy link
Copy Markdown
Member Author

kiszk commented May 1, 2021

Thanks very much for your comments

pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
This PR supports Java benchmark in Ursabot. The implementation is based on [this suggestion](https://mail-archives.apache.org/mod_mbox/arrow-dev/202008.mbox/%3cCABNn7+q35j7QWsHJBX8omdewKT+F1p_M7r1_F6szs4dqc+Luyg@mail.gmail.com%3e)

Here are work items.
- [x] Support `--language=[cpp|java]` option in `diff`
- [x] Enable to build java binding
- [x] Enable to run Java benchmarks
- [x] Allows us to filter/select benchmarks
- [x] Enable to collect results
- [x] Apply the same changes to `run` and `list`

Closes apache#8210 from kiszk/ARROW-10031

Authored-by: Kazuaki Ishizaki <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants