ARROW-14374: [Java] Integration tests for the C data Interface implementation for Java#11543
ARROW-14374: [Java] Integration tests for the C data Interface implementation for Java#11543tomersolomon1 wants to merge 4 commits intoapache:masterfrom
Conversation
…entation for Java Lead-authored-by: tomersolomon1 <[email protected]> Co-authored-by: tomersolomon1 <[email protected]> Co-authored-by: roee88 <[email protected]> Signed-off-by: tomersolomon1 <[email protected]>
|
|
| - ${DOCKER_VOLUME_PREFIX}debian-ccache:/ccache:delegated | ||
| command: | ||
| [ "/arrow/ci/scripts/cpp_build.sh /arrow /build && | ||
| /arrow/ci/scripts/python_build.sh /arrow /build && |
There was a problem hiding this comment.
I don't think you need to build Arrow C++ and PyArrow from scratch. You should be able to just pip install a recent binary wheel. Did you try to do that?
There was a problem hiding this comment.
Incidentally, this is what Go does for a similar purpose: https://github.com/apache/arrow/blob/master/ci/docker/debian-10-go-cgo-python.dockerfile#L23-L36
There was a problem hiding this comment.
@pitrou the reason was to do the integration test against the code in the main branch for cases where related fixes are made to the python/C++ implementation. It seems like this is what conda-python-jpype does. If undesired then we can switch to just pip install (which was @tomersolomon1's original intent, so my bad)
There was a problem hiding this comment.
As you prefer, though this certainly makes the CI and docker run slower than installing a binary.
There was a problem hiding this comment.
I think I prefer to keep it this way b/c of the reasons @roee88 mentioned, and also this lowers the risk of introducing new inconsistencies between the implementations.
|
@amol- This should interest you, especially the bridge here: https://github.com/apache/arrow/pull/11543/files#diff-386b108c1492f76426f3ea17399c681593a8cc86b34daeb1b451d4da600863e8R56 |
Yes, I was reading it, there are some useful helpers. Some of those should probably live in |
| def __init__(self): | ||
| self.allocator = jpype.JPackage( | ||
| "org").apache.arrow.memory.RootAllocator(sys.maxsize) | ||
| self.jc = jpype.JPackage("org").apache.arrow.c |
There was a problem hiding this comment.
Minor suggestion, I feel it might help us when we will come back to this codebase in 6 months to be a bit more explicit about when we are invoking Python things and when we are invoking Java things. Given that both of them appear as Python codebase as jpype is proxying Java calls too it's not as explicit as it could be.
Probably naming self.java_allocator and self.java_c would make a bit more obvious that we are dealing with Java methods there and not with Python things.
Signed-off-by: tomersolomon1 <[email protected]>
roee88
left a comment
There was a problem hiding this comment.
LGTM
Eventually this could reuse code from ARROW-14319 but there is no agreement on how a new jvm module should look like.
ci/scripts/java_cdata_integration.sh
Outdated
|
|
||
| python integration_tests.py | ||
|
|
||
| popd No newline at end of file |
There was a problem hiding this comment.
Formatting (missing new line at end of file)
|
|
||
| if __name__ == '__main__': | ||
| setup_jvm() | ||
| unittest.main(verbosity=2) No newline at end of file |
Signed-off-by: tomersolomon1 <[email protected]>
|
Benchmark runs are scheduled for baseline = 3c1f702 and contender = bb776d8. bb776d8 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
…entation for Java Integration tests for the C data Interface implementation for Java. Closes apache#11543 from tomersolomon1/java-c-data-interface-integration Lead-authored-by: tomersolomon1 <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
Integration tests for the C data Interface implementation for Java.