GH-43680: [Integration] Unskip nanoarrow in IPC integration tests#43715
GH-43680: [Integration] Unskip nanoarrow in IPC integration tests#43715bkietz merged 6 commits intoapache:mainfrom
Conversation
pitrou
left a comment
There was a problem hiding this comment.
Looks like there are a lot of nanoarrow-related failures in the Integration build
8cce89d to
94dd868
Compare
|
Just a few notes from some investigating:
--- a/dev/archery/archery/integration/tester_nanoarrow.py
+++ b/dev/archery/archery/integration/tester_nanoarrow.py
@@ -56,8 +56,8 @@ class NanoarrowTester(Tester):
'JSON_PATH': json_path,
'COMMAND': command,
**{
- f'QUIRK_{q}': 1
- for q in quirks or []
+ f'QUIRK_{q}': "1"
+ for q in quirks or ""
},
|
|
de12a7a to
b855bb3
Compare
|
After apache/arrow-nanoarrow#627 and apache/arrow-nanoarrow#626, I believe all the integration tests targeting the monorepo implementations will pass. There are still failures for nanoarrow producing/nanoarrow consuming, but I think those are better solved there once this PR is merged since we have the CI job set up to pull from arrow main and build a fresh nanoarrow for each commit on a PR. # From a checkout of this PR
export gold_dir=$(pwd)/../arrow-testing/data/arrow-ipc-stream/integration
export ARROW_NANOARROW_PATH=$(pwd)/../arrow-nanoarrow/build
archery integration --with-nanoarrow 1 --run-ipc \
--gold-dirs=$gold_dir/0.14.1 \
--gold-dirs=$gold_dir/0.17.1 \
--gold-dirs=$gold_dir/1.0.0-bigendian \
--gold-dirs=$gold_dir/1.0.0-littleendian \
--gold-dirs=$gold_dir/2.0.0-compression \
--gold-dirs=$gold_dir/4.0.0-shareddict Details |
|
...and there are still Rust failures due to google/flatbuffers#8150 (originally reported as apache/arrow-rs#5052 ). |
|
This failure is due to nanoarrow's reliance on post-0.15 stream continuations and this one is because pre 1.0 unions had null bufffers, which adds one more buffer per union field than we expect |
Similar to #626 but for union type_id arrays. This should fix the two remaining failures in the integration tests between the IPC implementations in the Arrow repository and the nanoarrow IPC implementation apache/arrow#43715.
paleolimbot
left a comment
There was a problem hiding this comment.
Thank you!
There are still some remaining issues in nanoarrow--nanoarrow and nanoarrow--rust to sort out; however, I think it will be easier to sort those out with this merged (at least from nanoarrow's end).
| 'COMMAND': command, | ||
| **{ | ||
| f'QUIRK_{q}': "1" | ||
| for q in quirks or "" |
There was a problem hiding this comment.
Nit (Python strings are sequences, but still)
| for q in quirks or "" | |
| for q in quirks or () |
There was a problem hiding this comment.
👍 will merge after CI is green again
Co-authored-by: Antoine Pitrou <[email protected]>
|
CI failure unrelated. Merging |
|
nanoarrow / rust integration tests are failing now it seems: apache/arrow-rs#6448 Do we know if the underlying issue is tracked anywhere? |
|
I believe it's apache/arrow-rs#5052 ! |
Rationale for this change
Nanoarrow can now read and write IPC files as of apache/arrow-nanoarrow#585 so it should no longer be skipped as a producer/consumer
What changes are included in this PR?
Nanoarrow's tester is updated to point to the new integration executable and to report nanoarrow as a consumer/producer of IPC files.
Notably the
null_trivialcase is skipped even though nanoarrow nominally supports it since it represents a corner case in which nanoarrow's flatbuffers library will not accept some vectors produced by other flatbuffers libraries dvidelabs/flatcc#287Are these changes tested?
Yes
Are there any user-facing changes?
No