ARROW-9848: [Rust] Implement 0.15 IPC alignment#8174
ARROW-9848: [Rust] Implement 0.15 IPC alignment#8174nevi-me wants to merge 2 commits intoapache:masterfrom
Conversation
|
@maxburke this will likely affect you, please have a look at it before the next release if you can. The Rust writer wasn't writing continuation markers consistently, and this is fixed. Readers should be able to continue reading, but it's worth testing out with your workloads. |
ed4bb31 to
7eee08a
Compare
| @@ -36,3 +36,4 @@ pub use self::gen::SparseTensor::*; | |||
| pub use self::gen::Tensor::*; | |||
|
|
|||
| static ARROW_MAGIC: [u8; 6] = [b'A', b'R', b'R', b'O', b'W', b'1']; | |||
There was a problem hiding this comment.
is const better than static here? IIRC I used static for ARROW_MAGIC when const either wasn't a thing yet, or I was unfamiliar with it.
There was a problem hiding this comment.
Yea, I think this can be const.
|
I'll look into the integration failures during the week, but it's C++ null arrays. I noticed similar failures when I was trying to read a null array from pyarrow. Seems I'm likely not understanding the spec correctly (odd that C++ is the only failure when Java and others pass). |
|
In C++ we are more hardened against metadata integrity issues from all the IPC fuzzing, e.g. According to the Flatbuffers spec, this field should be length-0, not null. |
|
Thanks Wes, from looking at the cpp source where the error occurs, the 0-length case makes me wonder if this might be a fbs issue in Rust. I'll have a look at whether the Rust impl distinguishes between a null and empty array value. |
Children should not be None/null, but should be an empty list
|
@wesm I found the problem, I was setting children as null instead of an empty list, so not a library issue. Thanks again |
|
@andygrove @paddyhoran @sunchao the changes here still work with the 0.14.x golden files, and 0.15+ as integration tests pass. Would you like to perform an in-depth review, or can I merge the changes? I have a follow-up that enables more tests, which I'd like to try wrap up over/before the weekend |
paddyhoran
left a comment
There was a problem hiding this comment.
I didn't do an in-depth review but from a high level this looks good. Let's merge this and keep the momentum going. Great job @nevi-me
| @@ -36,3 +36,4 @@ pub use self::gen::SparseTensor::*; | |||
| pub use self::gen::Tensor::*; | |||
|
|
|||
| static ARROW_MAGIC: [u8; 6] = [b'A', b'R', b'R', b'O', b'W', b'1']; | |||
There was a problem hiding this comment.
Yea, I think this can be const.
Thanks for the heads-up, @nevi-me, we'll give it a solid runthrough the next time we pull. |
Changes introduced in 0.15.0 changed the buffer alignment by adding a continuation marker to messages.
The previous behaviour was then marked as legacy, while both worked under V4 of the IPC metadata version.
This change catches the Rust implementation up to other languages, and has the consequence that more integration tests now pass.
The change is applied on top of clippy changes, and can be reviewed/merged after the relevant PRs.