support compression for IPC with revamped feature flags#2369
support compression for IPC with revamped feature flags#2369tustvold merged 38 commits intoapache:masterfrom
Conversation
liukun4515
left a comment
There was a problem hiding this comment.
Thanks for your effort and push this feature forward, and help me to refactor this model.
LGTM
It was my pleasure -- thank you for figuring out the tests and the format that was required |
Also we need to enable the IT test in the arrow. I have not thought about it carefully, but how to enable the |
|
This issue #2342 also can be closed. |
tustvold
left a comment
There was a problem hiding this comment.
Mostly just some minor nits
| } else if decompressed_length == LENGTH_NO_COMPRESSED_DATA { | ||
| // no compression | ||
| let data = &input[(LENGTH_OF_PREFIX_DATA as usize)..]; | ||
| Buffer::from(data) |
There was a problem hiding this comment.
Just an observation, but this copy is kind of unfortunate (although existed before)
There was a problem hiding this comment.
Are you thinking the alternate is to create a Buffer initially and write this code in terms of Buffer rather than &[u8]?
There was a problem hiding this comment.
Yeah... Not important for this PR, but it seems unfortunate the amount of memory copying we are doing, especially when the major design goal of the IPC spec is to avoid this 😅
| } | ||
|
|
||
| impl CompressionCodec { | ||
| #[allow(clippy::ptr_arg)] |
There was a problem hiding this comment.
I might be misunderstanding this lint, but I don't think it should be firing for this method
There was a problem hiding this comment.
When I remove it and run clippy like
cargo clippy -p arrow --features=prettyprint,csv,ipc,test_utils --all-targets -- -D warningsClippy tells me:
error: writing `&mut Vec` instead of `&mut [_]` involves a new object where a slice will do
--> arrow/src/ipc/compression/stub.rs:50:18
|
50 | _output: &mut Vec<u8>,
| ^^^^^^^^^^^^ help: change this to: `&mut [u8]`
|
= note: `-D clippy::ptr-arg` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg
Which while true in this case, is not true for the actual codec implementation, and they both need to have the same signature.
There was a problem hiding this comment.
That's a very daft lint, &mut Vec<u8> is not the same as &mut [u8], in particular the latter must initialize memory... I can understand &[u8] vs &Vec<u8> but the mutability makes a difference... I somewhat assumed clippy wasn't being silly, guess I was wrong 😅
| ); | ||
| )?; | ||
| } | ||
| // pad the tail of body data |
There was a problem hiding this comment.
I'm guessing this previously wasn't needed because the buffers were already 8 byte aligned
There was a problem hiding this comment.
Probably -- I am not sure to be honest. This is code that was in @liukun4515 's original PR but it seemed reasonable to me
arrow/src/ipc/writer.rs
Outdated
| let data_gen = IpcDataGenerator::default(); | ||
| let mut writer = BufWriter::new(writer); | ||
| // write magic to header | ||
| let mut header_size: usize = 0; |
There was a problem hiding this comment.
| let mut header_size: usize = 0; | |
| let header_size: usize = ARROW_MAGIC.len() + 2; |
?
There was a problem hiding this comment.
Good call -- I have updated in 2ed7ce3 (and added an assert to verify the currently implicit assumption that ARROW_MAGIC is 6 bytes long.
arrow/src/ipc/compression/stub.rs
Outdated
| // under the License. | ||
|
|
||
| //! Stubs that implement the same interface as ipc_compression | ||
| //! but always error. |
Co-authored-by: Kun Liu <[email protected]> Co-authored-by: Liang-Chi Hsieh <[email protected]> Co-authored-by: Raphael Taylor-Davies <[email protected]>
…into alamb/help_feature_flags
|
Benchmark runs are scheduled for baseline = 1c879ae and contender = 5e27d93. 5e27d93 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #1855
Closes liukun4515#1
Closes #1709
Closes #70
Closes #2342
Rationale for this change
Not this is a new PR is based on @liukun4515 's work in #1855. Github is showing a massive diff in liukun4515#1 so I wanted to get this PR up so we could see what the actual change is proposed
The logic is unchanged, but I changed how the feature flagging works so that the code differences are isolated into the
ipc_compressionmoduleWhat changes are included in this PR?
ipc_compressionfeature flagAre there any user-facing changes?
New feature flag, and support for compression