-
Notifications
You must be signed in to change notification settings - Fork 2k
Description
Is your feature request related to a problem or challenge?
Similarly to #13759, we found another potential code vulnerability from a security audit performed by InfluxData
The public method
with_capacityofStringArrayBuilderaccepts a parameteritem_capacity: usize. It is used to create aMutableBufferwith a capacity calculated as(item_capacity + 1)*size_of::<i32>(), which can overflow undetected, leaving the buffer
too small. The subsequent unsafe call to push a value into the buffer can lead to an out-of-bounds memory access.In datafusion/functions/src/strings.rs, lines 125ff:
impl StringArrayBuilder {
pub fn with_capacity(item_capacity: usize, data_capacity: usize) -> Self {
let mut offsets_buffer = MutableBuffer::with_capacity((item_capacity + 1) * size_of::<i32>());
// SAFETY: the first offset value is definitely not going to exceed the bounds.
unsafe { offsets_buffer.push_unchecked(0_i32) };
Self {
offsets_buffer,
value_buffer:
MutableBuffer::with_capacity(data_capacity),
}
}I analyzed the potential risk, and I agree there is a risk of memory unsafety but I do not think it is exploitable via DataFusion APIs. Specifically, the only callsites are:
datafusion/datafusion/functions/src/string/concat_ws.rs
Lines 228 to 229 in 3ee9b3d
| let mut builder = StringArrayBuilder::with_capacity(len, data_size); |
The argument is taken from
datafusion/datafusion/functions/src/string/concat_ws.rs
Lines 149 to 150 in 3ee9b3d
| // Array | |
| let len = array_len.unwrap(); |
So to trigger this code you would have to provide an input record batch with more thanu32::MAX rows
Reproducer showing segfault:
Here is a test case I wrote:
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ git diff
diff --git a/datafusion/functions/src/strings.rs b/datafusion/functions/src/strings.rs
index d2fb5d585..8e7c05fa1 100644
--- a/datafusion/functions/src/strings.rs
+++ b/datafusion/functions/src/strings.rs
@@ -422,3 +422,13 @@ impl ColumnarValueRef<'_> {
}
}
}
+
+
+#[cfg(test)]
+mod test {
+ use super::*;
+ #[test]
+ fn test_overflow() {
+ let builder = StringArrayBuilder::with_capacity(usize::MAX, usize::MAX);
+ }
+}It must be run in release mode
cargo test --release -p datafusion-functions --lib -- overflowerror: test failed, to rerun pass `-p datafusion-functions --lib`
Caused by:
process didn't exit successfully: `/Users/andrewlamb/Software/datafusion/target/release/deps/datafusion_functions-e3de98ad3d4dc27c overflow` (signal: 11, SIGSEGV: invalid memory reference)
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$
Describe the solution you'd like
Recommendations: Check for integer overflow when calculating the capacity.
Describe alternatives you've considered
No response
Additional context
No response