Open
Labels
question (Further information is requested)
Description
Which part is this question about
Related to the continuation of StringView/BinaryView support in #6163
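In arrow_data, the ByteView type is used to encapsulate the 16-byte view structure from the spec: a length, followed by either the inlined bytes of a short string or, for longer strings, a 4-byte prefix, a buffer index, and an offset.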
Notably, the spec dictates that all of these values must be signed integers. However, we're using u32.
The arrow-rs builder for GenericByteViewArray doesn't seem to have any range checks on the block, offset and len values for the view structure. That means, I think, you can happily construct a StringView array with arrow-rs whose view fields exceed i32::MAX, then attempt to pass it to PyArrow or Java over IPC and have it fail at runtime.
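To make the concern concrete, here is a small standalone sketch. It does not use arrow-rs at all; `as_signed` is a hypothetical helper that mimics a consumer reading the same 4 bytes as a signed 32-bit integer, which is my assumption about how a spec-conforming reader would treat the field.

```rust
/// Reinterpret the bytes of a u32 field as a signed 32-bit integer,
/// the way a reader that follows the spec's signed types might.
/// Hypothetical helper for illustration; not part of arrow-rs.
fn as_signed(v: u32) -> i32 {
    i32::from_le_bytes(v.to_le_bytes())
}

fn main() {
    // An offset that is valid as u32 but larger than i32::MAX.
    let offset: u32 = 3_000_000_000;
    println!("offset as u32: {}", offset);
    // The same bytes read as i32 come out negative.
    println!("offset as i32: {}", as_signed(offset));
}
```

Any value up to i32::MAX round-trips unchanged, so only views that point past 2 GiB (or that use a very large length or buffer index) would be affected.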
Describe your question
Should we either
- use i32 instead of u32 internally, or
- add constraints on the builder methods to ensure that we don't allow adding strings > 2GB?

Or has someone noticed this before and addressed it, so that it's not actually a problem?
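For the second option, a rough sketch of what a range check could look like is below. This is not existing arrow-rs code: `ViewFieldOverflow`, `checked_view_field` and `validate_view` are hypothetical names, and where such a check would live in the builder is left open.

```rust
/// Error for a view field that would not fit in a signed 32-bit integer.
/// Hypothetical type for illustration; not part of arrow-rs.
#[derive(Debug, PartialEq)]
struct ViewFieldOverflow {
    field: &'static str,
    value: u64,
}

/// Accept `value` only if it fits in the non-negative range of i32,
/// then return it as the u32 that is stored internally.
fn checked_view_field(field: &'static str, value: usize) -> Result<u32, ViewFieldOverflow> {
    match i32::try_from(value) {
        Ok(v) => Ok(v as u32),
        Err(_) => Err(ViewFieldOverflow { field, value: value as u64 }),
    }
}

/// Validate the three fields a builder would write into a long view.
fn validate_view(
    buffer_index: usize,
    offset: usize,
    len: usize,
) -> Result<(u32, u32, u32), ViewFieldOverflow> {
    Ok((
        checked_view_field("buffer_index", buffer_index)?,
        checked_view_field("offset", offset)?,
        checked_view_field("length", len)?,
    ))
}

fn main() {
    // In range: all three fields are accepted.
    assert!(validate_view(0, 1024, 64).is_ok());

    // Offset one past i32::MAX: rejected before the view is written.
    let err = validate_view(0, i32::MAX as usize + 1, 64).unwrap_err();
    println!("rejected: {:?}", err);
}
```

With a check like this the internal representation could stay u32, since every stored value would also be a valid non-negative i32.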
