Skip to content

Comments

Mark BufferBuilder::new_from_buffer as unsafe#9292

Merged
alamb merged 1 commit intoapache:mainfrom
Jefffrey:make-new-from-buffer-unsafe
Jan 29, 2026
Merged

Mark BufferBuilder::new_from_buffer as unsafe#9292
alamb merged 1 commit intoapache:mainfrom
Jefffrey:make-new-from-buffer-unsafe

Conversation

@Jefffrey
Copy link
Contributor

@Jefffrey Jefffrey commented Jan 28, 2026

Which issue does this PR close?

Rationale for this change

As identified by the issue, it is possible to use safe APIs to produce an unaligned buffer in BufferBuilder.

What changes are included in this PR?

Mark BufferBuilder::new_from_buffer as unsafe since there are alignment invariants that must be upheld to use it correctly.

Also fix some docstrings.

Are these changes tested?

Marking API unsafe only.

Are there any user-facing changes?

Yes.

@Jefffrey Jefffrey added the api-change Changes to the arrow API label Jan 28, 2026
@github-actions github-actions bot added arrow Changes to the arrow crate and removed api-change Changes to the arrow API labels Jan 28, 2026
@Jefffrey Jefffrey marked this pull request as ready for review January 28, 2026 11:13
Comment on lines +98 to +101
/// # Safety
///
/// - `buffer` bytes must be aligned to type `T`
pub unsafe fn new_from_buffer(buffer: MutableBuffer) -> Self {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main change here

Comment on lines -234 to +236
///
/// let mut builder = BufferBuilder::<u32>::new(10);
/// builder.append_n_zeroed(3);
///
/// assert_eq!(builder.len(), 3);
/// assert_eq!(builder.as_slice(), &[0, 0, 0])
/// ```
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive by cleanup of these empty lines in the docstring, see how they currently look like:

Image

Also fix missing closing block syntax here.

Comment on lines +398 to +401
let buffer = MutableBuffer::from(value);
// SAFETY
// - buffer is aligned to T
unsafe { Self::new_from_buffer(buffer) }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only usage of this API in our codebase, which already upholds the invariants

@Jefffrey
Copy link
Contributor Author

I wouldn't mind making this API private if thats preferred (since its only used in one place) 🤔

@Jefffrey Jefffrey added the api-change Changes to the arrow API label Jan 28, 2026
@Jefffrey Jefffrey requested review from alamb and viirya January 28, 2026 11:20
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me -- thanks @Jefffrey and @yilin0518

_marker: PhantomData<T>,
}

impl<T: ArrowNativeType> BufferBuilder<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have discovered over time that BufferBuilder is typically not as performant as using Vec -- but maybe that is a comment for a different day

@alamb
Copy link
Contributor

alamb commented Jan 28, 2026

fyi @viirya

///
/// ```
/// # use arrow_buffer::builder::BufferBuilder;
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I saw your comment now.

@alamb alamb merged commit 2c0eba4 into apache:main Jan 29, 2026
27 of 28 checks passed
@alamb
Copy link
Contributor

alamb commented Jan 29, 2026

Thanks @viirya and @Jefffrey

@Jefffrey Jefffrey deleted the make-new-from-buffer-unsafe branch January 29, 2026 23:13
@Jefffrey
Copy link
Contributor Author

Thanks @yilin0518 for reporting this

@alamb
Copy link
Contributor

alamb commented Jan 30, 2026

I think we should backport this to the 57 release

The only thing I worry about is that it is technically a breaking API change

@Jefffrey
Copy link
Contributor Author

I think breaking changes are fine if its for the sake of fixing unsound behaviour 🤔

@alamb
Copy link
Contributor

alamb commented Jan 30, 2026

I will plan to back port it then

alamb pushed a commit to alamb/arrow-rs that referenced this pull request Jan 30, 2026
# Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax.
-->

- Closes apache#9287

# Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

As identified by the issue, it is possible to use safe APIs to produce
an unaligned buffer in `BufferBuilder`.

# What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Mark `BufferBuilder::new_from_buffer` as unsafe since there are
alignment invariants that must be upheld to use it correctly.

Also fix some docstrings.

# Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Marking API unsafe only.

# Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.

If there are any breaking changes to public APIs, please call them out.
-->

Yes.
@alamb
Copy link
Contributor

alamb commented Jan 30, 2026

alamb added a commit that referenced this pull request Feb 2, 2026
…) (#9312)

- Part of #9240
- Related to #9287

This is a backport of the following PR to the 57 line
- #9292 from @Jefffrey 

Note this is technically an API change (if this API is used by
downstream crates they will have to mark the callsites with `unsafe`).
However per a conversation with @Jefffrey
#9292 (comment) and
our general focus on safety, it seems better to err on the safety side

> I think breaking changes are fine if its for the sake of fixing
unsound behaviour 🤔

Co-authored-by: Jeffrey Vo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change Changes to the arrow API arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

arrow-buffer: Potential Undefined Behavior Reported by Miri

3 participants