Skip to content

HybridVec: a drop-in replacement for Vec that uses stack storage for small vectors#100

Closed
abonander wants to merge 2 commits intoGankra:masterfrom
abonander:hybrid_vec
Closed

HybridVec: a drop-in replacement for Vec that uses stack storage for small vectors#100
abonander wants to merge 2 commits intoGankra:masterfrom
abonander:hybrid_vec

Conversation

@abonander
Copy link
Copy Markdown
Contributor

Inspired by @gankro's comment in the thread about SSO on discuss.rust-lang.org.

This is a drop-in replacement for Vec that dynamically moves between stack and heap for smaller vectors. The threshold is currently governed by two hardcoded constants, the size of the on-stack array and the max number of bytes in the contained type, but perhaps in the future it could be configurable at compile-time or even calculated from a percentage of the running stack size. I didn't know enough about DST to implement a dynamically-sized array so the on-stack array is a constant size.

The file is huge mainly because I copied all the tests from vec.rs to make sure it was up to snuff. In fact, a lot of code was shamelessly copied from vec.rs and adapted to HybridVec, mostly with respect to heap storage management.

I had to comment out the assertions on capacity for smaller vectors because switching to stack storage broke them (HybridVec::new().capacity() != 0) and I didn't want to expose the ARRAY_SIZE constant for the tests submodule or copy it as a magic number for each occurrence.

Some methods, such as map_in_place, dedup and vec::as_vec, (and their associated tests) were omitted on purpose to limit complexity while others were omitted accidentally because Vec's API is massive and the tests didn't complain. Please let me know of any that should be there but aren't.

I made sure all tests passed before submitting this PR. I also ran benchmarks, though I'm not exactly sure if I'm interpreting them correctly because they seem to show regressions when the vector is on-stack: Pastebin

Of course, this could just be overhead. I expect some from the widespread use of dynamic dispatch required by this implementation strategy. A real comparison should compare HybridVec to Vec for vectors of the same size to emphasize any performance improvements/regressions. Additionally, some methods have a more naiive implementation than their Vec ancestors for simplicity's sake and should be benchmarked for performance regressions.

This is very much still a work-in-progress, though I feel it's mature enough already to be included as a working prototype so it can cut its teeth in the wild.

In addition to missing needed methods and benchmarks, I neglected to write doc-comments for most of the methods because the public API behaves, for the most part, equivalently to Vec's, with the exception of shrink_to_fit which has a new move_to_stack: bool parameter, which was needed to make into_boxed_slice() work correctly for smaller vectors.

The stack-vs-heap behavior happens transparently and is never exposed to the user except for this parameter.

Comment thread src/proto/hybrid_vec.rs Outdated
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably choose Vec::push()'s behavior for heap-resizing and then use this for stack-resizing (effectively a no-op if the new capacity is within the bounds of the on-stack array).

And by "weird" I meant that it reallocates right in the method instead of going through Vec's existing reallocation API. I get that it was supposed to double the size of the allocation every time it was necessary, though I think that could have been accomplished with self.reserve(self.capacity()).

@Gankra
Copy link
Copy Markdown
Owner

Gankra commented Jan 27, 2015

My gut feeling is that it's too soon for this type to exist. We don't have the type-system necessary for it to be really right.

@abonander
Copy link
Copy Markdown
Contributor Author

I do consider it a work-in-progress prototype, though I don't understand what it needs from the type system to be "really right". Making the array size a type parameter, you mean?

@Gankra
Copy link
Copy Markdown
Owner

Gankra commented Jan 27, 2015

The array size as part of the type, and the manually-drop RFC huon linked are both highly desirable for this functionality, yes.

@abonander
Copy link
Copy Markdown
Contributor Author

But it works as-is. It seems to me that those can be implemented on this type without breaking its behavior as they become available.

Comment thread src/proto/hybrid_vec.rs Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this type can just be a struct that uses two pointers like this, there's no need to distinguish stack and heap here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess not. I'll fix that later.

@abonander abonander force-pushed the hybrid_vec branch 4 times, most recently from d4f0712 to 47f7025 Compare February 24, 2015 11:33
@abonander
Copy link
Copy Markdown
Contributor Author

@gankro @reem @huonw I got busy with life stuff and this kind-off fell off the radar. But I have finally updated it with the tweaks we discussed and brought it up to speed with the latest Rust.

However, I'm disappointed that my implementation appears to lose out to Vec in microbenchmarks:

test proto::hybrid_vec::tests::bench_small_hybrid                 ... bench:        37 ns/iter (+/- 1)
test proto::hybrid_vec::tests::bench_small_vec                    ... bench:        18 ns/iter (+/- 1)

The benchmark is just a naive let _: HybridVec<_> = (0usize .. ARRAY_SIZE).collect(). I have some ideas on where it might be chugging but I need to profile it first.

Also plenty of TODOs for doc-comments, though the API is 99.99% equivalent to Vec for the same-named methods anyways.

@Gankra
Copy link
Copy Markdown
Owner

Gankra commented Mar 18, 2015

collect-rs is being broken up into several distinct crates hosted at github.com/contain-rs

Still working out how third parties can submit crates for us to help review/maintain.

@Gankra Gankra closed this Mar 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants