Improve performance of interleave_primitive (-15% - 45%) / interleave_bytes (-10-25%) #7420
Conversation
mbutrovich
left a comment
Thanks for checking! I wonder, is this screenshot from Compiler Explorer or a local tool?
I also pushed an improvement for |
5c51c91 to
3850342
let mut offsets = BufferBuilder::<T::Offset>::new(indices.len() + 1);
offsets.append(T::Offset::from_usize(0).unwrap());
for (a, b) in indices {
let mut offsets = Vec::with_capacity(indices.len() + 1);
Vec and extend generate better code.
There are probably some other places where this pattern can be applied as well, @mbutrovich
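To illustrate the pattern being discussed, here is a minimal std-only sketch of building an offsets buffer with `Vec::with_capacity` plus `extend` rather than appending element by element through a builder. The helper name `build_offsets`, the `i32` offset type, and the `lengths` input are assumptions made for illustration; this is not the code from the PR.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not the arrow-rs API): turn the lengths of the
/// selected values into a monotonically increasing offsets buffer.
fn build_offsets(lengths: &[usize]) -> Vec<i32> {
    // Reserve the exact size up front: one leading zero plus one offset per value.
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    offsets.push(0_i32);

    // Running total of bytes seen so far; each value's end offset is pushed via `extend`.
    let mut total = 0_usize;
    offsets.extend(lengths.iter().map(|len| {
        total += *len;
        i32::try_from(total).expect("offset overflow")
    }));
    offsets
}
```

For example, `build_offsets(&[3, 0, 2])` yields `[0, 3, 3, 5]`. One plausible reason this shape helps is that `extend` over an iterator with a known length lets the compiler hoist the capacity check out of the loop, but that should be confirmed by inspecting the generated assembly.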
It's Beyond Compare, but you can use your diff tool of choice with the two .txt dumps of the machine code. I use cargo-show-asm to generate the relevant snippets of code. I don't find that Compiler Explorer generates representative code for real projects. Putting snippets in there often doesn't reflect what the compiler does with large projects with complex CFG DAGs, external crates, LTO, and inlined functions.
builder.append(v)
}
builder.finish()
let nulls = BooleanBuffer::collect_bool(indices.len(), |i| {
This is a pattern that comes up often as well
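As a rough illustration of what a `collect_bool`-style call does, the std-only sketch below packs the results of a per-index predicate into a least-significant-bit-first bitmap instead of appending one boolean at a time. The function name `collect_bool_sketch` and the byte-at-a-time packing are assumptions for illustration; the real `BooleanBuffer::collect_bool` in arrow-rs returns a `BooleanBuffer` and may pack in wider chunks.

```rust
/// Hypothetical sketch: pack `len` booleans produced by `f` into bitmap bytes,
/// where bit `i` lives in byte `i / 8` at position `i % 8` (LSB first).
fn collect_bool_sketch<F: FnMut(usize) -> bool>(len: usize, mut f: F) -> Vec<u8> {
    let mut bytes = Vec::with_capacity((len + 7) / 8);
    for chunk_start in (0..len).step_by(8) {
        let chunk_end = (chunk_start + 8).min(len);
        // Accumulate up to eight bits in a register before writing one byte.
        let mut packed = 0_u8;
        for i in chunk_start..chunk_end {
            packed |= (f(i) as u8) << (i - chunk_start);
        }
        bytes.push(packed);
    }
    bytes
}
```

With a validity slice such as `[true, false, true, true]`, calling `collect_bool_sketch(4, |i| valid[i])` sets bits 0, 2, and 3 of the first byte.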
Ah nice, thanks for the overview! I might try that in the future.
If you know the name of the function, or even part of it, you can specify all or part of it as well:
TY @pacak!


Which issue does this PR close?
Closes #7421
Closes #.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?