Datatype deduplication 2: switch to re_arrow2
#4883
emilk
left a comment
yay!
This gives us a 20% memory usage improvement in the best case as is (instead of 50% back in the day, because we've optimized things since then, most notably we don't have any arrow extensions anymore). We've basically reached the point where Trying to remove the stack overhead from |
All the grunt work left to get rid of polars.

- Remove all helpers and APIs built specifically for polars' `DataFrame`.
- Refactor tests that rely on dataframe joins to not require join semantics in the first place (`re_data_store` has no knowledge of those anyway).
- The one test that does require join semantics has moved over to `re_query`, where join semantics belong.
- All `polars-*` deps have been removed.

Don't look at the commit log, as it makes no sense: I changed strategies a bunch of times along the way.

---

- Part of #4789
- DNR: requires #4856

---

Part of the tiny datatype deduplication PR series:
- #4880
- #4883
Isn't the long-term solution to switch to |
I'd say that switching to Specifically,
But, there is still a lot of overhead when it comes to slicing data, e.g. a u32 slice is still |
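To illustrate the kind of slicing overhead mentioned above, here is a minimal sketch using only the standard library. `ToyDataType` and `ToyU32Slice` are hypothetical stand-ins invented for this example; they are not the actual `arrow2` or `re_arrow2` types, and their sizes say nothing about the real numbers. The sketch only shows why a per-slice handle (datatype + shared buffer pointer + offset + length) can outweigh the payload of a small slice.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical per-array datatype descriptor (NOT arrow2's real `DataType`).
#[derive(Clone, Debug, PartialEq)]
pub enum ToyDataType {
    UInt32,
    Extension(String, Box<ToyDataType>),
}

/// Hypothetical sliced view over shared `u32` storage (an assumption for
/// illustration, not the layout used by any real Arrow implementation).
pub struct ToyU32Slice {
    pub datatype: ToyDataType,
    pub values: Arc<[u32]>,
    pub offset: usize,
    pub len: usize,
}

impl ToyU32Slice {
    /// Creates a view of `len` values starting at `offset`; no data is copied.
    pub fn new(values: Arc<[u32]>, offset: usize, len: usize) -> Self {
        assert!(offset + len <= values.len(), "slice out of bounds");
        Self {
            datatype: ToyDataType::UInt32,
            values,
            offset,
            len,
        }
    }

    /// The values covered by this view.
    pub fn as_slice(&self) -> &[u32] {
        &self.values[self.offset..self.offset + self.len]
    }

    /// Bytes of actual data the view refers to.
    pub fn payload_bytes(&self) -> usize {
        self.len * size_of::<u32>()
    }

    /// Bytes taken by the handle itself, paid once per slice.
    pub fn handle_bytes() -> usize {
        size_of::<Self>()
    }
}
```

For a view over a single `u32`, `payload_bytes()` is 4 while `ToyU32Slice::handle_bytes()` is several times larger on common targets, so with many tiny slices the handles dominate memory use in this toy model.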
Grunt work to switch to `re_arrow2` and all the breaking changes that come with it.

- Part of […] `arrow2` and get rid of `polars` #4789

Part of the tiny datatype deduplication PR series:
- `re_arrow2` #4883

Checklist
- `main` build: app.rerun.io
- `nightly` build: app.rerun.io