Use a column to store categories, rather than a mapping #69
rgommers merged 5 commits into data-apis:main
We briefly touched on this PR today. We'd like to move it forward, since all interchange protocol implementers seem to be in favor of it. @vnlitvinov plans to revisit his proposal from above and either add it here or open an alternative PR. Then we can try to finalize this and update the implementations in the various libraries.
I'm not sure I quite follow; the |
91ed593 to 902ba7c
@shwina I missed your comment, so I went ahead with the original plan we discussed at the last meeting and added a commit doing so. If you don't like the idea, feel free to remove my commit. As for it not being a mapping... it somewhat is :) It maps an integer index to a category value.
After discussion on Thursday, @shwina and others were happy with the changes @vnlitvinov pushed. With one change to make: rename |
Signed-off-by: Vasily Litvinov <[email protected]>
Co-authored-by: Keith Kraus <[email protected]>
This change was made in the spec part, but not the rest of this PR. I'm inclined to push a change to |
I tried this - since the tests no longer pass with this PR, and we now have an actual Pandas implementation, it may be better to completely remove this early prototype in |
Discussed in today's call: everyone is happy with deleting this prototype code.
Now that we have four real-world implementations in cuDF, Vaex, Modin and Pandas, we no longer need this prototype. Having it here makes it harder to merge other PRs (see gh-69), so let's remove it now.
rgommers left a comment
Everyone is happy and this is now a pretty small/straightforward diff. So merging. I'll follow up with all the implementers to ensure we will actually propagate these changes to all implementations.
Thanks @shwina, @vnlitvinov, @kkraus14
This PR implements one of the changes mentioned in #41, i.e.,
Replacing the previous mapping with a child column now implies that the data buffer stores integer indices into the child column.
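
To illustrate what "integer indices into the child column" means for a consumer, here is a minimal sketch in plain Python. It does not use the interchange protocol's actual objects: the function name `decode_categorical`, the list-based stand-ins for the data buffer and the child column, and the `-1` null sentinel are all assumptions made for this example only.

```python
from typing import List, Optional, Sequence


def decode_categorical(
    codes: Sequence[int],
    categories: Sequence[str],
    null_code: int = -1,  # hypothetical sentinel for missing values
) -> List[Optional[str]]:
    """Look up each integer code in the categories sequence.

    `codes` stands in for the data buffer and `categories` for the
    child column; both are plain Python sequences in this sketch.
    """
    decoded: List[Optional[str]] = []
    for code in codes:
        if code == null_code:
            # Missing value: no category to look up.
            decoded.append(None)
        elif 0 <= code < len(categories):
            decoded.append(categories[code])
        else:
            # Guard against negative codes silently wrapping around.
            raise ValueError(f"code {code} is out of range")
    return decoded


codes = [0, 2, 1, 0, -1]
categories = ["apple", "banana", "cherry"]
decoded = decode_categorical(codes, categories)
```

Because the categories live in a column rather than a Python mapping, a real consumer could in principle read them through the same buffer-based path as any other column; the list lookup above only stands in for that step.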