How to access RNTuple in a BulkIO style #7112
jpivarski wants to merge 2 commits into root-project:v6-22-00-patches from jpivarski:jpivarski/vCHEP-2021-studies
|
Can one of the admins verify this patch? |
|
@jpivarski Many thanks for sharing, that's very helpful! The cluster and page sizes should be addressed in #7853 The |
jblomer left a comment
(should have left the regular comment as a review comment)
|
That would be a good interface. There would always be a practical limit on the number of items that one could view, but suppose we just don't want to specify it (and risk running out of memory in extreme cases). Can Also, who owns the Might the |
|
@oshadura should be on this thread, too. |
|
I think we would not need |
|
It seems like this discussion has concluded. Feel free to open a new PR to ROOT |
This PR is not intended to be merged into ROOT! That's why it's a draft!
The purpose of this PR is to show which private members I had to make public to access RNTuple in a BulkIO style.
Two of these changes were just to parameterize the cluster and page sizes:
- `fClusterSizeEntries` was made public so that I could set it and make it apples-to-apples with the other formats.
- `kDefaultElementsPerPage = 2097152` is large, but 8× less than the maximum size that can be compressed. The maximum is `0xffffff` because the header provides 3 bytes to specify the uncompressed size, so that uncompressed size can't exceed that. The number I chose here is `2**21`, which is 8× below that limit, to allow for 8-byte integers and floating point numbers. What's probably missing here is the logic for splitting the data to be compressed into a series of blocks with this maximum size. (TTree and normal serialized objects do that.)

The rest of the changes are just turning private/protected members into public ones so that they can be read directly in a BulkIO style. Here's how that's done: suppose you're filling a buffer named `array` using a `view` of type `V` returned by `GetViewCollection` or `GetView<T>`. We know the `length` of elements to read, so the function is

Here's a sample usage:
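
Separately from the BulkIO function, the page-size arithmetic and the block-splitting step mentioned above can be sketched in a few lines of Python. This is only an illustration, not code from the PR or from ROOT: the constants come from the description above, while `split_into_blocks` and the variable names are hypothetical. It assumes the limit applies to the byte size of the uncompressed data and that a page holds 8-byte elements.

```python
# Illustrative sketch only; helper and variable names are hypothetical, not ROOT API.
MAX_UNCOMPRESSED_SIZE = 0xFFFFFF   # largest value that fits in a 3-byte size field
ELEMENTS_PER_PAGE = 2**21          # kDefaultElementsPerPage from the PR (2097152)
ELEMENT_SIZE = 8                   # bytes in a 64-bit integer or a double


def split_into_blocks(data: bytes, max_block: int = MAX_UNCOMPRESSED_SIZE) -> list:
    """Split data into consecutive blocks of at most max_block bytes."""
    return [data[i:i + max_block] for i in range(0, len(data), max_block)]


# Bytes in a completely full page of 8-byte elements (assumed worst case).
page_bytes = ELEMENTS_PER_PAGE * ELEMENT_SIZE

# Split a zero-filled stand-in for that page into blocks that respect the limit.
blocks = split_into_blocks(bytes(page_bytes))

print(MAX_UNCOMPRESSED_SIZE, page_bytes, [len(b) for b in blocks])
# prints: 16777215 16777216 [16777215, 1]
```

Under these assumptions a completely full page of 8-byte elements is one byte larger than `0xffffff`, so a splitting step of this kind would be needed at least for that edge case.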