[DF] Distributed RDataFrame doesn't handle friend trees correctly #7584
Closed
Description
- Checked for duplicates
Describe the bug
Distributed RDataFrame has support for friend trees, but it seems something is missing. See the gist at https://gist.github.com/vepadulano/b42343bff7297958c46675577bce46a9 :
- Two RDataFrames are created and one column is filled in each
- They are both snapshotted to disk and merged into a single file through hadd
- A TChain is created with one column from the merged file and a friend TChain is attached to it with the second column from the merged file
- A distributed RDataFrame with the Spark backend is created using the TChain as input, then one histogram per column is booked and drawn to a canvas
- The operation fails with:
TypeError: Template method resolution failed: none of the 4 overloaded methods succeeded. Full details: ROOT::RDF::RResultPtr<TH1D> ROOT::RDF::RInterface<ROOT::Detail::RDF::RRange<ROOT::Detail::RDF::RLoopManager>,void>::Histo1D(experimental::basic_string_view<char,char_traits<char> > vName) => runtime_error: Unknown column: myfriend.rnd
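For readers without access to the gist, the steps above can be approximated by the following sketch. It is not the gist's code: the file names, tree names, event count and Gaussian expressions are hypothetical, and only the friend alias myfriend and the column name rnd are taken from the error message. It assumes the Spark backend is reachable as ROOT.RDF.Experimental.Distributed.Spark.RDataFrame, as in ROOT master at the time of this report, and that hadd is on the PATH.

```python
import subprocess

# Hypothetical file and tree names; only "myfriend" and "rnd" appear in the report.
MERGED_FILE = "merged.root"
MAIN_TREE = "main_tree"
FRIEND_TREE = "friend_tree"
FRIEND_ALIAS = "myfriend"
COLUMN = "rnd"


def friend_column(alias, name):
    """Return the qualified name of a column that lives in a friend tree."""
    return f"{alias}.{name}"


def write_inputs(nentries=100):
    """Snapshot two single-column datasets and merge them with hadd."""
    import ROOT  # imported lazily so the module loads without ROOT

    ROOT.RDataFrame(nentries).Define(COLUMN, "gRandom->Gaus(10)").Snapshot(
        MAIN_TREE, "main.root")
    ROOT.RDataFrame(nentries).Define(COLUMN, "gRandom->Gaus(-10)").Snapshot(
        FRIEND_TREE, "friend.root")
    # -f overwrites the target file if it already exists
    subprocess.run(["hadd", "-f", MERGED_FILE, "main.root", "friend.root"],
                   check=True)


def build_chain():
    """Build the main TChain and attach the friend TChain under the alias."""
    import ROOT

    main = ROOT.TChain(MAIN_TREE)
    main.Add(MERGED_FILE)
    friend = ROOT.TChain(FRIEND_TREE)
    friend.Add(MERGED_FILE)
    main.AddFriend(friend, FRIEND_ALIAS)
    # Return both so the friend chain is not garbage collected
    return main, friend


def run(distributed=True, output="friends.png"):
    """Book one histogram per column and draw both on one canvas."""
    import ROOT

    main, friend = build_chain()
    if distributed:
        # Assumed location of the Spark backend at the time of the report
        make_df = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame
    else:
        make_df = ROOT.RDataFrame
    df = make_df(main)
    h_main = df.Histo1D(COLUMN)
    h_friend = df.Histo1D(friend_column(FRIEND_ALIAS, COLUMN))
    canvas = ROOT.TCanvas()
    h_main.Draw()
    h_friend.Draw("SAME")
    canvas.SaveAs(output)
```

Calling write_inputs() once and then run(distributed=False) exercises the plain RDataFrame path. With run(distributed=True), the second Histo1D call is the one expected to raise the "Unknown column: myfriend.rnd" error quoted above.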
Expected behavior
The program should not fail. In fact, substituting the distributed RDataFrame object with a plain RDataFrame gives the correct output image.
To Reproduce
- Source an environment with ROOT master
- Download the linked gist
- Run python friendtrees_spark.py
Setup
Fedora 32
ROOT version: master
Built from source
Additional context
Thanks to @Zeguivert for originally reporting this issue