Skip to content

[DF] Fix Snapshot of RVecs read from a TTree#10268

Merged
eguiraud merged 2 commits intoroot-project:masterfrom
eguiraud:fix-10225
Apr 5, 2022
Merged

[DF] Fix Snapshot of RVecs read from a TTree#10268
eguiraud merged 2 commits intoroot-project:masterfrom
eguiraud:fix-10225

Conversation

@eguiraud
Copy link
Copy Markdown
Contributor

Before this commit, the logic in SnapshotHelper did not take into
account the case of RVecs that are being read from an input TTree
(that was not possible until v6.26). Because of how the logic was
structured, that meant that Snapshot was trying to write out those
RVecs as C-style arrays instead, obviously badly failing at it.

This patch tries to make the flow of this logic a bit more readable
and fixes the problem mentioned above by treating input branches
of RVec type the same way we already treat std::vectors.

This fixes #10225.

@pcanal the logic in this commit could be simplified if instead of "if it's not every other case, it must be a C-style array" I could do instead "if it's not a C-style array, then it must be one of the other cases". But I'm not sure how to check for "branch is a C-style array", see https://mattermost.web.cern.ch/root/pl/4n1qbh363tdfjc1hs3khe6y84o

@eguiraud eguiraud added this to the 6.26/02 milestone Mar 29, 2022
@eguiraud eguiraud requested review from Axel-Naumann and pcanal March 29, 2022 12:49
@eguiraud eguiraud self-assigned this Mar 29, 2022
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu2004/soversion.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

@pcanal
Copy link
Copy Markdown
Member

pcanal commented Mar 29, 2022

TBranchElement *be = dynamic_cast<TBranchElement*>(branch_ptr);
bool isarray = (be && be->GetInfo()->GetElement(be->GetID())->GetArrayDim() > 0);

and

if (!be) // && and !dynamic_cast<TBranchObject*>(branch_ptr)
{
    bool isarray = branch->GetListOfLeaves()->GetEntries() == 1 && (branch->GetListOfLeaves()->At(0)->GetLeafCount() != nullptr || branch->GetListOfLeaves()->At(0)->GetLenStatic() > 1);
}

and for TBranchObject, we probably need to look at TLeafObject::GetTypeName (or not support it)

@eguiraud
Copy link
Copy Markdown
Contributor Author

Ok, that seems more complex than the current logic 😅

eguiraud added 2 commits April 1, 2022 09:11
Until v6.26, this could never happen: RVecs were written out as
std::vectors. Now we write RVecs as RVecs, and we want to be able
not only to read them back, but to correctly wire the output
branches of a Snapshotted tree to RVec input branches.
Before this commit, the logic in SnapshotHelper did not take into
account the case of RVecs that are being read from an input TTree
(that was not possible until v6.26). Because of how the logic was
structured, that meant that Snapshot was trying to write out those
RVecs as C-style arrays instead, obviously badly failing at it.

This patch tries to make the flow of this logic a bit more readable
and fixes the problem mentioned above by treating input branches
of RVec type the same way we already treat std::vectors.

This fixes root-project#10225.
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

Copy link
Copy Markdown
Member

@Axel-Naumann Axel-Naumann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@pcanal pcanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably also have (add if not there already) a test of what happens if trying to read other STL collection into a RVec.

@eguiraud
Copy link
Copy Markdown
Contributor Author

eguiraud commented Apr 5, 2022

@pcanal I quickly tried with std::array<int, 4> but it looks like TTreeReaderArray has trouble with that type:

#include <ROOT/RDataFrame.hxx>
#include <array>

int foo() {
  {
    std::array<int, 4> a{1, 2, 3, 4};
    TFile f("f.root", "recreate");
    TTree t("t", "t");
    t.Branch("a", &a);
    t.Fill();
    t.Write();
  }

  auto c = ROOT::RDataFrame("t", "f.root")
               .Filter([](std::array<int, 4> &a) { return a[0] == 1; }, {"a"})
               .Filter(
                   [](ROOT::RVecI &v) {
                     return v.size() == 4 && v[0] == 1 && v[3] == 4;
                   },
                   {"a"})
               .Count()
               .GetValue();
  return c;
}

yields

/tmp root -l -b -q foo.cpp+

Processing foo.cpp+...
Info in <TUnixSystem::ACLiC>: creating shared library /tmp/./foo_cpp.so
Error in <TTreeReaderValueBase::GetBranchDataType()>: Must use TTreeReaderArray to read branch a: it contains an array or a collection.
Error in <TTreeReaderValueBase::CreateProxy()>: The branch a contains data of type {UNDETERMINED TYPE}, which does not have a dictionary.
terminate called after throwing an instance of 'std::runtime_error'
  what():  An error was encountered while processing the data. TTreeReader status code is: 6

I'll put it on my to-do list to investigate further.

As I mentioned above the safest way to fix this would be to switch from "if this is not a Define, or a std::vector, or an RVec, or a TClonesArray, then it must be a C-style array and it needs special care" to "if this is not a C-style array, then no special care needed", but even with your help I could not figure out how to code "is this a C-style array?".

@eguiraud eguiraud merged commit 12bd187 into root-project:master Apr 5, 2022
@eguiraud eguiraud deleted the fix-10225 branch April 5, 2022 09:11
@eguiraud
Copy link
Copy Markdown
Contributor Author

eguiraud commented Jun 7, 2023

Finally got around to check this and the situation is pretty much ok.

std::vector, std::array and std::deque branches can all be written out from a Snapshot correctly as those same types. Only std::vector can be written out as an RVec as well, with Snapshot doing the conversion on the fly.

There are a few warts in error reporting for unsupported cases but that's another issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TChain doesn't recognise ROOT::VecOps::RVec<double> as a column type

4 participants