[DF] Display gets confused by friend columns, prints them twice #8450

@eguiraud

The following code:

#include <ROOT/RDataFrame.hxx>
#include <TFile.h>
#include <TTree.h>

int main() {
  ROOT::RDataFrame(32).Define("idx", "rdfentry_").Snapshot("t1", "f.root");

  ROOT::RDF::RSnapshotOptions opts;
  opts.fMode = "update";
  ROOT::RDataFrame("t1", "f.root")
      .Define("x", "idx*idx")
      .Snapshot("t2", "f.root", {"x", "idx"}, opts);

  TFile f("f.root");
  auto *tmain = f.Get<TTree>("t1");
  auto *tfriend = f.Get<TTree>("t2");
  tmain->AddFriend(tfriend);

  ROOT::RDataFrame(*tmain).Display(".*", 100)->Print();

  return 0;
}

prints

idx | t2.x | x   | t2.idx |
0   | 0    | 0   | 0      |
1   | 1    | 1   | 1      |
2   | 4    | 4   | 2      |
3   | 9    | 9   | 3      |
4   | 16   | 16  | 4      |
5   | 25   | 25  | 5      |

Here idx and t2.idx are two different columns (one in the main tree, the other in the friend tree), but x and t2.x are two valid spellings of the same friend column, so that column should be printed only once.

I think Display gets tripped up by the output of RLoopManager::GetColumnNames(), which reports the two valid spellings of x. It should instead use the same logic as Snapshot, which removes duplicates.
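
To make the intended behaviour concrete, here is a minimal standalone sketch of the kind of de-duplication I have in mind. It is not the code Snapshot uses: `DropFriendAliases` is a hypothetical helper, and it assumes that (a) we know which column names belong to the main tree itself and (b) everything before the first dot in a name is a friend tree name. A qualified name `friend.col` is treated as a duplicate only when the short spelling `col` is also listed and does not belong to the main tree. Which of the two spellings to keep is a design choice; the sketch keeps the short one.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT): remove the "friend.col" spelling of a
// friend column when the short spelling "col" is also listed and refers to the
// same friend column, i.e. "col" is not a column of the main tree itself.
// Assumption: the text before the first '.' is a friend tree name.
std::vector<std::string> DropFriendAliases(const std::vector<std::string> &allColumns,
                                           const std::vector<std::string> &mainTreeColumns)
{
   auto contains = [](const std::vector<std::string> &v, const std::string &s) {
      return std::find(v.begin(), v.end(), s) != v.end();
   };

   std::vector<std::string> unique;
   for (const auto &name : allColumns) {
      const auto dot = name.find('.');
      if (dot != std::string::npos) {
         const std::string shortName = name.substr(dot + 1);
         // Skip the qualified spelling only if the short one is listed and is
         // not shadowed by a main-tree column with the same name.
         if (contains(allColumns, shortName) && !contains(mainTreeColumns, shortName))
            continue;
      }
      unique.push_back(name);
   }
   return unique;
}
```

With the column list from the output above (`idx`, `t2.x`, `x`, `t2.idx`) and `idx` as the only main-tree column, this returns `idx`, `x`, `t2.idx`: `t2.x` is dropped as an alias of `x`, while `t2.idx` is kept because `idx` is a distinct column of the main tree.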
