[DF] Wrong regex substitution when generating code to jit #11002

@eguiraud

A reproducer:

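#include <ROOT/RDataFrame.hxx>
#include <TFile.h>
#include <TTree.h>

int main() {
  {
    // Write two trees, "t" and "fr", each with a column named "x".
    auto df = ROOT::RDataFrame(10).Define("x", [] { return 42; });
    df.Snapshot("t", "f.root");
    df.Snapshot("fr", "fr.root");
  }

  TFile f("f.root");
  auto *t = f.Get<TTree>("t");
  TFile frf("fr.root");
  auto *fr = frf.Get<TTree>("fr");
  t->AddFriend(fr); // the friend's column is accessible as "fr.x"
  ROOT::RDataFrame df(*t);
  // Jitting this filter expression triggers the error below.
  df.Filter("x > 0 && fr.x > 0").Count().GetValue();
}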

errors out with:

input_line_32:2:67: error: use of undeclared identifier 'fr'
auto func0(const Int_t var0, const Int_t var1){return var0 > 0 && fr.var0 > 0
                                                                  ^

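The reason is that, when generating the code to jit, we substitute column names with placeholder names var0, var1, and so on. In this case the substitution starts with "x", so the "x" inside "fr.x" is replaced as well, resulting in the broken expression with fr.var0.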

I think a possible fix is to perform these substitutions from the longest to the shortest column names.
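
To illustrate the idea, here is a minimal standalone sketch of the substitution in both orders. It is not ROOT's actual code: the helper names `EscapeDots` and `SubstituteColumns` are hypothetical, and it assumes column names consist only of identifier characters and dots.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only; not part of ROOT.
// Escapes '.' so that a name like "fr.x" is matched literally by the regex.
std::string EscapeDots(const std::string &name)
{
   std::string out;
   for (char c : name) {
      if (c == '.')
         out += '\\';
      out += c;
   }
   return out;
}

// Hypothetical helper for illustration only; not part of ROOT.
// Replaces each column name in `expr` with the placeholder "var<i>", where i
// is the index of the column in `columns`. If `longestFirst` is true, longer
// names are substituted before shorter ones.
std::string SubstituteColumns(std::string expr, const std::vector<std::string> &columns,
                              bool longestFirst)
{
   typedef std::pair<std::string, std::string> Replacement; // (column name, placeholder)
   std::vector<Replacement> repl;
   for (std::size_t i = 0; i < columns.size(); ++i)
      repl.emplace_back(columns[i], "var" + std::to_string(i));

   if (longestFirst) {
      std::stable_sort(repl.begin(), repl.end(), [](const Replacement &a, const Replacement &b) {
         return a.first.size() > b.first.size();
      });
   }

   for (const Replacement &r : repl) {
      // "\b" only checks word boundaries; '.' is not a word character,
      // so "\bx\b" also matches the "x" inside "fr.x".
      const std::regex re("\\b" + EscapeDots(r.first) + "\\b");
      expr = std::regex_replace(expr, re, r.second);
   }
   return expr;
}
```

With columns `{"x", "fr.x"}` and the expression `"x > 0 && fr.x > 0"`, calling `SubstituteColumns` with `longestFirst = false` yields `var0 > 0 && fr.var0 > 0`, matching the broken code in the error message, while `longestFirst = true` yields `var0 > 0 && var1 > 0`. The actual fix in ROOT may be implemented differently.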

First reported at https://root-forum.cern.ch/t/rdataframe-string-filter-question/50872 .
