Add RDataFrame::DefinePerSample #6745
Closed
Description
In previous meetings (see here and here) we identified that a typical HEP analysis would benefit from a variant of Define that is evaluated only once per "dataset" rather than once per event. How a "dataset" should be identified is not yet settled. An example scenario is given below (per-sample event weights, as is typical for simulated datasets):
// Construct RDF
RDataFrame df(tree, files);
// Declare computations
auto get_scale = [](const Identifier_t& dataset)
{
    // dataset = filename.root/treename
    if (dataset.contains("Data")) return 1.0;
    else if (dataset.contains("DY")) return 0.9;
    else if (dataset.contains("WJets")) return 1.1;
    else throw std::runtime_error("Unknown dataset");
};
auto h = df.DefinePerSample("weight", get_scale)
           .Histo1D("nMuon", "weight");
// Access result
h->Draw();
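
To make the requested semantics concrete, here is a minimal, ROOT-independent sketch of "evaluate once per dataset, reuse for every event". It is not how RDataFrame would implement this. `Identifier_t` is assumed to be a plain `std::string` of the form `filename.root/treename`, and `Sample` and `weights_per_sample` are hypothetical names introduced only for illustration. Substring matching uses `find()` so that the snippet compiles as C++17.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: the dataset identifier is a string such as "filename.root/treename".
using Identifier_t = std::string;

// Same mapping as in the example above, written with find() instead of contains().
double get_scale(const Identifier_t &dataset)
{
   auto contains = [&](const char *s) { return dataset.find(s) != std::string::npos; };
   if (contains("Data")) return 1.0;
   if (contains("DY")) return 0.9;
   if (contains("WJets")) return 1.1;
   throw std::runtime_error("Unknown dataset: " + dataset);
}

// Hypothetical description of one sample: its identifier and its number of events.
struct Sample {
   Identifier_t id;
   std::size_t nEvents;
};

// Returns one weight per event. The callable runs once per sample, and its
// result is reused for all events of that sample. If nCalls is non-null, it
// receives the number of times the callable was invoked.
std::vector<double> weights_per_sample(const std::vector<Sample> &samples,
                                       const std::function<double(const Identifier_t &)> &perSample,
                                       std::size_t *nCalls = nullptr)
{
   std::vector<double> weights;
   std::size_t calls = 0;
   for (const auto &s : samples) {
      const double w = perSample(s.id); // evaluated once per sample
      ++calls;
      weights.insert(weights.end(), s.nEvents, w); // reused for every event
   }
   if (nCalls)
      *nCalls = calls;
   return weights;
}
```

With two samples of 2 and 3 events, the callable runs twice while five per-event weights are produced. That gap between the number of evaluations and the number of events is the saving this request is after.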