Skip to content

[RF] Test schema evolution of RooDataHist #8037

@hageboeck

Description

@hageboeck

Originally posted by @guitargeek in #7859 (comment)

Hi @hageboeck! For the schema evolution from the previous version, I just made a test myself where I wrote a RooDataHist to a file with ROOT 6.24 and read it back with ROOT master + this PR. Do you think I should implement a unit test for that? I didn't think this was necessary, after all the change is rather trivial (removal of _realVars class member).

I see that I never really replied to the above:
Yes, I think that kind of a test like this makes sense. Consider this:

  • stressRooFit has a reference file that reads in some RooFit objects. This kind of serves as a schema evolution test, but that's more by accident than by design.
  • The reference file evolves with ROOT, though! When you find that small things have to be changed, you recreate the reference file by running ./stressRooFit with the correct arguments.
  • Now you are not testing 6.2[02] schema evolution, but whatever was in master when you created that reference. This is not a good and stable test.
  • Here's a test suite that tests the I/O of categories across several ROOT versions. It's trivial to extend it by creating another reference file and adding it to the test suite:
    class RooCategoryIO : public testing::TestWithParam<const char*> {
    public:
    /* Test the reading of a simple mock category that has the states
    * one = 0
    * two = 1
    * three = 2
    * four = 3
    * The ranges "evens" and "odds" for even and odd state names are defined.
    * Now, we check that set ranges are read and written properly, and that
    * sharing of those ranges works even after reading back.
    * A mock file can be created as follows:
    RooCategory cat("cat", "a category")
    cat.defineType("one")
    cat.defineType("two")
    cat.defineType("three")
    cat.defineType("four")
    cat.addToRange("evens", "two,four")
    cat.addToRange("odds", "one,three")
    RooDataSet data("data", "a dataset with a category", RooArgSet(cat))
    data.fill()
    TFile outfile("/tmp/testCategories.root", "RECREATE")
    outfile.WriteObject(&cat, "catOrig")
    outfile.WriteObject(&data, "data")
    */
    void SetUp() override {
    TFile file(GetParam(), "READ");
    ASSERT_TRUE(file.IsOpen());
    file.GetObject("catOrig", cat);
    ASSERT_NE(cat, nullptr);
    file.GetObject("data", data);
    ASSERT_NE(data, nullptr);
    catFromDataset = dynamic_cast<RooCategory*>(data->get(0)->find("cat"));
    ASSERT_NE(catFromDataset, nullptr);
    }
    void TearDown() override {
    delete cat;
    delete data;
    }
    protected:
    enum State_t {one = 0, two = 1, three = 2, four = 3};
    RooCategory* cat{nullptr};
    RooDataSet* data{nullptr};
    RooCategory* catFromDataset{nullptr};
    };
    TEST_P(RooCategoryIO, ReadWithRanges) {
    for (RooCategory* theCat : {cat, catFromDataset}) {
    ASSERT_TRUE(theCat->hasRange("odds"));
    ASSERT_TRUE(theCat->hasRange("evens"));
    EXPECT_TRUE(theCat->isStateInRange("odds", one));
    EXPECT_TRUE(theCat->isStateInRange("odds", "three"));
    EXPECT_TRUE(theCat->isStateInRange("evens", two));
    EXPECT_TRUE(theCat->isStateInRange("evens", "four"));
    EXPECT_FALSE(theCat->isStateInRange("odds", "two"));
    EXPECT_FALSE(theCat->isStateInRange("odds", four));
    EXPECT_FALSE(theCat->isStateInRange("evens", "one"));
    EXPECT_FALSE(theCat->isStateInRange("evens", three));
    }
    }
    TEST_P(RooCategoryIO, TestThatRangesAreShared) {
    cat->addToRange("evens", "three");
    EXPECT_TRUE(catFromDataset->isStateInRange("evens", "three"));
    }
    INSTANTIATE_TEST_SUITE_P(IO_SchemaEvol, RooCategoryIO,
    testing::Values("categories_v620.root", "categories_v621.root", "categories_v622.root"));

So the answer is:
It would be preferable to have a similar test to what I pasted above for RooDataHist. A stub is already here for RooDataHistv4:

TEST(RooDataHist, ReadV4Legacy)
{
TFile v4Ref("dataHistv4_ref.root");
ASSERT_TRUE(v4Ref.IsOpen());
RooDataHist* legacy = nullptr;
v4Ref.GetObject("dataHist", legacy);
ASSERT_NE(legacy, nullptr);
RooDataHist& dataHist = *legacy;
constexpr double targetBinContent = 20.;
RooArgSet* legacyVals = dataHist.get(10)->snapshot();
EXPECT_EQ(static_cast<RooAbsReal*>(legacyVals->find("x"))->getVal(),
static_cast<RooAbsReal*>(dataHist.get(10)->find("x"))->getVal());
EXPECT_EQ(dataHist.numEntries(), 20);
EXPECT_EQ(dataHist.sumEntries(), 20 * targetBinContent);
static_cast<RooRealVar*>(legacyVals->find("x"))->setVal(0.);
dataHist.get(*legacyVals); // trigger side effect for weight below.
const double weight = dataHist.weight();
ASSERT_EQ(weight, targetBinContent);
const double targetError = sqrt(10*4.);
EXPECT_NEAR(dataHist.weightError(RooAbsData::Poisson), targetError, 1.5); // TODO What is the actual value?
{
double lo, hi;
dataHist.weightError(lo, hi, RooAbsData::Poisson);
EXPECT_LT(lo, hi);
EXPECT_NEAR(lo, weight - computePoissonLower(weight), 3.E-2);
EXPECT_NEAR(hi, computePoissonUpper(weight) - weight, 3.E-2);
}
EXPECT_NEAR(dataHist.weightError(RooAbsData::SumW2),
targetError, 1.E-14);
{
double lo, hi;
dataHist.weightError(lo, hi, RooAbsData::SumW2);
EXPECT_NEAR(lo, targetError, 1.E-14);
EXPECT_NEAR(lo, targetError, 1.E-14);
EXPECT_EQ(lo, hi);
}
RooArgSet* coordsAt10 = dataHist.get(10)->snapshot();
const double weightBin10 = dataHist.weight();
EXPECT_NEAR(static_cast<RooRealVar*>(coordsAt10->find("x"))->getVal(), 0.5, 1.E-1);
EXPECT_EQ(weight, weightBin10);
}

By making it a value-parametrised test as for the categories, you could instantiate it very quickly for all RooDataHist versions.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions