Skip to content

[RF] HistFactory PreprocessFunction::PrintXML() needs to escape special characters to produce valid XML #10840

@TomasDado

Description

@TomasDado
  • Checked for duplicates

Describe the bug

Calling RooStats::HistFactory::MakeModelAndMeasurementFast(meas) leads to a seg violation in root 6.26.04 with gcc11 on CentOS7 machine (lxplus), while the same code worked with 6.24.06 with gcc8 on CentOS7 machine.
This is the relevant stack trace:

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f43c411c60c in waitpid () from /lib64/libc.so.6
#1  0x00007f43c4099f62 in do_system () from /lib64/libc.so.6
#2  0x00007f43c6fe5a5c in TUnixSystem::StackTrace() () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libCore.so
#3  0x00007f43c6fe3155 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libCore.so
#4  <signal handler called>
#5  0x00007f43c481adef in __dynamic_cast () from /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-8a51a/x86_64-centos7/lib64/libstdc++.so.6
#6  0x00007f43c51b2ff1 in RooAbsCollection::insert(RooAbsArg*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libRooFitCore.so
#7  0x00007f43c51b83db in RooAbsCollection::RooAbsCollection(RooAbsCollection const&, char const*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libRooFitCore.so
#8  0x00007f43c52480f9 in RooArgList::RooArgList(RooArgList const&, char const*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libRooFitCore.so
#9  0x00007f43c573c6f5 in RooStats::HistFactory::HistoToWorkspaceFactoryFast::createStatConstraintTerms(RooWorkspace*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, ParamHistFunc&, TH1 const*, RooStats::HistFactory::Constraint::Type, double) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libHistFactory.so
#10 0x00007f43c57409d7 in RooStats::HistFactory::HistoToWorkspaceFactoryFast::MakeSingleChannelWorkspace(RooStats::HistFactory::Measurement&, RooStats::HistFactory::Channel&) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libHistFactory.so
#11 0x00007f43c5746984 in RooStats::HistFactory::HistoToWorkspaceFactoryFast::MakeSingleChannelModel(RooStats::HistFactory::Measurement&, RooStats::HistFactory::Channel&) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libHistFactory.so
#12 0x00007f43c574f3d9 in RooStats::HistFactory::MakeModelAndMeasurementFast(RooStats::HistFactory::Measurement&) () from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/lib/libHistFactory.so

Note that this happens only for some setups, so I tried to debug it a bit and see what is special about the setups that crash. I will briefly describe what I found out:

We use AddPreprocessFunction() method to reparametrise our likelihood, this generally seems to work, but for some special cases this seems to cause problems. I created the XML files for the problematic setups and tried to use hist2workspace to try to identify the issues. I found out that the executable cannot read the XML files produced if it contains & and < or > as these are not valid XML characters and need to be escaped. So e.g. if the top level XML contains function like this:

<Function Name="morph_topWidth_0.700000" Expression="(0.+(((topWidth-0.700000)>=0)&&(fabs(topWidth-0.700000)<0.300000))*(1.-(fabs(topWidth-0.700000))/0.300000))" Dependents="topWidth[0.700000,0.700000,2.000000]" />

The hist2workspace executable will complain that it cannot parse the XML. I tried replacing the problematic characters with the XML replacements like &gt; etc. This makes the hist2workspace code run and it generates the workspace. However, if I try to replace the characters in the same way for the string passed to AddPrepropcessFunction, it doesnt parse the string and complains:

input_line_48:2:89: error: use of undeclared identifier 'gt'
Double_t TFormula____id12625155560414669645(Double_t *x){ return (0.+(((x[0]-0.700000)&{gt;}=0)&#38;&#38;(fabs(x[0]-0.700000)&{lt;0.300000}))*(1.-(fabs(x[0]-0.700000))/0.300000)) ; }
                                                                                        ^

I am not 100% sure these two issue - crash in the MakeModelAndMeasurementFast code and the XML parsing are related, but I think they could be.

Expected behavior

The code should not crash when calling MakeModelAndMeasurementFast as was the case in ROOT 6.24.06. The XML generated should be a valid XML that can be processed with histo2workspace.

To Reproduce

Adding any Expression to the top level XML file that contains characters like &, < breaks histo2workpace and it seems that using formuale containing these characters in AddPrepropcessFunction crashes MakeModelAndMeasurementFast.

Setup

  1. ROOT 6.26.04, but the histo2workspace issue not being able to read ther XML file is there also (at least) in 6.24.06
  2. CentOS7 (lxplus), tested with gcc8 and gcc11
  3. ROOT setup from LCG release, 101 (6.24.06) and 102 (6.26.04)

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions