Skip to content

Add several improvements to use Unuran with the DistSampler interface#8630

Merged
lmoneta merged 2 commits intoroot-project:masterfrom
lmoneta:unuran_improvements
Sep 2, 2021
Merged

Add several improvements to use Unuran with the DistSampler interface#8630
lmoneta merged 2 commits intoroot-project:masterfrom
lmoneta:unuran_improvements

Conversation

@lmoneta
Copy link
Copy Markdown
Member

@lmoneta lmoneta commented Jul 8, 2021

This Pull request adds several improvements to use UNU.RAN using the DistSampler interface
(through the TUnuranSampler class). This is to fix issue #7332 adding functionality as requested.

The major improvements are:
- Improvements in using directly the string API from DistSampler
- Better support for setting method and method options using the DistSamplerOptions class
- Support for setting dpdf, cdf and multi-sim mode in the DistSampler interface
- Add test program for all the new DistSampler functionality

This PR fixes #7332

@lmoneta lmoneta requested a review from bellenot as a code owner July 8, 2021 09:23
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@lmoneta lmoneta requested review from guitargeek and removed request for bellenot July 8, 2021 09:23
@lmoneta lmoneta force-pushed the unuran_improvements branch from c56966c to f4939bc Compare July 8, 2021 09:28
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu16/nortcxxmod.
Running on sft-ubuntu-1604-4.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-07-08T09:45:59.357Z] FAILED: /usr/bin/ccache /usr/bin/c++ -DVECCORE_ENABLE_VC -Iginclude -I/mnt/build/workspace/root-pullrequests-build/root/core/base/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/foundation/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/cont/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/gui/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/meta/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/clib/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/rint/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/zip/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/thread/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/textinput/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/clingutils/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/base/v7/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/foundation/v7/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/unix/inc -I/mnt/build/workspace/root-pullrequests-build/root/math/mathcore/inc -I/mnt/build/workspace/root-pullrequests-build/root/math/mathcore/v7/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/imt/inc -I/mnt/build/workspace/root-pullrequests-build/root/core/multiproc/inc -Iexternals/mnt/build/workspace/root-pullrequests-build/install/include -I/mnt/build/workspace/root-pullrequests-build/root/math/minuit/inc -I/mnt/build/workspace/root-pullrequests-build/root/math/matrix/inc -Imath/unuran/unuran-1.8.0-root -Imath/unuran/unuran-1.8.0-root/src -Imath/unuran/unuran-1.8.0-root/src/utils -I/mnt/build/workspace/root-pullrequests-build/root/math/unuran/inc -I/mnt/build/workspace/root-pullrequests-build/root/graf2d/gpad/inc -I/mnt/build/workspace/root-pullrequests-build/root/hist/hist/inc -I/mnt/build/workspace/root-pullrequests-build/root/math/mathmore/inc -Imath/unuran/test -I/mnt/build/workspace/root-pullrequests-build/root/net/net/inc -I/mnt/build/workspace/root-pullrequests-build/root/io/io/inc -I/mnt/build/workspace/root-pullrequests-build/root/graf2d/graf/inc -I/mnt/build/workspace/root-pullrequests-build/root/test/unit_testing_support -isystem googletest-prefix/src/googletest/googletest/include -isystem googletest-prefix/src/googletest/googlemock/include -fdiagnostics-color=always -std=c++14 -pipe -Wshadow -Wall -W -Woverloaded-virtual -fsigned-char -pthread -O3 -std=c++14 -MD -MT math/unuran/test/CMakeFiles/testUnuranSampler.dir/unuranSampler.cxx.o -MF math/unuran/test/CMakeFiles/testUnuranSampler.dir/unuranSampler.cxx.o.d -o math/unuran/test/CMakeFiles/testUnuranSampler.dir/unuranSampler.cxx.o -c /mnt/build/workspace/root-pullrequests-build/root/math/unuran/test/unuranSampler.cxx
  • [2021-07-08T09:45:59.357Z] /mnt/build/workspace/root-pullrequests-build/root/math/unuran/test/unuranSampler.cxx:12:32: fatal error: ROOT/RMakeUnique.hxx: No such file or directory

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-07-08T10:52:55.921Z] /data/sftnight/workspace/root-pullrequests-build/root/math/unuran/test/unuranSampler.cxx:12:10: fatal error: ROOT/RMakeUnique.hxx: No such file or directory

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-07-08T10:57:34.185Z] /home/sftnight/build/workspace/root-pullrequests-build/root/math/unuran/test/unuranSampler.cxx:12:10: fatal error: ROOT/RMakeUnique.hxx: No such file or directory

@phsft-bot
Copy link
Copy Markdown

Build failed on mac1014/python3.
Running on macitois21.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-07-08T11:17:35.869Z] FAILED: math/unuran/test/CMakeFiles/testUnuranSampler.dir/unuranSampler.cxx.o
  • [2021-07-08T11:17:36.437Z] /Volumes/HD2/build/workspace/root-pullrequests-build/root/math/unuran/test/unuranSampler.cxx:12:10: fatal error: 'ROOT/RMakeUnique.hxx' file not found

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-07-08T14:43:12.767Z] FAILED: interpreter/cling/tools/plugins/clad/clad-prefix/src/clad-stamp/clad-download
  • [2021-07-08T14:43:12.767Z] CMake Error at clad-stamp/clad-download-Release.cmake:37 (message):
  • [2021-07-08T14:43:12.767Z] CMake Error at /Users/sftnight/build/workspace/root-pullrequests-build/build/interpreter/cling/tools/plugins/clad/clad-prefix/tmp/clad-gitclone.cmake:31 (message):
  • [2021-07-08T14:43:12.767Z] CMake Error at clad-stamp/clad-download-Release.cmake:47 (message):

}

/// Check if there is a parent distribution defined.
bool HasParentPdf() const { return fFunc != nullptr; }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta Wouldn't return fFunc; be more elegant here?

Copy link
Copy Markdown
Contributor

@guitargeek guitargeek Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think the currently proposed fFunc != nullptr is more elegant because there is no implicit type conversion from pointer to bool. If you just return fFunc, it's less clear what type fFunc is when you only read this function.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added also a bool function becase there was already a ParentPdf() returning a reference, and it does not work whne fFunc is a nullptr


/**
sample one bin given an estimated of the pdf in the bin
Sample one bin given an estimated of the pdf in the bin.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta estimated -> estimate?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, thank you!

/**
generate a un-binned data sets (fill the given data set)
if dataset has already data append to it
Generate a un-binned data sets by fill the given data set.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta I think there should be "Generate an un-binned data set by filling the given data set."

<A href="http://statmath.wu-wien.ac.at/unuran/doc/unuran.html#Methods_005ffor_005fCONT">UnuRan doc</A>.
A re-initialization is needed whenever distribution parameters have been changed.
Note that the method string can contain in addition to the method name all the specific method
parameters specified using the UNURAN methiod string API.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta methiod -> method


/**
Return an information string about used unuran generator method.
@param extended : if true return some help about the possible options for the the method.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta the the -> the

double fMode; /// mode of dist (1D)
double fArea; /// area of dist
std::vector<double> fNDMode; /// mode of the multi-dim distribution
const ROOT::Math::IGenFunction * fFunc1D = nullptr; /// 1D function pointer (pdf)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta I am confused with the 1D in fFunc1D and the comments. Isn't it used for 1 and more dimensions? (The same for fDPDF

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those function objects will be used only for the 1D case. For the multi-dim case the function pointer stared in the base class DistSampler will be used

int fLevel; /// debug level
double fMode; /// mode of dist (1D)
double fArea; /// area of dist
std::vector<double> fNDMode; /// mode of the multi-dim distribution
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta Are Samplers streamed? In principle it seems to be more elegant to have just the vector of modes which may have a single element if set from the 1D setter. But I guess it could break some samplers stored in files.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoneta Sorry I didn't realise that the base class allows to set multi dim functions, while the TUnuranSampler provides the inteface for 1D functions.

ROOT::Fit::DataRange * fRange; // data range
const ROOT::Math::IMultiGenFunction * fFunc; // internal function (ND)
bool fOwnFunc; /// flag to indicate if the function is owned
mutable std::vector<double> fData; ///! internal array used to cached the sample data
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding, what is the point of having the fData member marked as transient with /! when the DistSampler doesn't have a ClassDef? If there is no ClassDef, doesn't this mean that the class is not serializable and it doesn't matter if a member is transient or not?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think I get it, it's because if you have a class deriving from it like the TUnuranSampler (which has a ClassDef introduced with this commit), it also needs to know which members of the base classes should be transient or not.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guitargeek a class can be streamed in ROOT also if it has not a ClassDef. ClassDef only defines the version of the class

Comment on lines +89 to +92
std::vector<std::string> names;
// start by named options
for (auto & e : fNamOpts)
names.push_back(e.first.c_str());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

      std::vector<std::string> names;
      names.reserve(fNamOpts);
      // start by named options
      for (auto const& e : fNamOpts) 
         names.emplace(e.first);
  • std::string can be directly copy constructed from other std::string without calling c_str()
  • auto const& is preferred over auto &
  • we know how many elements the output vector we have, so reserve can be used

Also the same the two other functions below.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion

EXPECT_EQ(ret, true);
if (!ret) return;

sampler->SetMode(0.75);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why the Mode is set to 0.75 here, should it not be 3.0 because the mean of a normal distribution is also the mean?

Comment on lines +90 to +94
opt.SetAlgorithm("tdr");
// set specific options for the tdr method
opt.SetAlgoOption("c",0.);
opt.SetAlgoOption("cpoints",50);
opt.SetAlgoOption("usedars","true");
opt.SetAlgoOption("variant_gw","");
//opt.Print();
Copy link
Copy Markdown
Contributor

@guitargeek guitargeek Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test check if the algorithm options were correctly propagated to TUnuran? Or is that not so important to test explicitly?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes , this is the idea of this test. To check if one can pass the options. Real tests of the algorithms are provided in unuran itself and are outside of the scope of the wrapper classes

Copy link
Copy Markdown
Contributor

@guitargeek guitargeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Lorenzo, thanks for the developments and in particular for implementing the nice test! I made a few comments, most are not explicit suggestions but questions.

And thanks @AntoniMarcinek for spotting the typos!

Before approving, there is one change I definitely want to request, which is squashing the 2nd and 3rd commit together. I think it's good that the 1st commit and the 2nd are separate because they are incremental developments, but the third commit is just a fixup to the second one and we don't need it in the commit history I think.

@AntoniMarcinek
Copy link
Copy Markdown

@guitargeek @lmoneta Do you have any expectation on when this would be merged?

@lmoneta lmoneta force-pushed the unuran_improvements branch from 15c58f6 to 2c6245f Compare September 1, 2021 09:52
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

lmoneta and others added 2 commits September 1, 2021 12:03
 - Add possibility to SetMode
 - Add possibility to Set DPDF and CDF

Add additional improvements to use UNU.RAN using the DistSampler interface.
This is to fix issue root-project#7332
The improvements are:

- Improvements in using directly the string API from DistSampler
- Better support for setting method and method options using DistSamplerOptions
- Complete the support for setting dpdf, cdf and multi-sim mode in DistSampler
- Add test program for all lthis new DistSampler functionality

remove RMakeUnique since now minimumal C++ support is c++14
@lmoneta lmoneta force-pushed the unuran_improvements branch from 2c6245f to e938269 Compare September 1, 2021 10:04
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@lmoneta
Copy link
Copy Markdown
Member Author

lmoneta commented Sep 1, 2021

@AntoniMarcinek sorry for the delay. I think it it is fine we can then merge it soon.
@guitargeek , I have applied the changes you suggested.
Thanks

Copy link
Copy Markdown
Contributor

@guitargeek guitargeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the review comments!

@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft20.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@lmoneta lmoneta merged commit 4121b5f into root-project:master Sep 2, 2021
@lmoneta lmoneta deleted the unuran_improvements branch September 2, 2021 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improvements in TUnuran and ROOT::Math::DistSampler

5 participants