Add several improvements to use Unuran with the DistSampler interface#8630
Add several improvements to use Unuran with the DistSampler interface#8630lmoneta merged 2 commits intoroot-project:masterfrom
Conversation
|
Starting build on |
c56966c to
f4939bc
Compare
|
Starting build on |
|
Build failed on ROOT-ubuntu16/nortcxxmod. Errors:
|
|
Build failed on ROOT-performance-centos8-multicore/default. Errors:
|
|
Build failed on ROOT-debian10-i386/cxx14. Errors:
|
|
Build failed on mac1014/python3. Errors:
|
|
Starting build on |
|
Build failed on ROOT-debian10-i386/cxx14. Failing tests: |
|
Build failed on mac11.0/cxx17. Errors:
|
| } | ||
|
|
||
| /// Check if there is a parent distribution defined. | ||
| bool HasParentPdf() const { return fFunc != nullptr; } |
There was a problem hiding this comment.
@lmoneta Wouldn't return fFunc; be more elegant here?
There was a problem hiding this comment.
Personally I think the currently proposed fFunc != nullptr is more elegant because there is no implicit type conversion from pointer to bool. If you just return fFunc, it's less clear what type fFunc is when you only read this function.
There was a problem hiding this comment.
I added also a bool function becase there was already a ParentPdf() returning a reference, and it does not work whne fFunc is a nullptr
math/mathcore/inc/Math/DistSampler.h
Outdated
|
|
||
| /** | ||
| sample one bin given an estimated of the pdf in the bin | ||
| Sample one bin given an estimated of the pdf in the bin. |
There was a problem hiding this comment.
@lmoneta estimated -> estimate?
math/mathcore/inc/Math/DistSampler.h
Outdated
| /** | ||
| generate a un-binned data sets (fill the given data set) | ||
| if dataset has already data append to it | ||
| Generate a un-binned data sets by fill the given data set. |
There was a problem hiding this comment.
@lmoneta I think there should be "Generate an un-binned data set by filling the given data set."
math/unuran/inc/TUnuran.h
Outdated
| <A href="http://statmath.wu-wien.ac.at/unuran/doc/unuran.html#Methods_005ffor_005fCONT">UnuRan doc</A>. | ||
| A re-initialization is needed whenever distribution parameters have been changed. | ||
| Note that the method string can contain in addition to the method name all the specific method | ||
| parameters specified using the UNURAN methiod string API. |
math/unuran/inc/TUnuran.h
Outdated
|
|
||
| /** | ||
| Return an information string about used unuran generator method. | ||
| @param extended : if true return some help about the possible options for the the method. |
| double fMode; /// mode of dist (1D) | ||
| double fArea; /// area of dist | ||
| std::vector<double> fNDMode; /// mode of the multi-dim distribution | ||
| const ROOT::Math::IGenFunction * fFunc1D = nullptr; /// 1D function pointer (pdf) |
There was a problem hiding this comment.
@lmoneta I am confused with the 1D in fFunc1D and the comments. Isn't it used for 1 and more dimensions? (The same for fDPDF
There was a problem hiding this comment.
Those function objects will be used only for the 1D case. For the multi-dim case the function pointer stared in the base class DistSampler will be used
| int fLevel; /// debug level | ||
| double fMode; /// mode of dist (1D) | ||
| double fArea; /// area of dist | ||
| std::vector<double> fNDMode; /// mode of the multi-dim distribution |
There was a problem hiding this comment.
@lmoneta Are Samplers streamed? In principle it seems to be more elegant to have just the vector of modes which may have a single element if set from the 1D setter. But I guess it could break some samplers stored in files.
There was a problem hiding this comment.
@lmoneta Sorry I didn't realise that the base class allows to set multi dim functions, while the TUnuranSampler provides the inteface for 1D functions.
| ROOT::Fit::DataRange * fRange; // data range | ||
| const ROOT::Math::IMultiGenFunction * fFunc; // internal function (ND) | ||
| bool fOwnFunc; /// flag to indicate if the function is owned | ||
| mutable std::vector<double> fData; ///! internal array used to cached the sample data |
There was a problem hiding this comment.
Just for my understanding, what is the point of having the fData member marked as transient with /! when the DistSampler doesn't have a ClassDef? If there is no ClassDef, doesn't this mean that the class is not serializable and it doesn't matter if a member is transient or not?
There was a problem hiding this comment.
Ah I think I get it, it's because if you have a class deriving from it like the TUnuranSampler (which has a ClassDef introduced with this commit), it also needs to know which members of the base classes should be transient or not.
There was a problem hiding this comment.
@guitargeek a class can be streamed in ROOT also if it has not a ClassDef. ClassDef only defines the version of the class
| std::vector<std::string> names; | ||
| // start by named options | ||
| for (auto & e : fNamOpts) | ||
| names.push_back(e.first.c_str()); |
There was a problem hiding this comment.
std::vector<std::string> names;
names.reserve(fNamOpts);
// start by named options
for (auto const& e : fNamOpts)
names.emplace(e.first);std::stringcan be directly copy constructed from otherstd::stringwithout callingc_str()auto const&is preferred overauto &- we know how many elements the output vector we have, so
reservecan be used
Also the same the two other functions below.
There was a problem hiding this comment.
Thanks for the suggestion
math/unuran/test/unuranSampler.cxx
Outdated
| EXPECT_EQ(ret, true); | ||
| if (!ret) return; | ||
|
|
||
| sampler->SetMode(0.75); |
There was a problem hiding this comment.
I don't understand why the Mode is set to 0.75 here, should it not be 3.0 because the mean of a normal distribution is also the mean?
| opt.SetAlgorithm("tdr"); | ||
| // set specific options for the tdr method | ||
| opt.SetAlgoOption("c",0.); | ||
| opt.SetAlgoOption("cpoints",50); | ||
| opt.SetAlgoOption("usedars","true"); | ||
| opt.SetAlgoOption("variant_gw",""); | ||
| //opt.Print(); |
There was a problem hiding this comment.
Does this test check if the algorithm options were correctly propagated to TUnuran? Or is that not so important to test explicitly?
There was a problem hiding this comment.
Yes , this is the idea of this test. To check if one can pass the options. Real tests of the algorithms are provided in unuran itself and are outside of the scope of the wrapper classes
There was a problem hiding this comment.
Hi Lorenzo, thanks for the developments and in particular for implementing the nice test! I made a few comments, most are not explicit suggestions but questions.
And thanks @AntoniMarcinek for spotting the typos!
Before approving, there is one change I definitely want to request, which is squashing the 2nd and 3rd commit together. I think it's good that the 1st commit and the 2nd are separate because they are incremental developments, but the third commit is just a fixup to the second one and we don't need it in the commit history I think.
|
@guitargeek @lmoneta Do you have any expectation on when this would be merged? |
15c58f6 to
2c6245f
Compare
|
Starting build on |
- Add possibility to SetMode - Add possibility to Set DPDF and CDF Add additional improvements to use UNU.RAN using the DistSampler interface. This is to fix issue root-project#7332 The improvements are: - Improvements in using directly the string API from DistSampler - Better support for setting method and method options using DistSamplerOptions - Complete the support for setting dpdf, cdf and multi-sim mode in DistSampler - Add test program for all lthis new DistSampler functionality remove RMakeUnique since now minimumal C++ support is c++14
2c6245f to
e938269
Compare
|
Starting build on |
|
@AntoniMarcinek sorry for the delay. I think it it is fine we can then merge it soon. |
guitargeek
left a comment
There was a problem hiding this comment.
Thanks for addressing the review comments!
|
Build failed on mac11.0/cxx17. Failing tests: |
This Pull request adds several improvements to use UNU.RAN using the DistSampler interface
(through the TUnuranSampler class). This is to fix issue #7332 adding functionality as requested.
The major improvements are:
- Improvements in using directly the string API from
DistSampler- Better support for setting method and method options using the
DistSamplerOptionsclass- Support for setting dpdf, cdf and multi-sim mode in the DistSampler interface
- Add test program for all the new DistSampler functionality
This PR fixes #7332