RooFit::MultiProcess & TestStatistics part 5: RooGradMinimizerFcn#8596
Conversation
|
Starting build on |
|
Build failed on ROOT-performance-centos8-multicore/default. Warnings:
Failing tests: |
|
Build failed on ROOT-ubuntu16/nortcxxmod. Warnings:
Failing tests: |
|
Build failed on windows10/cxx14. Errors:
And 10 more |
|
Build failed on ROOT-debian10-i386/cxx14. Warnings:
Failing tests:
|
|
Build failed on mac11.0/cxx17. Failing tests: |
|
Build failed on mac1014/python3. Failing tests: |
hageboeck
left a comment
There was a problem hiding this comment.
Hi, I tried my best, but this is too much to review. All code that's longer than a few lines had to go unchecked.
I also saw merge commits, which ROOT doesn't use. A fresh rebase might be beneficial. That might also give an opportunity to kill the back-and-forth reverts that were a bit confusing to review.
I'm also not really sure if I hadn't seen some of those (or similar) commits. If that was desired, e.g. because this PR lives on top of another: OK
If that's not the intention, something is wrong.
One recommendation:
If you change whitespaces or other kinds of formatting or documentation, mark the commit with [NFC] Change formatting or [Docs] Fix docstrings of ..., and don't mix this with code changes. This makes the review a lot easier (especially because the reviewer can easily skip all things that you promise to be NFC).
| @@ -456,6 +456,7 @@ ROOT_STANDARD_LIBRARY_PACKAGE(RooFitCore | |||
| Matrix | |||
| Tree | |||
| Minuit | |||
There was a problem hiding this comment.
Is there a Minuit dependency? @guitargeek ?
By now, this might be behind interfaces.
6b129be to
1df6d18
Compare
|
Starting build on |
1df6d18 to
95b8c3b
Compare
|
Starting build on |
|
Build failed on ROOT-ubuntu16/nortcxxmod. Errors:
|
|
Build failed on ROOT-performance-centos8-multicore/default. Errors:
|
|
Build failed on ROOT-debian10-i386/cxx14. Errors:
|
|
Build failed on mac1014/python3. Errors:
|
|
Build failed on mac11.0/cxx17. Errors:
|
|
Build failed on windows10/cxx14. Errors:
|
95b8c3b to
692fae6
Compare
|
Starting build on |
|
Build failed on mac11.0/cxx17. Errors:
|
|
Build failed on ROOT-performance-centos8-multicore/default. Errors:
|
|
Build failed on windows10/cxx14. |
|
Starting build on |
|
Build failed on ROOT-performance-centos8-multicore/default. Failing tests: |
|
Build failed on ROOT-ubuntu16/nortcxxmod. Failing tests: |
|
Are those tests failing because of the changes in this PR or were they already failing? I see on my laptop (I don't see the failure messages in Jenkins) that RooFit stress tests give the following output: |
|
Build failed on ROOT-debian10-i386/cxx14. Errors:
Failing tests: |
|
Build failed on mac11.0/cxx17. Failing tests: |
|
Build failed on mac1014/python3. Failing tests: |
|
Starting build on |
|
Build failed on ROOT-debian10-i386/cxx14. Failing tests: |
|
Build successful! Ready to merge? |
guitargeek
left a comment
There was a problem hiding this comment.
Thanks for the PR and following up on the review comments! This PR introduced the RooGradMinimizerFcn, which is the second implementation of RooAbsMinimizerFcn after the good old RooMinimizerFcn. The unit test is disabled for now, but it should be enabled with the next PRs.
From comments by @hageboeck on PR #8596 (#8596): - Include reordering - Camel case member function names - Inline simple functions
From comments by @guitargeek and @hageboeck on PR #8596 (#8596): In RooMinimizer: - Removed template parameter from RooMinimizer ctor. - Remove dynamic_cast nullptr checks when FcnMode is already known. - Also replace static with dynamic_casts when downcasting; fixes clang tidy warning. - Fix compile warnings due to not returning anything at end of non-void function with switch statements. Other places: - In test_lib: remove Hex number printing class and reorder includes. - Renamed test to testRooGradMinimizerFcn(.cxx) and cleaned up old commented out code.
|
Can you please squash future PRs if there are commits such as "apply review comments" and "fix compilation errors after rebase"? In particular the latter makes |
|
Sorry! Indeed it might have been appropriate to squash and rebase here. |
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR root-project#8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call.
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR root-project#8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call.
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR root-project#8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call. In addition, a compilation warning was produced by FunctorGradHandler, which shadowed the 5-parameter DoDerivative version. To silence the warning, we just add a five-parameter overload that forwards to the two-parameter one. To implement a full 5-parameter path in the Functor class hierarchy would require more work and discussion.
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR root-project#8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call. In addition, compilation warnings were produced by overloads that shadowed the 5-parameter DoDerivative/Gradient overloads. To silence the warnings, we just add five-parameter overloads that forward to the two-parameter ones. To implement full 5-parameter paths everywhere would require more work and discussion.
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR root-project#8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call. In addition, compilation warnings were produced by overloads that shadowed the 5-parameter DoDerivative/Gradient overloads. To silence the warnings, we just add five-parameter overloads that forward to the two-parameter ones. To implement full 5-parameter paths everywhere would require more work and discussion. Also some compilation errors on Windows fixed by using vectors instead of non-standard dynamic sized arrays.
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR root-project#8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call. In addition, compilation warnings were produced by overloads that shadowed the 5-parameter DoDerivative/Gradient overloads. To silence the warnings, we just add five-parameter overloads that forward to the two-parameter ones. To implement full 5-parameter paths everywhere would require more work and discussion. Also some compilation errors on Windows fixed by using vectors instead of non-standard dynamic sized arrays.
From comments by @hageboeck on PR root-project#8596 (root-project#8596): - Include reordering - Camel case member function names - Inline simple functions
From comments by @guitargeek and @hageboeck on PR root-project#8596 (root-project#8596): In RooMinimizer: - Removed template parameter from RooMinimizer ctor. - Remove dynamic_cast nullptr checks when FcnMode is already known. - Also replace static with dynamic_casts when downcasting; fixes clang tidy warning. - Fix compile warnings due to not returning anything at end of non-void function with switch statements. Other places: - In test_lib: remove Hex number printing class and reorder includes. - Renamed test to testRooGradMinimizerFcn(.cxx) and cleaned up old commented out code.
Turns out the RooGradMinimizerFcn also still had a slight remaining bug, which was introduced by the "detemplatization" redesign of RooMinimizer in PR #8596. The problem was that the RooMinimizer ctor, which calls the RooGradMinimizerFcn ctor, which calls synchronizeGradientParameterSettings... called _context->getMultiGenFcn() (_context is the same RooMinimizer that is creating the RooGradMinimizerFcn at this point...) and that in turn tries to do a dynamic_cast on _fcn, which is going to be the RooGradMinimizerFcn after it has been assigned after construction, but this is still in progress at that point, so the dynamic_cast fails, since _fcn is still nullptr. The fix was to remove a fortunately unused parameter in NumericalDerivator::SetInitialGradient, which removes the problematic _context->getMultiGenFcn() call. In addition, compilation warnings were produced by overloads that shadowed the 5-parameter DoDerivative/Gradient overloads. To silence the warnings, we just add five-parameter overloads that forward to the two-parameter ones. To implement full 5-parameter paths everywhere would require more work and discussion. Also some compilation errors on Windows fixed by using vectors instead of non-standard dynamic sized arrays.
This PR adds
RooGradMinimizerFcn, an alternative toRooMinimizerFcnthat calculates gradient itself outside of Minuit during minimization.To be able to use
RooGradMinimizerFcn,RooMinimizerhad to be refactored to be able to useRooAbsMinimizerFcninstead of the concrete RooMinimizerFcn class. To accomodate this change, several other changes had to be made:_optConstand_funcmembers were removed, since these are now managed by the*MinimizerFcn.RooAbsMinimizerFcnas well.getMultiGenFcn,fitterFcnandfitFcnwere added which are used in several places as convenience functions to access the concreteRooAbsMinimizerFcnobject with its proper type. The concrete type matters, for instance, when passing the class to the derivator, because a gradient enabled*MinimizerFcnneeds to take different overloads than the classic function-value-onlyRooMinimizerFcn. AFcnModeenum class was added for this disambiguation as well.RooMinimizercan be done the same as always when using a classicRooMinimizerFcn. However, to activateRooGradMinimizerFcn, a new create function can be used with the proper type as template parameter. In a later PR, this pattern will be extended with another gradient-enabledRooAbsMinimizerFcninstantiation that will also include multi-processing implementations.Since we had to refactor construction, we took the opportunity to also put default parameters in the header declaration, instead of in the constructor initializer lists.
Finally, this PR changes the default minimizer from Minuit to Minuit2. Note that this may require some discussion. We already briefly touched upon this in a Zoom meeting, but haven't thoroughly mapped out consequences yet. The reason for the switch is that the derivator scheme used in
RooGradMinimizerFcnreplicates that of Minuit2, so comparing to that makes more sense. Also, it just hasn't been tested at all with Minuit 1, I'm not even sure it will work with that version.Other than these
RooMinimizerchanges and the addition ofRooGradMinimizerFcn, a new test was added calledtestRooGradMinimizer. It also uses atest_lib.hheader that will also be used later on in tests of other new likelihood minimization implementations. Do note that the test currently fails. I originally had carefully made sure the results were bit-wise exactly the same as those from using classicRooMinimizerFcn, but there must have been some change in the meantime that I have not been able to integrate yet. I suspect it is caused by a change in Kahan summation, but have to inspect. In any case, the results are minor differences that do not negatively affect the effectiveness, the minimization just takes slightly different paths to the end results that will agree within the desired precision when set high enough.Final note: this PR depends on three others: #8369, #8567 and #8569. These branches as they currently are were merged into this branch before making the one commit that adds everything in this actual commit. Do not merge this one before those. Also take care that this branch will be rebased and force pushed after each of those three are merged. In short: probably it will be easier to review this branch after those others are merged. However, of course, a review is already most welcome. Just make sure to only look at the diff from the final commit: 6b129beUpdate: two of the above dependency PRs were merged. Only #8567 still needs to be merged and hence is also still included in this PR to make it compile. I have marked these three
math/minuit2files with a review comment. All the other files actually do belong to this PR. So, in short: ready for review!Checklist: