Skip to content

[hist] Fix pattern-matching bug in TFormula::HandleLinear()#8801

Merged
guitargeek merged 2 commits intoroot-project:masterfrom
guitargeek:TFormula_HandleLinear_1
Aug 5, 2021
Merged

[hist] Fix pattern-matching bug in TFormula::HandleLinear()#8801
guitargeek merged 2 commits intoroot-project:masterfrom
guitargeek:TFormula_HandleLinear_1

Conversation

@guitargeek
Copy link
Copy Markdown
Contributor

Simple C++ code to demonstrate the problem:

TF1{"f1", "1.0++x++x*x++x*x*x", -1.0, 1.0}.Print();

Before this commit, running this line gave you:

Formula based function:     f1
        f1 : 1.0++x++x*x++x*x*x Ndim= 1, Npar= 3, Number= 0
 Formula expression:
        ([p0]*(1.0))+([p1]*(x))+([p2]*(x*x))+([p2]*(x*x))*x

The problem is that TFormula::HandleLinear() uses pattern-matching to
replace the strings representing the linear terms. This is problematic
if one of the strings is a substring of another one.

@guitargeek guitargeek added the bug label Aug 5, 2021
@guitargeek guitargeek self-assigned this Aug 5, 2021
@guitargeek guitargeek requested a review from lmoneta as a code owner August 5, 2021 00:38
@guitargeek guitargeek force-pushed the TFormula_HandleLinear_1 branch from 701948a to d033187 Compare August 5, 2021 01:42
Simple C++ code to demonstrate the problem:
```C++
TF1{"f1", "1.0++x++x*x++x*x*x", -1.0, 1.0}.Print();
```
Before this commit, running this line gave you:
```
Formula based function:     f1
        f1 : 1.0++x++x*x++x*x*x Ndim= 1, Npar= 3, Number= 0
 Formula expression:
        ([p0]*(1.0))+([p1]*(x))+([p2]*(x*x))+([p2]*(x*x))*x
```

The problem is that `TFormula::HandleLinear()` uses pattern-matching to
replace the strings representing the linear terms. This is problematic
if one of the strings is a substring of another one.
@guitargeek guitargeek force-pushed the TFormula_HandleLinear_1 branch from d033187 to 9bec4b5 Compare August 5, 2021 07:31
Copy link
Copy Markdown
Member

@lmoneta lmoneta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix Jonas !
Looks good, I have only a comment, it would be nice to have a new test showing the previous failure in test/TFormulaParsingTests.h

The formula is such that earlier terms are substrings of later terms,
triggering a bug that has recently been fixed.
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
@root-project root-project deleted a comment from phsft-bot Aug 5, 2021
Copy link
Copy Markdown
Member

@Axel-Naumann Axel-Naumann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but see commented. Thanks for fixing this "properly"!


namespace {

// Borrowed from RooHelpers.h, should this go to a central place?
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this should totally go into core/foundation/inc/ROOT/. Should this return vector<string> or vector<string_view>? The latter is dangerous if people do tokenize("abc cde",...) but I find that unlikely to happen.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay! I don't have time to do this now, but I'll open a GitHub issue about it. Maybe we should have two versions: one returning vector<string> and one returning vector<string_view>.

namespace {

// Borrowed from RooHelpers.h, should this go to a central place?
std::vector<std::string> tokenise(const std::string &str, const std::string &delims, bool returnEmptyToken) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally use US spelling, e.g. TString::Tokenize

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this should be changed then when the function is moved to core/foundation.

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-08-05T13:51:19.823Z] stderr: error: could not read '.git/rebase-apply/head-name': No such file or directory

Failing tests:

@guitargeek guitargeek merged commit 350f536 into root-project:master Aug 5, 2021
@guitargeek guitargeek deleted the TFormula_HandleLinear_1 branch August 5, 2021 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants