Fix TestBooleanMinShouldMatch#testRandomQueries failure. #14715

jpountz · 2025-05-26T12:32:04Z

This test generates random boolean queries and ensures that setting a minimum number of matching SHOULD clauses returns a subset of the hits with the same scores.

It already tries to work around accuracy loss due to arithmetic operations by allowing a delta of up to one ulp between these two queries. However, sometimes the delta can be higher.
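To make the "one ulp" tolerance concrete, here is a minimal sketch of a ulp-based comparison. The class `UlpTolerance` and the method `withinUlps` are hypothetical names for illustration only; they are not the helper used by the actual test, and the tolerance the test ends up with is not stated here.

```java
final class UlpTolerance {

  /**
   * Returns true if {@code actual} differs from {@code expected} by at most
   * {@code maxUlps} float ulps, measured at the magnitude of {@code expected}.
   * Hypothetical helper, not part of Lucene.
   */
  static boolean withinUlps(float expected, float actual, int maxUlps) {
    // Math.ulp(float) is the distance between `expected` and the next larger float.
    float tolerance = maxUlps * Math.ulp(expected);
    return Math.abs(expected - actual) <= tolerance;
  }

  public static void main(String[] args) {
    float expected = 1f;
    float oneUlpAway = Math.nextUp(expected);    // 1 ulp above expected
    float twoUlpsAway = Math.nextUp(oneUlpAway); // 2 ulps above expected

    System.out.println(withinUlps(expected, oneUlpAway, 1));  // accepted with a 1-ulp delta
    System.out.println(withinUlps(expected, twoUlpsAway, 1)); // rejected with a 1-ulp delta
    System.out.println(withinUlps(expected, twoUlpsAway, 2)); // accepted once the delta is widened
  }
}
```

A check like this fails as soon as two mathematically equivalent score computations drift more than the allowed number of ulps apart, which is the failure mode described here.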

For instance, consider the following query, which triggered the most recent test failure: `(data:5 data:5 data:5 data:6 +data:6 data:Z data:X -data:1)~2`. Without a minimum number of matching SHOULD clauses, it gets rewritten to `(data:5^3 +data:6^2 data:Z data:X -data:1)`. So the score contribution of `data:5` is computed as `(double) score(data:5) + (double) score(data:5) + (double) score(data:5)` in one case, and as `(double) (score(data:5) * 3f)` (multiply first, then cast to a double) in the other case. The use of `ReqOptSumScorer` also contributes accuracy losses, as per the existing comment: for instance, `data:6` is part of both the required and the optional clauses in the first case, while it's only a required clause (with a 2x boost) in the other case. So accuracy loss accrues differently.
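The two evaluation orders for `data:5` can be reproduced in isolation. The sketch below is only an illustration: the class name, the method names, and the score value `0.1f` are assumptions, and it does not involve Lucene's scorers at all. It shows that summing three doubles keeps the exact value of three times the float score, whereas multiplying in float arithmetic rounds to the nearest float before the cast.

```java
final class ScoreAccumulationDemo {

  /** Three separate SHOULD clauses: cast each float score to double, then sum. */
  static double castThenSum(float score) {
    return (double) score + (double) score + (double) score;
  }

  /** One clause with a 3x boost: multiply in float arithmetic, then cast to double. */
  static double boostThenCast(float score) {
    return (double) (score * 3f);
  }

  public static void main(String[] args) {
    float score = 0.1f; // hypothetical per-clause score, chosen only for illustration

    double summed = castThenSum(score);
    double boosted = boostThenCast(score);

    System.out.println("cast, then sum:   " + summed);
    System.out.println("boost, then cast: " + boosted);
    System.out.println("equal:            " + (summed == boosted));
    System.out.println("difference:       " + Math.abs(summed - boosted));
    System.out.println("float ulp:        " + Math.ulp(score * 3f));
  }
}
```

For this particular input the two results differ by less than one float ulp. The point is only that equivalent queries can legitimately produce slightly different scores, and that such differences can add up across clauses and scorers.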

I don't think we should try too hard to avoid these accuracy losses, so I'm instead increasing the leniency of the test.

github-actions · 2025-05-26T12:32:58Z

This PR does not have an entry in `lucene/CHANGES.txt`. Consider adding one. If the PR doesn't need a changelog entry, then add the `skip-changelog-check` label to it and you will stop receiving this reminder on future updates to the PR.

jpountz added this to the 10.3.0 milestone May 26, 2025

jpountz added the type:test label May 26, 2025

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking May 26, 2025

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking May 26, 2025

github-actions bot added the module:core/search label May 26, 2025

jpountz mentioned this pull request May 26, 2025

Better vectorize score computations. #14704

Merged

gf2121 approved these changes May 26, 2025

uschindler approved these changes May 30, 2025

jpountz merged commit 50b4363 into apache:main May 31, 2025
7 checks passed

github-project-automation bot moved this from Open to Merged in OpenSearch Lucene & Core Performance Tracking May 31, 2025

jpountz deleted the fix_TestBooleanMinShouldMatch_failure branch May 31, 2025 06:35
