MOD-6786 Fix search on larger then 128 terms#5524

Merged: lerman25 merged 9 commits into master from OmerL_TermLimitFix on Jan 22, 2025

lerman25 (Collaborator) commented Jan 19, 2025

Fixes a bug where terms in text longer than 128 characters (MAX_NORMALIZE_SIZE) could not be found because of a normalization inconsistency.
During indexing, a term was normalized only up to character 128, whereas during search the entire query term is normalized.
The string-normalization length limit was moved inside the NO MODIFY flow, so the whole indexed term is now normalized.
Tests added.
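
To make the mismatch concrete, here is a minimal Python sketch of the behaviour described above. It is not the code in src/tokenize.c: normalization is modeled simply as lowercasing, the pre-fix index side is modeled as keeping only the first 128 characters, and all function names are hypothetical. The only point is that the index side and the query side must produce the same string for a lookup to succeed.

```python
# Illustrative model only; names and the lowercase-based "normalization"
# are assumptions, not the actual RediSearch implementation.
MAX_NORMALIZE_SIZE = 128  # limit named in the PR description


def normalize_query(term: str) -> str:
    """Query side: the entire term is normalized."""
    return term.lower()


def normalize_index_before(term: str) -> str:
    """Index side before the fix (assumed model): normalization stops at
    MAX_NORMALIZE_SIZE, so only the first 128 characters are stored."""
    return term[:MAX_NORMALIZE_SIZE].lower()


def normalize_index_after(term: str, no_modify: bool = False) -> str:
    """Index side after the fix (assumed model): the length limit applies
    only in the no-modify flow; otherwise the whole term is normalized."""
    if no_modify:
        return term[:MAX_NORMALIZE_SIZE].lower()
    return term.lower()


long_term = "A" * 200  # longer than MAX_NORMALIZE_SIZE

# Model the index as a set of stored (normalized) terms.
index_before = {normalize_index_before(long_term)}
index_after = {normalize_index_after(long_term)}

# The query is always fully normalized, so it only matches the new index.
found_before = normalize_query(long_term) in index_before
found_after = normalize_query(long_term) in index_after

print(found_before, found_after)
```

Terms of 128 characters or fewer behave identically in both models, which is consistent with the bug only affecting longer terms.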

codecov bot commented Jan 19, 2025

Codecov Report

Attention: Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 87.19%. Comparing base (fdd48ae) to head (f3fdb84).
Report is 253 commits behind head on master.

Files with missing lines    Patch %    Lines
src/tokenize.c              33.33%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5524      +/-   ##
==========================================
- Coverage   87.20%   87.19%   -0.01%     
==========================================
  Files         196      196              
  Lines       35226    35226              
==========================================
- Hits        30720    30717       -3     
- Misses       4506     4509       +3     

lerman25 requested review from alonre24 and nafraf January 20, 2025 11:35
alonre24 previously approved these changes Jan 20, 2025
nafraf previously approved these changes Jan 22, 2025

nafraf (Collaborator) left a comment:
OK. Approved. Maybe we'll need another PR to remove the dead code.

lerman25 dismissed stale reviews from nafraf and alonre24 via 6929c74 January 22, 2025 14:21
lerman25 requested review from alonre24 and nafraf January 22, 2025 14:24
nafraf previously approved these changes Jan 22, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jan 23, 2025
MOD-6786 Fix search on larger then 128 terms (#5524)

* Move length slicing to NOMODIFY if

* add py test

* fix slicing

* fix test

* fix text skip cluster

* Adding comments

* Update test_issues - skip cluster

(cherry picked from commit efda03d)

Co-authored-by: lerman25 <[email protected]>
dor-forer pushed a commit that referenced this pull request Jan 26, 2025
* Move length slicing to NOMODIFY if

* add py test

* fix slicing

* fix test

* fix text skip cluster

* Adding comments

* Update test_issues - skip cluster
github-merge-queue bot pushed a commit that referenced this pull request Jan 30, 2025
* Adding numeric check

* changes

* change to each one

* MOD-6786 Fix search on larger then 128 terms (#5524)

* Move length slicing to NOMODIFY if

* add py test

* fix slicing

* fix test

* fix text skip cluster

* Adding comments

* Update test_issues - skip cluster

* MOD-8561: Fix Inverted Index SeekTo Edge Case (#5528)

* * initial commit

* * simplify the fix

* * revert to old code to solve edge case

* Load config params for Redis 8.0-m03 (#5538)

* Load config params for Redis v7.9.226

* Add step to get latest unreleased redis tag

* Remove commented-out step `Get Latest Release Tag with Prefix`

* Revert: task-get-latest-tag.yml

* MOD-8601: Fix error message for LOAD (#5531)

* Enhance error message for LOAD

* Fix error message

* Address review

* Fix flakiness in a test (#5541)

* fix flakiness

* revert whitespace change

* Fix Max Frequency Misscalculation - [MOD-8158] (#5553)

* fix unrelated test

* add a failing test

* fix issue

* revert whitespace change from test_vecsim.py

* revert whitespace change in test_issues.py

* Fix APPLY/FILTER parser - [MOD-7804] (#5520)

* fix order of operations

* minor improvements to the lexer

* improve functions parsing

* optimize "NOT" logic and perform arithmetic operations immediately

* fix flow tests

* fix grammar optimization

* changed function API

* added a test

* minor fix

* fix precedence

* minor improvement

* reorder rule and fix leak

* another fix

* added test

* more tests for a better coverage

* improved test

* fix assertion

* review fixes

* address code review

* added comments

* remove unncessary

* Added tests for legacy filter empty

* Adding numeric check

* changes

* change to each one

* remove unncessary

* Added tests for legacy filter empty

* * Change the order of params
* Add support in GEOFILTER

* Forgot one file

* * Changed to AC_GetString with no advance
* Added comment
* change the string check

* PR changes

* Changes

* push the test

* change style

---------

Co-authored-by: lerman25 <[email protected]>
Co-authored-by: kei-nan <[email protected]>
Co-authored-by: nafraf <[email protected]>
Co-authored-by: Raz Monsonego <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
redisearch-backport-pull-request bot pushed a commit that referenced this pull request Jan 30, 2025
(cherry picked from commit e4d8fa0)
github-merge-queue bot pushed a commit that referenced this pull request Feb 4, 2025
* Fix Empty Numeric Value - [MOD-7244] (#5566)

(cherry picked from commit e4d8fa0)

* Change python to fit python3.7

---------

Co-authored-by: dor-forer <[email protected]>
nafraf (Collaborator) commented Jun 6, 2025

/backport

redisearch-backport-pull-request (Contributor) commented

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-5524-to-2.10 origin/2.10
cd .worktree/backport-5524-to-2.10
git switch --create backport-5524-to-2.10
git cherry-pick -x efda03de8780f3858810a3312673f84f671e4214

nafraf pushed a commit that referenced this pull request Jun 6, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jun 6, 2025
MOD-6786 Fix search on larger then 128 terms (#5524)
(cherry picked from commit efda03d)

Co-authored-by: lerman25 <[email protected]>
nafraf (Collaborator) commented Jun 18, 2025

/backport

redisearch-backport-pull-request (Contributor) commented

Backport failed for 2.8, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.8
git worktree add -d .worktree/backport-5524-to-2.8 origin/2.8
cd .worktree/backport-5524-to-2.8
git switch --create backport-5524-to-2.8
git cherry-pick -x efda03de8780f3858810a3312673f84f671e4214

redisearch-backport-pull-request (Contributor) commented

Backport failed for 2.6, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.6
git worktree add -d .worktree/backport-5524-to-2.6 origin/2.6
cd .worktree/backport-5524-to-2.6
git switch --create backport-5524-to-2.6
git cherry-pick -x efda03de8780f3858810a3312673f84f671e4214

nafraf pushed a commit that referenced this pull request Jun 18, 2025
nafraf pushed a commit that referenced this pull request Jun 18, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jun 19, 2025
* MOD-6786 Fix search on larger then 128 terms (#5524)

(cherry picked from commit efda03d)

* Update testLongTerms(env) because now the term is not truncated

(cherry picked from commit c5855de)

---------

Co-authored-by: lerman25 <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Jun 19, 2025
* MOD-6786 Fix search on larger then 128 terms (#5524)

(cherry picked from commit efda03d)

* Update testLongTerms(env) because now the term is not truncated

---------

Co-authored-by: lerman25 <[email protected]>