MOD-7795: Background indexing memory limitation & configuration #5778

Merged
lerman25 merged 90 commits into master from OmerL_AddMemLimitToConfig
Apr 9, 2025

@lerman25
Collaborator

This PR adds a memory consumption check during background indexing.

During the background key scan, the current memory consumption is compared to the maxmemory setting.
If memory consumption crosses the limit threshold (a percentage of maxmemory), the background scan is stopped.

The threshold is defined by a new config parameter, _INDEX_MEM_LIMIT (#4766).
The parameter has a default value of 80 (%) and can range from 0 to 100, with 0 denoting no memory threshold.
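
The threshold rule above can be sketched in a few lines. This is only an illustration, not the code from the PR (which lives in the C module): the function name and its arguments are hypothetical, the comparison is assumed to be strictly greater-than, and a maxmemory of 0 (no memory cap in Redis) is assumed to skip the check.

```python
def exceeds_index_mem_limit(used_memory: int, maxmemory: int, limit_percent: int) -> bool:
    """Return True when the background scan should stop under the assumed rule."""
    if not 0 <= limit_percent <= 100:
        raise ValueError("limit_percent must be between 0 and 100")
    # A limit of 0 disables the threshold (stated in the PR description).
    if limit_percent == 0:
        return False
    # Assumption: maxmemory == 0 means Redis has no memory cap,
    # so there is nothing to compare against.
    if maxmemory == 0:
        return False
    # Assumption: "crosses" is modelled as strictly greater than the threshold.
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return used_memory * 100 > maxmemory * limit_percent
```

With the default of 80, a scan on a server capped at 100 bytes would continue at 80 bytes used and stop at 81 under these assumptions.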

If the background scan for an index has failed, all queries (other than FT.INFO & FT.DROP) should return an error.
This is checked with the current VERIFY_ACL macro or a direct check of scan_failed_OOM, a new field on the spec object.
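
A rough model of that gating, for readers who want to see the behaviour in isolation. The real check is done in C through VERIFY_ACL or the scan_failed_OOM field; everything below (the `IndexSpec` class, `check_index_usable`, and the error text) is a hypothetical stand-in, and only the flag name and the two exempt commands come from the description.

```python
from dataclasses import dataclass

# Commands the PR description says keep working after a failed scan.
ALLOWED_AFTER_OOM = {"FT.INFO", "FT.DROP"}


@dataclass
class IndexSpec:
    name: str
    scan_failed_oom: bool = False  # mirrors the new spec field named in the PR


def check_index_usable(spec: IndexSpec, command: str) -> None:
    """Raise if the command must be rejected for this index."""
    if spec.scan_failed_oom and command.upper() not in ALLOWED_AFTER_OOM:
        # Hypothetical error text; the real message is not quoted on this page.
        raise RuntimeError(f"{spec.name}: background scan failed due to OOM")
```

Calling `check_index_usable(IndexSpec("idx", scan_failed_oom=True), "FT.SEARCH")` raises, while the same call with `"FT.INFO"` returns normally.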

For verbosity, an out-of-memory background scan error is reported in the following places:

  1. A new entry in FT.INFO Index Errors
  2. INFO metric
  3. Redis log

For testing, a new command is introduced for the BG_SCAN_CONTROLLER (#5672): SET_PAUSE_ON_OOM,
which pauses the scan if an OOM has occurred.
A new debug scanner status, PAUSED_ON_OOM, is also added.
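
The pause-on-OOM behaviour can be pictured with a toy scanner loop. This is a simplified sketch under stated assumptions, not the debug mechanism itself: `scan`, `is_oom`, and the `STOPPED_ON_OOM` and `DONE` statuses are hypothetical, and only the PAUSED_ON_OOM name is taken from the description.

```python
from enum import Enum


class ScanStatus(Enum):
    PAUSED_ON_OOM = "PAUSED_ON_OOM"    # status name from the PR description
    STOPPED_ON_OOM = "STOPPED_ON_OOM"  # hypothetical name for the non-paused case
    DONE = "DONE"                      # hypothetical name for a completed scan


def scan(keys, is_oom, pause_on_oom=False):
    """Walk the keys and return (status, number of keys indexed)."""
    indexed = 0
    for _key in keys:
        # The memory check runs before each key is indexed.
        if is_oom(indexed):
            status = ScanStatus.PAUSED_ON_OOM if pause_on_oom else ScanStatus.STOPPED_ON_OOM
            return status, indexed
        indexed += 1  # indexing the key would happen here
    return ScanStatus.DONE, indexed
```

A test can then set the pause flag, wait for the paused status, and inspect the index at the exact point where the limit was hit.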

@github-actions

github-actions bot commented Apr 9, 2025

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

alonre24
alonre24 previously approved these changes Apr 9, 2025
@lerman25 lerman25 added this pull request to the merge queue Apr 9, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 9, 2025
@lerman25 lerman25 added this pull request to the merge queue Apr 9, 2025
Merged via the queue into master with commit 8b63157 Apr 9, 2025
11 checks passed
@lerman25 lerman25 deleted the OmerL_AddMemLimitToConfig branch April 9, 2025 21:20
@redisearch-backport-pull-request
Contributor

Backport failed for 8.0, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.0
git worktree add -d .worktree/backport-5778-to-8.0 origin/8.0
cd .worktree/backport-5778-to-8.0
git switch --create backport-5778-to-8.0
git cherry-pick -x 8b631577704f76e17e6a26cd81adaaf48b71f936

@lerman25
Collaborator Author

/backport

@redisearch-backport-pull-request
Contributor

Backport failed for 8.0, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.0
git worktree add -d .worktree/backport-5778-to-8.0 origin/8.0
cd .worktree/backport-5778-to-8.0
git switch --create backport-5778-to-8.0
git cherry-pick -x 8b631577704f76e17e6a26cd81adaaf48b71f936

lerman25 added a commit that referenced this pull request Apr 10, 2025
* oom is 80% hard coded. basic shard log

* added error message in FT.INFO

* remove redundant additions

* treat maxmemry == 0 is invalid

* fixed PR comments

* fixed a typo

* indent

* draft

* Add warning to resp3

* Add background indexing failure to info metrics and ft.info

* deserilize error

* format

* correct config

* add config to config pytest

* add initial basic pytest

* Raise bg_index_error and temporal solution for % in log

* base verbosity test

* revert resp3 warning

* fix missing comma

* add pause on OOM mechanism

* Add test for pause on OOM mechanism

* New tests

* new test

* error on OOM index

* create new test file

* revert text_index_error to master

* Add OOM check for ALTER and test

* cleanup + format

* little cleanup

* more cleanup

* change % to percent for log display

* spellcheck fix

* typo fix

* fix pytests for index errors field

* spellcheck

* update index errors dict

* fix error change in config, fix pytest for index_errors

* revert test config

* update IndexError_Deserialize

* Leak possible fix

* Add terminate bg pool debug command

* expose reindex thread pool

* expose reindex thread pool

* less num_docs, add thread pool terminate

* fix pytest

* coord into account for ftinfo

* fix info for coord

* fix serialization process

* fix test

* fix flakeness

* Guy comments round 1

* new help for config

* format

* change config name

* remove Dvir's comment

* change error message + format

* format, guys comments, changing SET_BG_INDEX_RESUME to be without args, changing debug command syntax

* move scanner canceled into memory check blcok

* fix query error string

* change true/false to macro

* adhere to user data

* fix config text

* fix config

* fix config name, add comments on thread pool, change assert

* fix pytest

* Alon's comments round1

* remove SetIndexErrorMessage

* fail of ft.debug search/agg , add assertion instead of if

* change test

* Fail on hset after OOM, test for cluster

* fix assert, fix error message, change loose memory for tests,

* fix leakage

* fix "missing" error message

* support python < 3.9

* fix delete during indexing test, add warning and comment to GIL release

* make test more robust

---------

Co-authored-by: DvirDukhan <[email protected]>
(cherry picked from commit 8b63157)
github-merge-queue bot pushed a commit that referenced this pull request Apr 10, 2025
MOD-7795: Background indexing memory limitation & configuration  (#5778)

(cherry picked from commit 8b63157)
JoanFM pushed a commit that referenced this pull request May 27, 2025
JoanFM pushed a commit that referenced this pull request May 27, 2025