[MOD-9560] Revert marking index as faulty after reaching OOM#6140

Merged: lerman25 merged 66 commits into master from omerL-revert-oom-behavior on May 22, 2025

@lerman25 (Collaborator) commented May 15, 2025

This PR changes the following:

  • When the background scan for an index fails due to OOM, queries on that index no longer return an error. They return a partial result, based on what was indexed before the scan stopped.
  • A RESP3 warning is returned with each such query.

Continuing the work of #6114 , #6053 , #5778 .
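
The behavior change described above can be sketched in C as follows. This is a simplified model under stated assumptions, not the module's code: `IndexSpecSketch`, `QueryReplySketch`, `run_query_before`, and `run_query_after` are hypothetical names, and only the `scan_failed_OOM` flag and the warning text are borrowed from the diff snippets in this PR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the actual RediSearch types. */
typedef struct {
  bool scan_failed_OOM; /* set when the background scan stopped due to OOM */
  size_t docs_indexed;  /* documents indexed before the scan stopped */
} IndexSpecSketch;

typedef struct {
  bool is_error;        /* true when the query is rejected outright */
  size_t num_results;   /* results served from whatever was indexed */
  const char *warning;  /* NULL when there is nothing to report */
} QueryReplySketch;

/* Warning text taken from the diff under review; the final wording may differ. */
static const char *const OOM_WARNING_SKETCH =
    "Index background scan failed due to OOM";

/* Before this PR (as described): a query on an incomplete index is an error. */
static QueryReplySketch run_query_before(const IndexSpecSketch *spec) {
  QueryReplySketch reply = {false, spec->docs_indexed, NULL};
  if (spec->scan_failed_OOM) {
    reply.is_error = true;
    reply.num_results = 0;
  }
  return reply;
}

/* After this PR (as described): partial results plus a warning. */
static QueryReplySketch run_query_after(const IndexSpecSketch *spec) {
  QueryReplySketch reply = {false, spec->docs_indexed, NULL};
  if (spec->scan_failed_OOM) {
    reply.warning = OOM_WARNING_SKETCH;
  }
  return reply;
}
```

In this model a client can keep using the partial result while checking the `warning` field to learn that the index is incomplete.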

@github-actions github-actions bot added size:M and removed size:L labels May 20, 2025
@lerman25 lerman25 marked this pull request as ready for review May 20, 2025 09:03
@lerman25 lerman25 marked this pull request as draft May 20, 2025 09:42
@lerman25 lerman25 marked this pull request as ready for review May 20, 2025 09:42
@alonre24 (Collaborator) left a comment

👍

  X(QUERY_EALIASCONFLICT, "Alias conflicts with an existing index name") \
- X(QUERY_INDEXBGOOMFAIL, "Index background scan failed due to OOM. Queries cannot be executed on\
- an incomplete index.") \
+ X(QUERY_INDEXBGOOMFAIL, "Index background scan failed due to OOM") \
Review comment (Collaborator):

Suggested change
- X(QUERY_INDEXBGOOMFAIL, "Index background scan failed due to OOM") \
+ X(QUERY_INDEXBGOOMFAIL, "Index background scan did not complete due to OOM") \
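
For readers unfamiliar with the `X(...)` lines in this diff, they follow the X-macro pattern: one list of (code, message) pairs is expanded several times to generate an enum and a matching string lookup. A minimal sketch, assuming a hypothetical two-entry list named `QUERY_ERROR_CODES_SKETCH` and a hypothetical lookup function `query_error_str`; the real list is much longer and its surrounding macro names are not shown in this conversation.

```c
#include <stddef.h>

/* Hypothetical two-entry table modelled on the lines in the diff. */
#define QUERY_ERROR_CODES_SKETCH(X)                                      \
  X(QUERY_EALIASCONFLICT, "Alias conflicts with an existing index name") \
  X(QUERY_INDEXBGOOMFAIL, "Index background scan failed due to OOM")

/* First expansion: each entry becomes an enum constant. */
typedef enum {
  QUERY_OK = 0,
#define X(code, msg) code,
  QUERY_ERROR_CODES_SKETCH(X)
#undef X
} QueryErrorCodeSketch;

/* Second expansion: each entry becomes a switch case returning its message. */
static const char *query_error_str(QueryErrorCodeSketch code) {
  switch (code) {
#define X(c, msg) \
  case c:         \
    return msg;
    QUERY_ERROR_CODES_SKETCH(X)
#undef X
    default:
      return "Unknown status code";
  }
}
```

Because the message lives in exactly one place, rewording it (as the suggested change proposes) is a one-line edit that cannot drift out of sync with the enum.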

@@ -259,56 +265,6 @@ def test_change_config_during_bg_indexing(env):
memory_ratio = get_memory_consumption_ratio(env)
env.assertAlmostEqual(memory_ratio, 0.85, delta=0.1)
Review comment (Collaborator):

Can we validate that the RESP3 warning also appears in the FT.PROFILE output under the "warnings" section (it should be available for both RESP2 and RESP3, AFAIR)?

Reply (Collaborator, Author):

Added the warning to RESP2 as well; validated in the verbosity test for both RESP3 and RESP2.

  RedisModule_Reply_LongLong(reply, 0);
  if (IsProfile(req)) {
-   req->profile(reply, req, has_timedout, req->qiter.err->reachedMaxPrefixExpansions);
+   req->profile(reply, req, has_timedout, req->qiter.err->reachedMaxPrefixExpansions, req->sctx->spec && req->sctx->spec->scan_failed_OOM);
Review comment (Collaborator):

Not actually related to this PR, but can we just create a ProfilePrinterCtx and pass it here by value? The list of arguments to this function is getting long...
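
One way the `ProfilePrinterCtx` idea could look is sketched below. This is an illustration only: the struct's fields, `SpecSketch`, `make_profile_ctx`, and `profile_warning_count` are all hypothetical names, chosen to mirror the three boolean arguments visible in the diff; the project's eventual design may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the index spec; only the flag from the diff. */
typedef struct {
  bool scan_failed_OOM;
} SpecSketch;

/* Hypothetical context struct bundling the flags the profile printer needs. */
typedef struct {
  bool timedout;
  bool reachedMaxPrefixExpansions;
  bool bgScanOOM;
} ProfilePrinterCtx;

/* Build the context once at the call site. The NULL check mirrors the
 * `spec && spec->scan_failed_OOM` guard in the diff. */
static ProfilePrinterCtx make_profile_ctx(bool timedout, bool maxPrefixExpansions,
                                          const SpecSketch *spec) {
  ProfilePrinterCtx ctx = {
      .timedout = timedout,
      .reachedMaxPrefixExpansions = maxPrefixExpansions,
      .bgScanOOM = spec != NULL && spec->scan_failed_OOM,
  };
  return ctx;
}

/* Example consumer taking the context by value: counts the warnings that a
 * profile printer would have to report. */
static int profile_warning_count(ProfilePrinterCtx ctx) {
  return (int)ctx.timedout + (int)ctx.reachedMaxPrefixExpansions + (int)ctx.bgScanOOM;
}
```

With such a struct, adding another warning flag later means adding a field rather than touching every caller's argument list.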

@lerman25 lerman25 enabled auto-merge May 22, 2025 10:36
@lerman25 lerman25 added this pull request to the merge queue May 22, 2025
Merged via the queue into master with commit 26d7d10 May 22, 2025
14 checks passed
@lerman25 lerman25 deleted the omerL-revert-oom-behavior branch May 22, 2025 11:56
JoanFM pushed a commit that referenced this pull request May 27, 2025
* Add config

* insert oom_scan field to scanner, remove check for oom in debug pause on oom

* fix comment

* create basis for function

* handle last scanned key

* change sleep function name

* Add pause before and after reset

* Add pause before and after statuses code

* Add pause before and after reset to bg scan

* wait if config >0

* adjust default sleep time

* fix last scanned key

* temp tests

* basic test structure

* styling

* Add scanner status strings

* Add first test

* Add update option for debug scanner and more tests

* remove scanner cancellation

* Alter and Drop tests

* spellcheck

* skip cluster in tests

* Add config test for new config

* shorten tests

* styling

* remove pause on OOM from drop test

* Add tests, style

* style

* style

* Remove unused pause after OOM reset

* debug commands tests and cluster tests

* Naming, styling, formatting

* rename, change structure for simplicity

* improve test robustness

* remove unused and move to better location

* remove OOM from ACL

* remove oom from aggregate_exec

* Change config help message

* remove from ft.search

* remove query error test

* Add warning to query_error

* Add warning to resp3

* Add resp3 test

* format

* revert pytest

* resp3 tests

* Ben comments round1

* Add 0 thresh test

* comment

* Add skip cluster

* change config and other Alon's comments

* small tests

* remove pause after
Remove duplicates in tests

* more test compression

* fix assert

* FT.PROFILE and OOM string

* Add resp2 warning

* Revert "Add resp2 warning"

This reverts commit 6b73b3c.

* Add resp2 warning

* format
lerman25 added a commit that referenced this pull request May 27, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jun 8, 2025
* [MOD-9372 , MOD-9733] Stop indexing OOM - Add wait before OOM (#6114)

* Add config

* insert oom_scan field to scanner, remove check for oom in debug pause on oom

* fix comment

* create basis for function

* handle last scanned key

* change sleep function name

* Add pause before and after reset

* Add pause before and after statuses code

* Add pause before and after reset to bg scan

* wait if config >0

* adjust default sleep time

* fix last scanned key

* temp tests

* basic test structure

* styling

* Add scanner status strings

* Add first test

* Add update option for debug scanner and more tests

* remove scanner cancellation

* Alter and Drop tests

* spellcheck

* skip cluster in tests

* Add config test for new config

* shorten tests

* styling

* remove pause on OOM from drop test

* Add tests, style

* style

* style

* Remove unused pause after OOM reset

* debug commands tests and cluster tests

* Naming, styling, formatting

* rename, change structure for simplicity

* improve test robustness

* remove unused and move to better location

* Ben comments round1

* Add 0 thresh test

* comment

* Add skip cluster

* change config and other Alon's comments

* small tests

* remove pause after
Remove duplicates in tests

* more test compression

* fix assert

* [MOD-9560] Revert marking index as faulty after reaching OOM (#6140)

* [MOD-9372] - Add GIL release in OOM wait (#6203)

release gil

* fix merge