Skip to content

[MOD-8034] expose indexing failures to INFO modules#5139

Merged
meiravgri merged 10 commits intomasterfrom
meiravg_mod7955_expose_index_errors
Nov 4, 2024
Merged

[MOD-8034] expose indexing failures to INFO modules#5139
meiravgri merged 10 commits intomasterfrom
meiravg_mod7955_expose_index_errors

Conversation

@meiravgri
Copy link
Collaborator

@meiravgri meiravgri commented Oct 30, 2024

This PR exposes indexing errors counter to INFO MODULES

search_errors_* section

added fields:
errors_indexing_failures - reflects the total number of indexing errors in all living indices. One document may cause failure in multiple indices.
errors_indexing_failures_max

searc_field_FieldType

  • Introduce a global counter for indexing failures for each field type.
  • introduce FieldSpec_AddError that adds an error to a specific field object, and also updates the global per-field-type error counter.

The global counter is updated at runtime, once the error occurred. (as opposed to saving a counter per index and aggregating it per INFO call)
The counter should be decreased whenever an index is dropped.

We expose the global counter value in INFO modules, along with each field type stats.

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

move  dialects stats to global_stats
The counter is update at runtime, once the error occured.
It reflects the total number of indexing errors in all living indices.
The counter should be decreased whenever an index is dropped.

introduce FieldSpec_AddError that adds an error to a specific field.
it also updates the global field error counter.

We expose the global counter value in INFO modules, along with each field type stats.

the name of the fields count for each type was changed from "Field_type_name" to "Total"
@codecov
Copy link

codecov bot commented Oct 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.35%. Comparing base (97f34e2) to head (af978af).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5139   +/-   ##
=======================================
  Coverage   86.35%   86.35%           
=======================================
  Files         190      190           
  Lines       34642    34668   +26     
=======================================
+ Hits        29915    29938   +23     
- Misses       4727     4730    +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

wait for bg index after FT.ALTER to avoid flakyness
@meiravgri meiravgri changed the title expose indexing failures to INFO modules [MOD-8034] expose indexing failures to INFO modules Nov 3, 2024
GuyAv46
GuyAv46 previously approved these changes Nov 3, 2024
Copy link
Collaborator

@alonre24 alonre24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Few comments

FieldsGlobalStats_UpdateStats(spec->fields + i, -1);
FieldSpec *field = spec->fields + i;
FieldsGlobalStats_UpdateStats(field, -1);
FieldsGlobalStats_UpdateIndexError(field->types, -FieldSpec_GetIndexErrorCount(field));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that actually? Total number of errors did not decrease

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind. I'm convinced - let's go with this approach :)

Comment on lines 958 to 960
if (info.indexing_errors.max_indexing_failures < index_error_count) {
info.indexing_errors.max_indexing_failures = index_error_count;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we initialize max_indexing_failures before the loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialized to 0

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you're right because indexing_errors is a nested struct

Comment on lines +335 to +337
env.assertEqual(info['num_docs'], 0)
env.assertEqual(info['field statistics'][0], expected_failed_field_stats)
env.assertEqual(info['field statistics'][1], expected_no_error_field_stats)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why that is correct, but looking at it now, it might be misleading... Perhaps it would be better if we at least increase the indexing failures on each field in the spec anyway once we encounter an error in one of the fields, so it won't look like a partial indexing was made. WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes the per-field indexing error counter redundant, as a we would like to identify field-types with higher errors amount

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see a reason why it may look like partial indexing since the error is counted in the index level as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. How about trying to index each field instead of failing fast and setting the error counter for the bad fields? It would increase latency but in error code path we should care less. How big of a change is that?

remove IndexingErrorsStats and put the fields in TotalSpecsInfo
alonre24
alonre24 previously approved these changes Nov 4, 2024
@meiravgri meiravgri enabled auto-merge November 4, 2024 15:52
@meiravgri meiravgri added this pull request to the merge queue Nov 4, 2024
Merged via the queue into master with commit 9ef9bd9 Nov 4, 2024
@meiravgri meiravgri deleted the meiravg_mod7955_expose_index_errors branch November 4, 2024 17:09
@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-5139-to-2.10 origin/2.10
cd .worktree/backport-5139-to-2.10
git switch --create backport-5139-to-2.10
git cherry-pick -x 9ef9bd96d0ec8c1cdeb5dc92f38da4d83e7ea9f7

redisearch-backport-pull-request bot pushed a commit that referenced this pull request Nov 4, 2024
* move fieldsStats from RSGlobalConfig to RSGlobalStats

move  dialects stats to global_stats

* Introduce a global counter for indexing failures for each field type.
The counter is update at runtime, once the error occured.
It reflects the total number of indexing errors in all living indices.
The counter should be decreased whenever an index is dropped.

introduce FieldSpec_AddError that adds an error to a specific field.
it also updates the global field error counter.

We expose the global counter value in INFO modules, along with each field type stats.

the name of the fields count for each type was changed from "Field_type_name" to "Total"

* fix test

* skip test_redis_info_errors in cluster

wait for bg index after FT.ALTER to avoid flakyness

* revert change in INFO MODULES field count name

* fix test format

* fix test flakyness

remove IndexingErrorsStats and put the fields in TotalSpecsInfo

(cherry picked from commit 9ef9bd9)
@redisearch-backport-pull-request
Copy link
Contributor

Successfully created backport PR for 8.0:

meiravgri added a commit that referenced this pull request Nov 5, 2024
* move fieldsStats from RSGlobalConfig to RSGlobalStats

move  dialects stats to global_stats

* Introduce a global counter for indexing failures for each field type.
The counter is update at runtime, once the error occured.
It reflects the total number of indexing errors in all living indices.
The counter should be decreased whenever an index is dropped.

introduce FieldSpec_AddError that adds an error to a specific field.
it also updates the global field error counter.

We expose the global counter value in INFO modules, along with each field type stats.

the name of the fields count for each type was changed from "Field_type_name" to "Total"

* fix test

* skip test_redis_info_errors in cluster

wait for bg index after FT.ALTER to avoid flakyness

* revert change in INFO MODULES field count name

* fix test format

* fix test flakyness

remove IndexingErrorsStats and put the fields in TotalSpecsInfo

(cherry picked from commit 9ef9bd9)
github-merge-queue bot pushed a commit that referenced this pull request Nov 5, 2024
[MOD-8034] expose indexing failures to INFO modules (#5139)

* move fieldsStats from RSGlobalConfig to RSGlobalStats

move  dialects stats to global_stats

* Introduce a global counter for indexing failures for each field type.
The counter is update at runtime, once the error occured.
It reflects the total number of indexing errors in all living indices.
The counter should be decreased whenever an index is dropped.

introduce FieldSpec_AddError that adds an error to a specific field.
it also updates the global field error counter.

We expose the global counter value in INFO modules, along with each field type stats.

the name of the fields count for each type was changed from "Field_type_name" to "Total"

* fix test

* skip test_redis_info_errors in cluster

wait for bg index after FT.ALTER to avoid flakyness

* revert change in INFO MODULES field count name

* fix test format

* fix test flakyness

remove IndexingErrorsStats and put the fields in TotalSpecsInfo

(cherry picked from commit 9ef9bd9)

Co-authored-by: meiravgri <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Nov 5, 2024
* [MOD-8034] expose indexing failures to INFO modules (#5139)

* move fieldsStats from RSGlobalConfig to RSGlobalStats

move  dialects stats to global_stats

* Introduce a global counter for indexing failures for each field type.
The counter is update at runtime, once the error occured.
It reflects the total number of indexing errors in all living indices.
The counter should be decreased whenever an index is dropped.

introduce FieldSpec_AddError that adds an error to a specific field.
it also updates the global field error counter.

We expose the global counter value in INFO modules, along with each field type stats.

the name of the fields count for each type was changed from "Field_type_name" to "Total"

* fix test

* skip test_redis_info_errors in cluster

wait for bg index after FT.ALTER to avoid flakyness

* revert change in INFO MODULES field count name

* fix test format

* fix test flakyness

remove IndexingErrorsStats and put the fields in TotalSpecsInfo

(cherry picked from commit 9ef9bd9)

* add geoshape to test_info_modules.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants