[2.8] [MOD-12070] Extend indexing metrics (#7669) #7778
redisearch-backport-pull-request[bot] merged 1 commit into 2.8
* [MOD-12070] Extend indexing metrics (#7669)
* Total num indexed metric
* total docs indexed by field type
* test total indexed metric
* test field metric
* format tests
* remove empty line
* remove unnecessary wait
* meirav comments
* more comments + change metric name
* expose and test geometry
* Unify metric
* test multi json (cherry picked from commit b346977)
* remove no json
* remove total indexing time (cherry picked from commit 8ea201f)
    case INDEXFLD_T_GEOMETRY:
      RSGlobalStats.fieldsStats.geometryTotalDocsIndexed += toAdd;
      break;
  }
Bug: Stats not updated for dynamic multi-type fields
The switch statement in FieldsGlobalStats_UpdateFieldDocsIndexed compares fs->types against exact single-type values, but fs->types is a bitmask that can contain multiple flags for dynamic fields (as noted in code comments: "Only dynamic fields may be indexed as multiple index types"). When a dynamic field has combined types like INDEXFLD_T_TAG | INDEXFLD_T_NUMERIC, the switch won't match any case, so no statistics are recorded for such fields.
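The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: the flag values, the `DemoFieldStats` struct, and both `demo_update_*` functions are hypothetical stand-ins, and only the `INDEXFLD_T_*` names are taken from the diff. It contrasts an exact-match `switch` with per-bit tests on the same combined mask.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag values, for illustration only; only the names come from the PR. */
enum {
  INDEXFLD_T_FULLTEXT = 0x01,
  INDEXFLD_T_NUMERIC  = 0x02,
  INDEXFLD_T_GEO      = 0x04,
  INDEXFLD_T_TAG      = 0x08,
  INDEXFLD_T_VECTOR   = 0x10,
  INDEXFLD_T_GEOMETRY = 0x20
};

/* Hypothetical, simplified stand-in for the per-type counters. */
typedef struct {
  size_t text, numeric, geo, tag, vector, geometry;
} DemoFieldStats;

/* Exact-match variant: a combined mask such as TAG | NUMERIC equals none of
 * the single-flag case labels, so it falls through to default. */
void demo_update_switch(DemoFieldStats *s, unsigned types, size_t toAdd) {
  assert(s != NULL);
  switch (types) {
    case INDEXFLD_T_FULLTEXT: s->text += toAdd; break;
    case INDEXFLD_T_NUMERIC:  s->numeric += toAdd; break;
    case INDEXFLD_T_GEO:      s->geo += toAdd; break;
    case INDEXFLD_T_TAG:      s->tag += toAdd; break;
    case INDEXFLD_T_VECTOR:   s->vector += toAdd; break;
    case INDEXFLD_T_GEOMETRY: s->geometry += toAdd; break;
    default: break; /* multi-type masks are silently ignored here */
  }
}

/* Per-bit variant: every flag set in the mask bumps its own counter. */
void demo_update_bitmask(DemoFieldStats *s, unsigned types, size_t toAdd) {
  assert(s != NULL);
  if (types & INDEXFLD_T_FULLTEXT) s->text += toAdd;
  if (types & INDEXFLD_T_NUMERIC)  s->numeric += toAdd;
  if (types & INDEXFLD_T_GEO)      s->geo += toAdd;
  if (types & INDEXFLD_T_TAG)      s->tag += toAdd;
  if (types & INDEXFLD_T_VECTOR)   s->vector += toAdd;
  if (types & INDEXFLD_T_GEOMETRY) s->geometry += toAdd;
}
```

Calling `demo_update_switch` with `INDEXFLD_T_TAG | INDEXFLD_T_NUMERIC` leaves every counter unchanged, while `demo_update_bitmask` increments both the tag and numeric counters. Whether a multi-type field should count once per type is a design decision for the maintainers; the sketch only shows why the exact-match form records nothing.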
  }

  // Since we are here, the indexing was successful, update the global statistics.
  FieldsGlobalStats_UpdateFieldDocsIndexed(fs, 1);
Bug: Fulltext stats updated before actual indexing completes
The stats update for fulltext fields at line 519 occurs in the preprocessor phase, after tokenization but before the actual inverted index writing happens (in writeCurEntries/writeMergedEntries). In contrast, other field types update stats in IndexerBulkAdd after their indexer functions complete. If fulltext index writing fails (e.g., index dropped during processing), the stats would have already been incremented, leading to inflated metrics. The comment "indexing was successful" is misleading since only preprocessing has completed at this point.
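One way to avoid the inflated count is to defer the increment until the write step reports success. The sketch below is an assumption-laden illustration, not the RediSearch code path: `DemoStats`, `DemoWriteFn`, `demo_index_fulltext`, and the two stub writers are hypothetical names, and the real `writeCurEntries`/`writeMergedEntries` functions may not return a status in this form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counter holder; the real struct has many more fields. */
typedef struct {
  size_t textTotalDocsIndexed;
} DemoStats;

/* Hypothetical write step: returns false if the index write failed,
 * for example because the index was dropped during processing. */
typedef bool (*DemoWriteFn)(void *ctx);

/* Tokenization would happen first and touches no counters.
 * The counter is bumped only after the write step succeeds. */
bool demo_index_fulltext(DemoStats *stats, DemoWriteFn write_entries, void *ctx) {
  assert(stats != NULL && write_entries != NULL);
  if (!write_entries(ctx)) {
    return false; /* nothing was indexed, so the counter stays unchanged */
  }
  stats->textTotalDocsIndexed += 1;
  return true;
}

/* Stub writers used to exercise both outcomes. */
bool demo_write_ok(void *ctx) { (void)ctx; return true; }
bool demo_write_fail(void *ctx) { (void)ctx; return false; }
```

With `demo_write_fail` the counter stays at zero, and with `demo_write_ok` it becomes one, which matches the ordering the comment attributes to the other field types in `IndexerBulkAdd`.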
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## 2.8 #7778 +/- ##
==========================================
- Coverage 87.63% 87.62% -0.02%
==========================================
Files 203 203
Lines 35227 35266 +39
==========================================
+ Hits 30871 30901 +30
- Misses 4356 4365 +9
Description
Backport of #7766 to 2.8.
Note
Add per-field-type indexing counters and aggregated total docs across indexes, update them during indexing, and expose via INFO MODULES with tests.
* FieldsGlobalStats_UpdateFieldDocsIndexed in src/document.c (fulltextPreprocessor, IndexerBulkAdd).
* FieldsGlobalStats_UpdateFieldDocsIndexed in src/info/global_stats.c and declare in src/info/global_stats.h.
* FieldsGlobalStats with per-type totals: text/tag/numeric/geo/geoshape/vectorTotalDocsIndexed and vector variant fields.
* search_total_indexing_ops_*_fields outputs in AddToInfo_Fields (src/info/info_redis.c).
* search_total_num_docs_in_indexes via TotalIndexesInfo.total_num_docs_in_indexes (collected in src/info/indexes_info.c, surfaced in AddToInfo_Indexes).
* total_num_docs_in_indexes and per-field total_indexing_ops_*_fields, including multi-field docs and multi-value JSON (tests/pytests/test_info_modules.py).
Written by Cursor Bugbot for commit 59df50b.