perf: improve high frequency term search speed #7804
Greptile Summary
This PR implements significant performance improvements for high-frequency term search operations in Tantivy, along with code cleanup and removal of deprecated functionality.
Core Performance Improvements:
The main enhancement is in src/service/search/grpc/storage.rs where the filter_file_list_by_tantivy_index function has been completely restructured. The previous implementation used try_join_all which waited for all parallel tasks to complete before processing results. The new approach uses streaming with buffer_unordered to process results as they become available, enabling early termination when search operations become inefficient.
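The difference between waiting for every task and consuming results in completion order can be illustrated with a small sketch. This is not the PR's code: it uses only the standard library (threads and a channel) as a stand-in for `buffer_unordered`, and the function name `consume_as_completed`, the per-file delays, and the `stop_after` cutoff are all hypothetical.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs one simulated "index search" per file on its own thread and consumes
/// the results in completion order, stopping once `stop_after` results have
/// been seen. Returns the file ids in the order their results arrived.
/// (Hypothetical helper; a std-only analogue of streaming consumption.)
fn consume_as_completed(delays_ms: &[u64], stop_after: usize) -> Vec<usize> {
    let (tx, rx) = mpsc::channel();
    for (file_id, &delay) in delays_ms.iter().enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            // Stand-in for searching one index file.
            thread::sleep(Duration::from_millis(delay));
            // After an early stop the receiver is gone; ignore the send error.
            let _ = tx.send(file_id);
        });
    }
    // Drop the original sender so the channel closes when all workers finish.
    drop(tx);

    let mut seen = Vec::new();
    for file_id in rx {
        seen.push(file_id);
        if seen.len() >= stop_after {
            // Early termination: the remaining results are never waited for.
            break;
        }
    }
    seen
}
```

Calling `consume_as_completed(&[300, 10, 150], 2)` returns as soon as the two fastest simulated files have reported, whereas a join-all approach would block until the 300 ms task finished.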
Key technical changes include:
- Adaptive threshold mechanism: The system now monitors the number of row IDs returned and can terminate searches early if too many files return excessive results, preventing system overload
- Streaming task processing: Results are processed incrementally rather than waiting for all tasks to complete
- Improved file grouping: The logic has been restructured from group-first to file-first iteration for better parallelization
- Enhanced concurrency control: Semaphore acquisition has been moved inside tasks for better resource management
- New utility functions: regroup_tantivy_files and into_chunks have been added to support the new execution model
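The adaptive threshold can be sketched as a small counter. The PR description states the rule as: skip the tantivy search when the ratio row_ids / total_ids exceeds ZO_INVERTED_INDEX_SKIP_THRESHOLD for cpu_num files. The sketch below is only an illustration of that rule under assumptions: the type name `SkipTracker` is hypothetical, the threshold is taken as a fraction in [0, 1] (the unit of the real setting is not stated here), and files are counted cumulatively rather than consecutively.

```rust
/// Counts how many files returned a "too large" share of their row ids and
/// reports when the index search should be abandoned.
/// (Hypothetical type; not taken from the PR.)
struct SkipTracker {
    /// Assumed to be a fraction in [0, 1].
    threshold: f64,
    /// Number of over-threshold files that triggers the skip, e.g. cpu_num.
    max_files: usize,
    /// Over-threshold files seen so far.
    over: usize,
}

impl SkipTracker {
    fn new(threshold: f64, max_files: usize) -> Self {
        SkipTracker { threshold, max_files, over: 0 }
    }

    /// Records one file's result and returns true once the search should stop.
    fn record(&mut self, row_ids: u64, total_ids: u64) -> bool {
        // Guard against empty files to avoid dividing by zero.
        if total_ids > 0 && (row_ids as f64 / total_ids as f64) > self.threshold {
            self.over += 1;
        }
        self.over >= self.max_files
    }
}
```

With `SkipTracker::new(0.4, 2)`, a file matching 10% of its rows does not count, while files matching 48% and 90% do, so the third call to `record` returns true.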
Code Cleanup:
The PR removes deprecated functionality including the full_text_search_type configuration field that was scheduled for removal in version 0.15.0. This cleanup extends to the search pipeline where deprecated prefix search handling has been removed from the flight service. Additionally, minor formatting improvements have been made to test code and function signatures for better readability.
These changes integrate with OpenObserve's existing search infrastructure while maintaining backward compatibility for the search API. The performance improvements specifically target high-frequency term scenarios where the previous implementation could become a bottleneck.
Confidence score: 4/5
- This PR appears safe to merge with significant performance benefits and proper cleanup of deprecated code
- The confidence score reflects the complexity of the Tantivy search changes which, while well-structured, involve substantial modifications to critical search functionality
- The src/service/search/grpc/storage.rs file needs careful attention due to the significant algorithmic changes in the search processing logic
5 files reviewed, no comments
- [x] remove ZO_FEATURE_QUERY_NOT_FILTER_WITH_INDEX env
- [x] due to this PR #7804, this part of the code is no longer needed
tantivy mode
In OpenObserve, Tantivy is used in three main ways:
- No row_ids returned (IndexOptimizeMode except SimpleSelect)
- Row_ids returned with SimpleSelect
- Row_ids returned without SimpleSelect
This PR optimizes the third scenario. Specifically, if the ratio (row_ids / total_ids) exceeds ZO_INVERTED_INDEX_SKIP_THRESHOLD for cpu_num files, the tantivy search is skipped to avoid performance degradation.
test
- data size: 300GB, compression size: 2.1GB, index size: 7.7GB
- full text field: log, message
- secondary index field: k8s_namespace_name, k8s_pod_name, k8s_container_name, code
- the filter str_match(kubernetes_namespace_name, 'ziox') can filter out 48% of row_ids
Results:
- main branch took: 1.6s
- this branch took: 850ms
- directly using datafusion took: 700ms