Skip to content

Conversation

@hengfeiyang
Copy link
Contributor

@hengfeiyang hengfeiyang commented Dec 5, 2025

  • fix: result cache metadata for histogram
  • fix: partition generate for histogram
  • fix: cache filter for histogram
  • feat: no longer delete agg cache file, the solution is only use agg cache file which interval not greater than search interval

@github-actions github-actions bot added the ☢️ Bug Something isn't working label Dec 5, 2025
@hengfeiyang hengfeiyang added the Needs-Testing Needs-Testing label Dec 5, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Dec 5, 2025

Failed to generate code suggestions for PR

@hengfeiyang hengfeiyang merged commit 43dc49f into branch-v0.20.0 Dec 5, 2025
13 of 16 checks passed
@hengfeiyang hengfeiyang deleted the hotfix/result-cache branch December 5, 2025 01:04
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 5, 2025

Greptile Overview

Greptile Summary

This PR fixes result cache handling for histogram queries by addressing several issues:

  • Histogram interval awareness in cache deltas: The calculate_deltas function now accounts for histogram bucket widths when computing cache misses, avoiding duplicate data at interval boundaries
  • Improved cache hit filtering: Cached hits are now filtered using hit_ts + histogram_interval to ensure histogram buckets that span the query boundary are properly excluded
  • Cache time range alignment: Changed from using data-derived timestamps (final_start_time/final_end_time) to query-aligned timestamps (accept_start_time/accept_end_time) for cache storage, with histogram interval adjustment at end time
  • Partition alignment relaxation: Reduced the threshold for histogram alignment from min_step * 3 to min_step, allowing more queries to benefit from aligned partitions
  • Removed enterprise-only mismatched interval cleanup: The delete_mismatched_interval_cache_files code was removed from the streaming execution path
  • Enhanced logging: Added elapsed time and partition details to various log messages for better debugging

Confidence Score: 4/5

  • This PR is safe to merge - the changes are targeted fixes to histogram caching logic with clear intent and no breaking API changes.
  • The changes are well-focused on fixing histogram cache handling. The logic changes are straightforward and improve correctness. Minor risk around the partition threshold change and removed enterprise cleanup code, but these appear intentional for the fix.
  • src/service/search/cache/mod.rs - Core cache write logic changes. src/service/search/streaming/execution.rs - Removed enterprise cache cleanup code.

Important Files Changed

File Analysis

Filename Score Overview
src/service/search/cache/cacher.rs 4/5 Fixed histogram interval handling: added interval param to calculate_deltas, improved filtering to account for histogram bucket widths, and only pushes cached responses when there's actual data.
src/service/search/cache/mod.rs 4/5 Refactored cache time range handling to use accept_start_time/accept_end_time instead of data-derived times. Added histogram end time adjustment and improved logging.
src/service/search/partition.rs 4/5 Relaxed alignment threshold from min_step * 3 to min_step for partition boundaries. Moved mini_partition_size calculation inside conditional to avoid unnecessary computation.
src/service/search/streaming/execution.rs 4/5 Removed mismatched-interval cache file deletion code for clear_cache. Enhanced debug logging with partition details.

Sequence Diagram

sequenceDiagram
    participant Client
    participant SearchService
    participant CacheChecker
    participant CacheStorage
    participant PartitionGen
    
    Client->>SearchService: search request (histogram query)
    SearchService->>CacheChecker: check_cache()
    CacheChecker->>CacheStorage: get_cached_results()
    CacheStorage-->>CacheChecker: cached response
    
    Note over CacheChecker: Filter hits with histogram_interval<br/>hit_ts + interval < end_time
    
    CacheChecker->>CacheChecker: calculate_deltas(histogram_interval)
    
    Note over CacheChecker: For histogram queries:<br/>delta_start_time = result.end_time<br/>(no +1 offset)
    
    CacheChecker-->>SearchService: deltas + cached data
    
    alt Has deltas to fetch
        SearchService->>PartitionGen: generate_partitions()
        Note over PartitionGen: Alignment threshold:<br/>duration > min_step<br/>(was min_step * 3)
        PartitionGen-->>SearchService: partitions
        SearchService->>SearchService: execute delta queries
    end
    
    SearchService->>CacheStorage: write_results()
    Note over CacheStorage: Use accept_start/end_time<br/>Adjust end_time by histogram interval
    
    SearchService-->>Client: merged response
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@hengfeiyang hengfeiyang changed the title fix: result cache for histogram (#9483) fix: result cache for histogram Dec 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

☢️ Bug Something isn't working Needs-Testing Needs-Testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants