Skip to content

Conversation

@Subhra264
Copy link
Contributor

@Subhra264 Subhra264 commented Oct 24, 2025

User description

In a multi-region super cluster setup, when a node starts up and caches
enrichment tables, it generally queries all regions for enrichment table
data. However, if user is sure that enrichment table is stored in one
primary region, then querying all regions is inefficient and might even
fail if other region queriers return error for some reason. Which can
eventually cause node start failure.

Solution

This commit adds a new apply_primary_region_if_specified boolean
parameter throughout the enrichment table data fetching flow to control
whether to use only the primary region when fetching data.

Key Changes

  1. New Parameter Added (enrichment_table.rs:40, mod.rs:190):

    • Added apply_primary_region_if_specified boolean parameter to:
      • get_enrichment_table_data()
      • get_enrichment_table()
      • get_enrichment_table_json()
  2. Primary Region Logic (enrichment_table.rs:52-68):

  • When apply_primary_region_if_specified is true and enterprise
    features are enabled:
    • Checks if super cluster is enabled
    • Reads the enrichment_table_get_region configuration
  • If a primary region is specified, uses only that region for the search
    request
    • Otherwise, uses empty regions array (queries all regions)
  1. Cache Initialization (schema.rs:588-591):
  • When caching enrichment tables during node startup
    (cache_enrichment_tables()), passes true for the parameter
  • This ensures node startup only queries the primary region for
    enrichment data
  1. Runtime Watch Operations (enrichment_table.rs:310-328):

    • Watch operations pass false for the parameter
    • This maintains existing behavior for runtime updates
  2. Code Cleanup:

    • Removed unused get() function (lines 105-146 deleted)
    • Simplified logging by removing redundant empty/non-empty checks
    • Added better debug logging showing number of records fetched

Impact

  • Performance: Node startup is faster as it only queries the
    designated primary region for enrichment table data
  • Correctness: Ensures enrichment tables are fetched from the
    authoritative source (primary region) during initialization
  • Flexibility: Runtime operations can still query all regions if
    needed

Configuration Required

This feature requires enterprise edition with super cluster enabled and
the enrichment_table_get_region configuration set to specify which
region should be considered the primary source for enrichment table
data.


PR Type

Enhancement


Description

  • Add primary-region-aware enrichment fetch

  • Thread control flag across fetch APIs

  • Use primary region on node startup

  • Preserve all-regions for watch updates


Diagram Walkthrough

flowchart LR
  A["schema::cache_enrichment_tables()"] -- "get_enrichment_table(..., true)" --> B["enrichment::get_enrichment_table_inner(..., true)"]
  B -- "get_enrichment_table_data(..., true)" --> C["db::enrichment_table::search_inner(regions=primary)"]
  D["db::watch()"] -- "get_enrichment_table(..., false)" --> B
Loading

File Walkthrough

Relevant files
Enhancement
enrichment_table.rs
Region-aware enrichment data retrieval and watch updates 

src/service/db/enrichment_table.rs

  • Add _apply_primary_region_if_specified to get_enrichment_table_data
  • Determine regions based on enterprise config and flag
  • Call search_inner with computed regions and enable consistency flag
  • Update watch flow to call enrichment getter with false
+35/-6   
schema.rs
Startup caching uses primary region when specified             

src/service/db/schema.rs

  • Cache startup uses get_enrichment_table(..., true)
  • Add comment clarifying primary-region-only behavior
+4/-1     
mod.rs
Plumb primary-region flag through enrichment APIs               

src/service/enrichment/mod.rs

  • Plumb apply_primary_region_if_specified through public APIs
  • Document flag semantics and usage caveats
  • Pass flag to DB layer when fetching remote data
+16/-3   

@github-actions
Copy link
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Behavior Change

The call to search_cluster::search_inner now passes regions derived from config and sets the fifth argument to true (previously false). Validate that this flag change does not alter pagination, consistency, or aggregation semantics unexpectedly and that empty regions still implies querying all regions.

                None,
                None,
            ),
        )
        .await;
}

let result =
    search_cluster::search_inner(request, search_query, regions, vec![], true, None).await;
API Change

get_enrichment_table and get_enrichment_table_inner gained a boolean parameter. Ensure all internal callers are updated accordingly and that default behavior for runtime paths passes false to preserve previous multi-region behavior.

/// Retrieve enrichment table data.
///
/// Be careful with `apply_primary_region_if_specified` boolean. If this value is true and the
/// primary region is specified, this will fetch enrichment table data only from the specified
/// primary region and will ignore the other regions. Ideally only for cache_enrichment_table
/// function used when starting a node, it should be used.
pub async fn get_enrichment_table(
    org_id: &str,
    table_name: &str,
    apply_primary_region_if_specified: bool,
) -> Result<Arc<Vec<vrl::value::Value>>, anyhow::Error> {
    let value_type =
        get_enrichment_table_inner(org_id, table_name, apply_primary_region_if_specified).await?;
    value_type.to_vrl()
}
Data Freshness

Startup caching now forces primary-region-only fetch. Confirm that environments without a configured primary region still fall back to querying all regions and that this does not miss data for tables intentionally replicated across regions.

// Only use the primary region if specified to fetch enrichment table data assuming only the
// primary region contains the data.
let data =
    super::super::enrichment::get_enrichment_table(&tbl.org_id, &tbl.stream_name, true)
        .await?;
let len = data.len();

@github-actions
Copy link
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Gate behavior by feature flag

Passing true for the new boolean alters behavior globally; ensure it is tied to
_apply_primary_region_if_specified. Otherwise, non-enterprise builds (where regions
is empty) may still enable alternate execution paths unexpectedly.

src/service/db/enrichment_table.rs [123]

+let enforce_region = cfg!(feature = "enterprise") && _apply_primary_region_if_specified;
 let result =
-    search_cluster::search_inner(request, search_query, regions, vec![], true, None).await;
+    search_cluster::search_inner(request, search_query, regions, vec![], enforce_region, None).await;
Suggestion importance[1-10]: 7

__

Why: Tying the boolean to both the enterprise feature and _apply_primary_region_if_specified better matches intent and avoids unintended behavior in non-enterprise builds where regions is empty. It's contextually accurate and improves correctness, though not critical.

Medium
Prevent unintended local fallback

Ensure the primary region is not redundantly re-fetched locally by disabling
fallback when a region is explicitly chosen. Returning an empty list here can cause
search_inner to silently fall back to local search. Instead, propagate an explicit
flag to prevent fallback when targeting a specific region.

src/service/db/enrichment_table.rs [66-80]

 let regions = {
     use o2_enterprise::enterprise::common::config::get_config as get_o2_config;
     let config = get_o2_config();
     let enrichment_table_region = config.super_cluster.enrichment_table_get_region.clone();
     if _apply_primary_region_if_specified
         && config.super_cluster.enabled
         && !enrichment_table_region.is_empty()
     {
         vec![enrichment_table_region]
     } else {
-        vec![]
+        // Use a sentinel value to indicate "no specific region", preserving prior behavior.
+        Vec::<String>::new()
     }
 };
Suggestion importance[1-10]: 3

__

Why: The suggestion is mostly a restatement; Vec::<String>::new() is equivalent to vec![] and does not address fallback semantics, which are controlled by the later boolean argument to search_inner. Impact is minor and the improved_code doesn't change behavior.

Low

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR optimizes enrichment table caching during node startup in multi-region super cluster setups by adding a boolean parameter to control whether to query only the primary region.

Key Changes

  • Added apply_primary_region_if_specified boolean parameter throughout the enrichment table data fetching flow
  • When true and a primary region is configured, the system queries only that region instead of all regions
  • Node startup (cache_enrichment_tables) uses true to optimize initialization
  • Runtime watch operations use false to maintain existing behavior

Impact

  • Performance: Faster node startup by avoiding unnecessary cross-region queries
  • Reliability: Prevents startup failures caused by errors from non-primary regions
  • Flexibility: Runtime operations can still query all regions when needed

Issues Found

  • Minor style issue: parameter named with underscore prefix but is actually used (line 52 in enrichment_table.rs)

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk after addressing the minor style issue
  • The implementation is straightforward and well-documented. The logic correctly threads the new parameter through the call chain and conditionally applies primary region filtering based on enterprise configuration. Only one minor style issue was found regarding underscore-prefixed parameter naming. The change addresses a real performance/reliability concern in multi-region deployments.
  • src/service/db/enrichment_table.rs needs a minor style fix to remove underscore prefix from the parameter name

Important Files Changed

File Analysis

Filename Score Overview
src/service/db/enrichment_table.rs 4/5 Added primary region parameter to control data fetching scope. Logic is sound but parameter naming has minor style issue.
src/service/db/schema.rs 5/5 Updated cache initialization to use primary region for enrichment tables with clear explanatory comment.
src/service/enrichment/mod.rs 5/5 Threaded new parameter through enrichment table retrieval functions with good documentation.

Sequence Diagram

sequenceDiagram
    participant Node as Node Startup
    participant Schema as schema.rs::cache_enrichment_tables()
    participant Enrich as enrichment/mod.rs
    participant DB as enrichment_table.rs
    participant Search as search_cluster::search_inner()
    participant Config as Enterprise Config
    
    Note over Node,Config: Node Startup Flow (apply_primary_region_if_specified=true)
    Node->>Schema: cache_enrichment_tables()
    Schema->>Enrich: get_enrichment_table(org_id, name, true)
    Enrich->>Enrich: get_enrichment_table_inner(org_id, name, true)
    Enrich->>DB: get_enrichment_table_data(org_id, name, true)
    DB->>Config: Check super_cluster.enabled & enrichment_table_get_region
    Config-->>DB: Returns primary region if specified
    alt Primary region specified
        DB->>Search: search_inner(..., regions=[primary_region], ...)
        Search-->>DB: Query only primary region
    else No primary region
        DB->>Search: search_inner(..., regions=[], ...)
        Search-->>DB: Query all regions
    end
    DB-->>Enrich: Return enrichment data
    Enrich-->>Schema: Return enrichment table
    Schema->>Schema: Cache enrichment table
    
    Note over Node,Config: Runtime Watch Flow (apply_primary_region_if_specified=false)
    DB->>DB: watch() detects change
    DB->>Enrich: get_enrichment_table(org_id, name, false)
    Enrich->>DB: get_enrichment_table_data(org_id, name, false)
    DB->>Search: search_inner(..., regions=[], ...)
    Search-->>DB: Query all regions
    DB-->>DB: Update cache
Loading

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@testdino-playwright-reporter
Copy link

⚠️ Test Run Unstable


Author: Subhra264 | Branch: sc/enrichment_region_main | Commit: 1c86a23

Testdino Test Results

Status Total Passed Failed Skipped Flaky Pass Rate Duration
All tests passed 365 345 0 19 1 95% 4m 39s

View Detailed Results

@testdino-playwright-reporter
Copy link

⚠️ Test Run Unstable


Author: Subhra264 | Branch: sc/enrichment_region_main | Commit: 1c86a23

Testdino Test Results

Status Total Passed Failed Skipped Flaky Pass Rate Duration
All tests passed 365 342 0 19 4 94% 7m 5s

View Detailed Results

…les on node start (#8816)

In a multi-region super cluster setup, when a node starts up and caches
enrichment tables, it generally queries all regions for enrichment table
data. However, if user is sure that enrichment table is stored in one
primary region, then querying all regions is inefficient and might even
fail if other region queriers return error for some reason. Which can
eventually cause node start failure.

This commit adds a new `apply_primary_region_if_specified` boolean
parameter throughout the enrichment table data fetching flow to control
whether to use only the primary region when fetching data.

1. **New Parameter Added** (`enrichment_table.rs:40`, `mod.rs:190`):
   - Added `apply_primary_region_if_specified` boolean parameter to:
     - `get_enrichment_table_data()`
     - `get_enrichment_table()`
     - `get_enrichment_table_json()`

2. **Primary Region Logic** (`enrichment_table.rs:52-68`):
- When `apply_primary_region_if_specified` is true and enterprise
features are enabled:
     - Checks if super cluster is enabled
     - Reads the `enrichment_table_get_region` configuration
- If a primary region is specified, uses only that region for the search
request
   - Otherwise, uses empty regions array (queries all regions)

3. **Cache Initialization** (`schema.rs:588-591`):
- When caching enrichment tables during node startup
(`cache_enrichment_tables()`), passes `true` for the parameter
- This ensures node startup only queries the primary region for
enrichment data

4. **Runtime Watch Operations** (`enrichment_table.rs:310-328`):
   - Watch operations pass `false` for the parameter
   - This maintains existing behavior for runtime updates

5. **Code Cleanup**:
   - Removed unused `get()` function (lines 105-146 deleted)
   - Simplified logging by removing redundant empty/non-empty checks
   - Added better debug logging showing number of records fetched

- **Performance**: Node startup is faster as it only queries the
designated primary region for enrichment table data
- **Correctness**: Ensures enrichment tables are fetched from the
authoritative source (primary region) during initialization
- **Flexibility**: Runtime operations can still query all regions if
needed

This feature requires enterprise edition with super cluster enabled and
the `enrichment_table_get_region` configuration set to specify which
region should be considered the primary source for enrichment table
data.
@Subhra264 Subhra264 force-pushed the sc/enrichment_region_main branch from 1c86a23 to 01219cd Compare October 24, 2025 05:29
@testdino-playwright-reporter
Copy link

⚠️ Test Run Unstable


Author: Subhra264 | Branch: sc/enrichment_region_main | Commit: 01219cd

Testdino Test Results

Status Total Passed Failed Skipped Flaky Pass Rate Duration
All tests passed 365 339 0 19 7 93% 4m 39s

View Detailed Results

@testdino-playwright-reporter
Copy link

⚠️ Test Run Unstable


Author: Subhra264 | Branch: sc/enrichment_region_main | Commit: 17f25d3

Testdino Test Results

Status Total Passed Failed Skipped Flaky Pass Rate Duration
All tests passed 365 343 0 19 3 94% 4m 40s

View Detailed Results

@testdino-playwright-reporter
Copy link

⚠️ Test Run Unstable


Author: Subhra264 | Branch: sc/enrichment_region_main | Commit: 1385c9a

Testdino Test Results

Status Total Passed Failed Skipped Flaky Pass Rate Duration
All tests passed 365 345 0 19 1 95% 4m 38s

View Detailed Results

@Subhra264 Subhra264 merged commit 5790288 into main Oct 24, 2025
33 checks passed
@Subhra264 Subhra264 deleted the sc/enrichment_region_main branch October 24, 2025 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants