Conversation
@oasisk oasisk commented Dec 4, 2025

Design at:

streams-corelation/SERVICE_CORRELATION_ARCHITECTURE.md

github-actions bot commented Dec 4, 2025

Failed to generate code suggestions for PR

greptile-apps bot commented Dec 4, 2025

Greptile Overview

Greptile Summary

This PR implements a comprehensive service discovery feature that automatically discovers services from telemetry data and enables correlation across logs, traces, and metrics. The implementation includes database migrations, backend processing in both ingester and compactor modes, HTTP APIs for correlation and analytics, and a full frontend UI with Vue components.

Key Changes

  • Database schema: Two new tables (service_streams and service_streams_dimensions) for storing discovered services and tracking dimension cardinality
  • Service extraction: Automatic extraction during parquet processing with hash-based sampling, cardinality protection, and correlation key deduplication
  • HTTP APIs: /service_streams/_correlate for finding related telemetry and /_analytics for dimension insights
  • Semantic groups: Management endpoints for configurable field name mappings with import/export and diff preview
  • Frontend: New correlation panel component, composables for service correlation, and integration into log/trace/metric views
  • Metrics: Comprehensive Prometheus metrics for monitoring service discovery operations
  • Enterprise feature: Properly gated behind enterprise feature flag throughout
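
The hash-based sampling mentioned above can be sketched as follows. This is an illustrative sketch, not the PR's actual code: the function name, parameters, and the idea of hashing the stream identity are assumptions; the point is that a deterministic hash lets every ingester node make the same keep/skip decision with no shared state.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stateless sampling: hash the stream identity and process the file only
/// when the hash falls in the sampling window. Because the decision is a
/// pure function of the inputs, no coordination between nodes is needed.
/// `sample_rate` = keep roughly 1 out of every N inputs (illustrative).
fn should_process(org: &str, stream_type: &str, stream_name: &str, sample_rate: u64) -> bool {
    if sample_rate <= 1 {
        return true; // sampling disabled: process everything
    }
    let mut hasher = DefaultHasher::new();
    (org, stream_type, stream_name).hash(&mut hasher);
    hasher.finish() % sample_rate == 0
}

fn main() {
    // The same inputs always yield the same decision on every node.
    let a = should_process("default", "logs", "app1", 10);
    let b = should_process("default", "logs", "app1", 10);
    assert_eq!(a, b);
    // A rate of 1 means everything is processed.
    assert!(should_process("default", "logs", "app1", 1));
    println!("deterministic sampling ok");
}
```
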

Critical Issue Found

The database migration m20251126_100001_create_service_streams_table.rs is missing the correlation_key column that is defined in the Model struct and used extensively in queries. This will cause runtime failures when the feature is used.

Architecture Highlights

  • Dual-mode processing: Works in both ingester (real-time) and compactor (batch) modes
  • Cardinality protection: Automatically blocks high-cardinality dimensions to prevent unbounded database growth
  • Correlation key design: Uses hash of stable dimensions only to deduplicate services and maintain manageable DB size
  • Hash-based sampling: Stateless per-stream-type sampling avoids coordination overhead
  • Batch processing: Services are queued and written in batches to minimize database load
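
The correlation-key design above can be sketched as hashing only the stable dimensions. This is a minimal sketch under stated assumptions: the field names, the choice of stable-dimension list, and the use of Rust's `DefaultHasher` are illustrative, not the PR's actual implementation.

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Build a correlation key from the *stable* dimensions only (e.g.
/// service.name, deployment.environment), ignoring volatile ones such as
/// pod name or host IP. Iterating a fixed `stable` list makes the key
/// order-independent, so the same logical service always hashes to the
/// same key and repeated sightings deduplicate on upsert.
fn correlation_key(dims: &BTreeMap<String, String>, stable: &[&str]) -> String {
    let mut hasher = DefaultHasher::new();
    for name in stable {
        if let Some(value) = dims.get(*name) {
            name.hash(&mut hasher);
            value.hash(&mut hasher);
        }
    }
    format!("{:016x}", hasher.finish())
}

fn main() {
    let stable = ["service.name", "deployment.environment"];
    let mut a = BTreeMap::new();
    a.insert("service.name".to_string(), "cart".to_string());
    a.insert("deployment.environment".to_string(), "prod".to_string());
    a.insert("k8s.pod.name".to_string(), "cart-7f9c".to_string()); // volatile

    let mut b = a.clone();
    b.insert("k8s.pod.name".to_string(), "cart-1a2b".to_string()); // different pod

    // Volatile dimensions do not change the key, so both rows collapse
    // into one service record and the table stays small.
    assert_eq!(correlation_key(&a, &stable), correlation_key(&b, &stable));
    println!("key = {}", correlation_key(&a, &stable));
}
```
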

Confidence Score: 2/5

  • This PR cannot be safely merged due to a critical database schema mismatch that will cause runtime failures
  • Score reflects a blocking issue: the migration is missing the correlation_key column that the code depends on. Without this column, database queries will fail at runtime. The rest of the implementation is well-structured with proper enterprise gating, comprehensive metrics, good separation of concerns, and thoughtful architecture for cardinality management. Once the schema issue is fixed, this would be a solid 4/5.
  • src/infra/src/table/migration/m20251126_100001_create_service_streams_table.rs requires immediate fix to add missing correlation_key column

Important Files Changed

File Analysis

Filename | Score | Overview
src/infra/src/table/migration/m20251126_100001_create_service_streams_table.rs | 1/5 | Creates the service_streams table but is missing the critical correlation_key column required by the Model
src/infra/src/table/service_streams.rs | 3/5 | Service streams storage layer with correlation_key deduplication logic; well-structured, with tests
src/config/src/meta/service_streams.rs | 5/5 | Clean type definitions for service discovery with comprehensive tests and good documentation
src/job/files/parquet.rs | 4/5 | Adds service discovery extraction during parquet processing in ingester mode, with proper sampling
src/service/compact/merge.rs | 4/5 | Adds service discovery processing in compactor mode, with record batch conversion logic
src/handler/http/request/service_streams/mod.rs | 5/5 | Well-documented HTTP handlers for the correlate and analytics endpoints, with proper enterprise gating
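
The batched, correlation_key-deduplicated write path described for service_streams.rs can be sketched as below. The types and method names here are hypothetical stand-ins, not the PR's real API; the sketch only shows the shape of the technique: queue in memory keyed by correlation_key, then flush one row per service.

```rust
use std::collections::HashMap;

/// Illustrative record for a discovered service (not the PR's real Model).
#[derive(Clone, Debug, PartialEq)]
struct DiscoveredService {
    correlation_key: String,
    service_name: String,
    last_seen_at: i64,
}

/// In-memory batch: later sightings of the same correlation_key overwrite
/// earlier ones, so each flush carries at most one row per service.
#[derive(Default)]
struct ServiceBatch {
    pending: HashMap<String, DiscoveredService>,
}

impl ServiceBatch {
    fn queue(&mut self, svc: DiscoveredService) {
        self.pending.insert(svc.correlation_key.clone(), svc);
    }

    /// Drain the queue; a real implementation would turn this vector into a
    /// single batched upsert keyed on correlation_key.
    fn flush(&mut self) -> Vec<DiscoveredService> {
        self.pending.drain().map(|(_, v)| v).collect()
    }
}

fn main() {
    let mut batch = ServiceBatch::default();
    batch.queue(DiscoveredService { correlation_key: "k1".into(), service_name: "cart".into(), last_seen_at: 1 });
    batch.queue(DiscoveredService { correlation_key: "k1".into(), service_name: "cart".into(), last_seen_at: 2 });
    batch.queue(DiscoveredService { correlation_key: "k2".into(), service_name: "auth".into(), last_seen_at: 1 });

    let rows = batch.flush();
    assert_eq!(rows.len(), 2); // k1 was deduplicated in memory
    assert!(batch.flush().is_empty()); // queue is drained after a flush
    println!("flushed {} rows", rows.len());
}
```

This keeps database load proportional to the number of distinct services seen per flush interval, not to the raw record volume.
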

Sequence Diagram

sequenceDiagram
    participant User
    participant UI as Frontend UI
    participant API as API Handler
    participant Ingester
    participant Compactor
    participant Processor as Stream Processor
    participant BatchProc as Batch Processor
    participant DB as Database
    participant Cache

    Note over User,Cache: Service Discovery Flow

    rect rgb(240, 248, 255)
        Note over Ingester,DB: Data Ingestion Path (Ingester Mode)
        Ingester->>Ingester: Process parquet file
        Ingester->>Ingester: Check sampling (hash-based)
        alt Should process file
            Ingester->>Processor: Extract services from parquet
            Processor->>Processor: Read record batches
            Processor->>Processor: Extract semantic dimensions
            Processor->>Processor: Check cardinality limits
            Processor->>Processor: Generate correlation_key (stable dims)
            Processor->>BatchProc: Queue services for batch write
            BatchProc->>DB: Upsert services (by correlation_key)
            BatchProc->>DB: Track dimension cardinality
        end
    end

    rect rgb(255, 248, 240)
        Note over Compactor,DB: Data Compaction Path (Compactor Mode)
        Compactor->>Compactor: Merge parquet files
        Compactor->>Processor: Extract services from merged data
        Processor->>Processor: Convert RecordBatch to HashMap
        Processor->>Processor: Process records (same as ingester)
        Processor->>BatchProc: Queue services
        BatchProc->>DB: Upsert services
    end

    rect rgb(240, 255, 240)
        Note over User,Cache: Telemetry Correlation Flow
        User->>UI: Click log/trace/metric row
        UI->>UI: Extract available dimensions
        UI->>API: POST /service_streams/_correlate
        API->>DB: Query by dimensions (correlation_key)
        DB->>API: Return matched service + streams
        API->>API: Build CorrelationResponse
        API->>API: Separate matched vs additional dims
        API->>UI: Return related streams
        UI->>UI: Display correlation panel
        User->>UI: Click "View" on related stream
        UI->>UI: Navigate with filters applied
    end

    rect rgb(255, 240, 255)
        Note over User,DB: Analytics & Configuration
        User->>UI: View dimension analytics
        UI->>API: GET /service_streams/_analytics
        API->>DB: Calculate cardinality per dimension
        API->>API: Classify dimensions (VeryLow/Low/Medium/High/VeryHigh)
        API->>UI: Return analytics summary
        UI->>UI: Display recommended dimensions
        
        User->>UI: Import semantic groups
        UI->>API: POST /alerts/deduplication/semantic-groups/preview-diff
        API->>DB: Get current semantic groups
        API->>API: Compare with imported groups
        API->>UI: Return additions/modifications/unchanged
        User->>UI: Confirm import
        UI->>API: PUT /alerts/deduplication/semantic-groups
        API->>DB: Save semantic groups
        API->>Cache: Invalidate cache
    end
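
The "Classify dimensions" step in the analytics flow above can be sketched as simple bucketing by distinct-value count. The bucket names come from the diagram; the thresholds below are illustrative assumptions, not the PR's actual values.

```rust
/// Cardinality buckets used to rank dimensions for correlation. Low buckets
/// (environment, region) make good correlation candidates; high buckets
/// (request id, pod name) are blocked by cardinality protection.
#[derive(Debug, PartialEq)]
enum Cardinality {
    VeryLow,
    Low,
    Medium,
    High,
    VeryHigh,
}

/// Map a distinct-value count to a bucket (thresholds are illustrative).
fn classify(distinct_values: u64) -> Cardinality {
    match distinct_values {
        0..=10 => Cardinality::VeryLow,
        11..=100 => Cardinality::Low,
        101..=1_000 => Cardinality::Medium,
        1_001..=10_000 => Cardinality::High,
        _ => Cardinality::VeryHigh,
    }
}

fn main() {
    assert_eq!(classify(4), Cardinality::VeryLow);      // e.g. environment
    assert_eq!(classify(250), Cardinality::Medium);     // e.g. service name
    assert_eq!(classify(1_000_000), Cardinality::VeryHigh); // e.g. request id
    println!("classification ok");
}
```
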

@greptile-apps greptile-apps bot left a comment

47 files reviewed, 2 comments


@oasisk oasisk force-pushed the feat_service_discovery branch from 32e2b13 to 5dbcb8b Compare December 4, 2025 06:55
@oasisk oasisk requested a review from ByteBaker December 4, 2025 07:15
@oasisk oasisk force-pushed the feat_service_discovery branch from dc0dddb to bc1e5ff Compare December 4, 2025 07:49
@oasisk oasisk merged commit 1353ea6 into main Dec 4, 2025
38 checks passed
@oasisk oasisk deleted the feat_service_discovery branch December 4, 2025 08:28