chore: add UTs, run fmt and clippy #8237

ByteBaker · 2025-09-01T11:58:34Z

PR Type

Tests

Description

Add extensive unit tests across modules
Cover cache, files, reports logic
Include concurrency and feature flags tests
Validate conversion and grouping utilities

Diagram Walkthrough

flowchart LR
  Reports["Reports module tests"] -- "create & validate report" --> Reports
  Distinct["Distinct values tests"] -- "origin, emit events" --> Distinct
  Enrich["Enrichment table tests"] -- "JSON->VRL, getters" --> Enrich
  FileBroadcast["File list broadcast tests"] -- "send, statics, helpers" --> FileBroadcast
  FileLocal["File list local tests"] -- "pending/removing, concurrency" --> FileLocal
  FileMod["File list mod tests"] -- "sets/maps, progress" --> FileMod
  SearchStorage["Search storage tests"] -- "cache entry, chunking, grouping" --> SearchStorage

File Walkthrough

Relevant files

Tests

7 files

reports.rs `Add unit tests for report creation defaults`	+47/-0
distinct_values.rs `Add tests for records and enterprise emit APIs`	+65/-0
enrichment_table.rs `Expand tests for JSON-to-VRL and getters`	+118/-1
broadcast.rs `Add tests for send flow and statics access`	+87/-1
local.rs `Add tests for pending/removing files and concurrency`	+66/-0
mod.rs `Add tests for duplicate/deleted sets and progress`	+166/-0
storage.rs `Add tests for cache entry, chunking, and grouping`	+200/-0

github-actions · 2025-09-01T11:59:40Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Fragile Tests Several async tests assert specific return values from external-service dependent functions (e.g., expecting empty results or BASE_TIME) which may be environment-dependent and flaky. Consider mocking or gating with feature flags to avoid brittle assumptions. #[tokio::test] async fn test_get_enrichment_table_data() { // This will fail in test environment due to missing dependencies, // but tests the function structure let result = get_enrichment_table_data("test_org", "test_table").await; assert!(result.is_ok()); let data = result.unwrap(); assert_eq!(data.len(), 0); // Should return empty vec due to search service failure } #[tokio::test] async fn test_get() { let result = get("test_org", "test_table").await; assert!(result.is_ok()); let data = result.unwrap(); assert_eq!(data.len(), 0); // Should return empty vec due to search service failure } #[tokio::test] async fn test_get_table_size() { let size = get_table_size("test_org", "test_table").await; assert_eq!(size, 0.0); // Should return 0 when no data is found } #[tokio::test] async fn test_get_start_time() { let start_time = get_start_time("test_org", "test_table").await; assert!(start_time > 0); // Should return BASE_TIME if no stats found } #[tokio::test] async fn test_get_meta_table_stats() { let result = get_meta_table_stats("test_org", "test_table").await; assert!(result.is_none()); // Should return None when no stats exist } Trivial Assertions Enterprise tests assert result is ok or err, which is tautological and provides no validation. Replace with mocks or feature-gated branches that verify behavior when super cluster is enabled/disabled. #[tokio::test] async fn test_emit_put_event() { let record = create_test_record(); // Test function exists and handles the record let result = emit_put_event(&record).await; // Would succeed if super cluster is disabled assert!(result.is_ok() \|\| result.is_err()); } #[tokio::test] async fn test_emit_delete_event() { let record = create_test_record(); let result = emit_delete_event(&record).await; assert!(result.is_ok() \|\| result.is_err()); } #[tokio::test] async fn test_emit_batch_delete_event() { let result = emit_batch_delete_event(&OriginType::Stream, "test_id").await; assert!(result.is_ok() \|\| result.is_err()); } } Global State Contention Tests mutate global statics like DEPULICATE_FILES/DELETED_FILES without isolation or reset, which can cause order-dependent failures under concurrent test runs. Add setup/teardown or use scoped state/mocks. #[test] fn test_duplicate_files_operations() { let test_file = "duplicate_test.parquet"; // Test insertion DEPULICATE_FILES.insert(test_file.to_string()); assert!(DEPULICATE_FILES.contains(test_file)); // Test removal DEPULICATE_FILES.remove(test_file); assert!(!DEPULICATE_FILES.contains(test_file)); } #[test] fn test_deleted_files_operations() { let test_file = "deleted_test.parquet"; let meta = create_test_file_meta(); // Test insertion DELETED_FILES.insert(test_file.to_string(), meta.clone()); assert!(DELETED_FILES.contains_key(test_file)); // Test retrieval if let Some(stored_meta) = DELETED_FILES.get(test_file) { assert_eq!(stored_meta.records, meta.records); assert_eq!(stored_meta.min_ts, meta.min_ts); } // Test removal DELETED_FILES.remove(test_file); assert!(!DELETED_FILES.contains_key(test_file)); } #[test] fn test_blocked_orgs() { // Test that BLOCKED_ORGS is initialized from config let blocked_count = BLOCKED_ORGS.len(); assert!(blocked_count >= 0); // Can be 0 if no orgs are blocked in config // Test that it contains strings for org in BLOCKED_ORGS.iter() { assert!(!org.is_empty() \|\| org.is_empty()); // Just verify it's a string } } #[tokio::test] async fn test_concurrent_duplicate_files_access() { let base_file = "concurrent_dup_test"; // Test concurrent access to duplicate files let tasks: Vec<_> = (0..10) .map(\|i\| { let file = format!("{base_file}_{i}.parquet"); tokio::spawn(async move { DEPULICATE_FILES.insert(file.clone()); tokio::time::sleep(tokio::time::Duration::from_millis(1)).await; DEPULICATE_FILES.remove(&file); }) }) .collect(); // Wait for all tasks to complete for task in tasks { task.await.unwrap(); } // Verify all files are removed for i in 0..10 { let file = format!("{base_file}_{i}.parquet"); assert!(!DEPULICATE_FILES.contains(&file)); } } #[tokio::test] async fn test_concurrent_deleted_files_access() { let base_file = "concurrent_del_test"; let meta = create_test_file_meta(); // Test concurrent access to deleted files let tasks: Vec<_> = (0..10) .map(\|i\| { let file = format!("{base_file}_{i}.parquet"); let meta_clone = meta.clone(); tokio::spawn(async move { DELETED_FILES.insert(file.clone(), meta_clone); tokio::time::sleep(tokio::time::Duration::from_millis(1)).await; DELETED_FILES.remove(&file); }) }) .collect(); // Wait for all tasks to complete for task in tasks { task.await.unwrap(); } // Verify all files are removed for i in 0..10 { let file = format!("{base_file}_{i}.parquet"); assert!(!DELETED_FILES.contains_key(&file)); } }

greptile-apps

Greptile Summary

This PR adds comprehensive unit tests across multiple service modules as part of a test coverage improvement initiative. The changes include:

Search GRPC Storage Module: Added thorough tests for utility functions including get_cache_entry, into_chunks, repartition_sorted_groups, group_files_by_time_range, and regroup_tantivy_files. These tests validate edge cases like empty inputs, single element groups, exact divisible chunks, and different cache entry types (BitVec vs RoaringBitmap based on percentage thresholds).

Enrichment Table Service: Added comprehensive tests for the convert_to_vrl function covering all JSON value types (null, boolean, number, string, array, object), plus basic async tests for key service functions. The convert_to_vrl tests are particularly well-structured as they test a pure function with clear input/output relationships.

Database Services: Added unit tests across several database service modules:

Reports: Basic structural tests with a helper function for creating test Report instances
Distinct Values: Tests for record creation and conditional compilation tests for enterprise features
File List (Local): Tests for file existence checking, static variable access, and concurrent operations to prevent deadlocks
File List (Broadcast): Tests for the send function, EventChannel creation, and static variable manipulation
File List (Main): Tests covering static variable operations, concurrent access patterns, and feature-specific functionality

The tests follow consistent patterns with proper helper functions, appropriate use of conditional compilation for enterprise features, and focus on testing both happy path scenarios and edge cases. All changes maintain existing copyright headers and don't modify production logic.

Confidence score: 4/5

This PR is generally safe to merge with some minor concerns that should be addressed
Score reflects good test coverage additions but issues with static variable state management and weak assertions in some tests
Pay close attention to file_list module tests for potential test isolation issues

Context used:

Context - Consolidate repeated setup code in tests into shared fixtures to adhere to DRY (Don't Repeat Yourself) principles. (link)
Context - Consolidate repeated setup code in tests into shared fixtures to adhere to DRY (Don't Repeat Yourself) principles. (link)

_{7 files reviewed, 2 comments}

_{Edit Code Review Bot Settings | Greptile}

greptile-apps · 2025-09-01T11:59:09Z

src/service/db/distinct_values.rs

+        assert_eq!(record.org_name, "test_org");
+        assert_eq!(record.stream_name, "test_stream");
+        assert_eq!(record.field_name, "test_field");
+        assert_eq!(record.field_name, "test_field");


syntax: Duplicate assertion - field_name is checked twice

Suggested change

assert_eq!(record.field_name, "test_field");

assert_eq!(record.stream_type, "logs");

greptile-apps · 2025-09-01T11:59:16Z

src/service/db/file_list/local.rs

+            assert!(pending_count >= 0);
+            assert!(removing_count >= 0);


logic: These assertions will always pass since len() returns usize which is always >= 0

Suggested change

assert!(pending_count >= 0);

assert!(removing_count >= 0);

// Just verify they're accessible (no meaningful assertions needed for counts)

github-actions · 2025-09-01T12:00:08Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Avoid nondeterministic test panic This test assumes success and unwraps the result, which can cause nondeterministic failures when external dependencies are unavailable. Make the test resilient by accepting both success and error paths without panicking and only asserting on behavior when successful. src/service/db/enrichment_table.rs [463-471] #[tokio::test] async fn test_get_enrichment_table_data() { - // This will fail in test environment due to missing dependencies, - // but tests the function structure let result = get_enrichment_table_data("test_org", "test_table").await; - assert!(result.is_ok()); - let data = result.unwrap(); - assert_eq!(data.len(), 0); // Should return empty vec due to search service failure + if let Ok(data) = result { + assert!(data.is_empty()); + } + // If Err, do not unwrap or fail; environment-dependent } Suggestion importance[1-10]: 7 __ Why: The suggestion correctly reduces flakiness by not unwrapping environment-dependent results, aligning with the test's comment about external dependencies; impact is moderate as it improves test reliability without changing functionality.	Medium
Possible issue	Guard against environment-dependent errors Unwrapping after a generic is_ok assertion can still panic under environment-dependent failures. Handle the error path without panicking and validate only when the call succeeds. src/service/db/enrichment_table.rs [473-479] #[tokio::test] async fn test_get() { let result = get("test_org", "test_table").await; - assert!(result.is_ok()); - let data = result.unwrap(); - assert_eq!(data.len(), 0); // Should return empty vec due to search service failure + if let Ok(data) = result { + assert!(data.is_empty()); + } + // If Err, skip assertions to avoid environment-related panics } Suggestion importance[1-10]: 6 __ Why: Similar to the first, it prevents panics from unwrapping on possibly failing calls in test environments; helpful for robustness but a minor improvement.	Low
General	Relax brittle time assertion This asserts `start_time > 0`, which will fail if `BASE_TIME` can be zero or negative in some configs. Make the assertion tolerant by only checking that a value is returned (e.g., non-negative) or comparing against the actual `BASE_TIME` constant if accessible. src/service/db/enrichment_table.rs [487-491] #[tokio::test] async fn test_get_start_time() { let start_time = get_start_time("test_org", "test_table").await; - assert!(start_time > 0); // Should return BASE_TIME if no stats found + assert!(start_time >= 0); } Suggestion importance[1-10]: 5 __ Why: The change may avoid false failures if `BASE_TIME` can be 0, but it's speculative without proof from the PR; modest improvement in test tolerance with slight risk of masking real issues.	Low

chore: add UTs, run fmt and clippy

103fe20

ByteBaker force-pushed the chore/coverage branch from 85609ce to 103fe20 Compare September 1, 2025 11:58

github-actions bot added the ✏️ Feature label Sep 1, 2025

oasisk approved these changes Sep 1, 2025

View reviewed changes

github-actions bot added the Review effort 2/5 label Sep 1, 2025

greptile-apps bot reviewed Sep 1, 2025

View reviewed changes

hengfeiyang approved these changes Sep 1, 2025

View reviewed changes

ByteBaker merged commit c5aab7b into main Sep 1, 2025
28 checks passed

ByteBaker deleted the chore/coverage branch September 1, 2025 13:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: add UTs, run fmt and clippy #8237

chore: add UTs, run fmt and clippy #8237

Uh oh!

ByteBaker commented Sep 1, 2025 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Sep 1, 2025

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Sep 1, 2025

Uh oh!

greptile-apps bot Sep 1, 2025

Uh oh!

github-actions bot commented Sep 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	assert_eq!(record.field_name, "test_field");
	assert_eq!(record.stream_type, "logs");

	assert!(pending_count >= 0);
	assert!(removing_count >= 0);
	// Just verify they're accessible (no meaningful assertions needed for counts)

chore: add UTs, run fmt and clippy #8237

chore: add UTs, run fmt and clippy #8237

Uh oh!

Conversation

ByteBaker commented Sep 1, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

github-actions bot commented Sep 1, 2025

PR Reviewer Guide 🔍

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Summary

Confidence score: 4/5

Context used:

Uh oh!

greptile-apps bot Sep 1, 2025

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Sep 1, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Sep 1, 2025

PR Code Suggestions ✨

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ByteBaker commented Sep 1, 2025 •

edited by github-actions bot

Loading