-
Notifications
You must be signed in to change notification settings - Fork 711
fix: merge plan panic #4209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: merge plan panic #4209
Conversation
WalkthroughThe recent changes enhance error handling, response validation, and schema management across multiple components of the system. Key updates include improved logging for response statuses, restructuring ingestion logic to avoid unnecessary processing, and transitioning to Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Server
participant Database
Client->>Server: Send Ingestion Request
Server->>Database: Save Enrichment Data
alt Success
Database-->>Server: OK Response
Server-->>Client: Success Response
else Failure
Database-->>Server: Error Response
Server-->>Client: Error Response
end
sequenceDiagram
participant Scheduler
participant Ingestion Service
Scheduler->>Ingestion Service: Check local_val
alt Empty local_val
Scheduler-->>Scheduler: Log "No records to ingest"
else Non-empty local_val
Scheduler->>Ingestion Service: Ingest Data
Ingestion Service-->>Scheduler: Ingestion Result
end
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Outside diff range, codebase verification and nitpick comments (2)
src/service/alerts/scheduler.rs (1)
577-631: Ensure robust error handling during ingestion.The ingestion process is well-structured with appropriate logging and error handling. Ensure that the error messages provide sufficient context for debugging.
src/service/search/datafusion/exec.rs (1)
258-289: Handle potential errors intry_mergeoperation.In the
merge_partitionsfunction, the use ofSchema::try_mergeis followed by anunwrap()call, which can lead to runtime panics if schema conflicts occur. It is advisable to handle these potential errors gracefully to avoid unexpected crashes.
- File:
src/service/search/datafusion/exec.rs- Line: 289
Consider implementing error handling to manage schema conflicts more robustly.
Analysis chain
Ensure schema consistency before formatting.
The logic for checking schema consistency and determining the need for formatting is well-structured. However, ensure that the
try_mergeoperation handles all potential schema conflicts gracefully.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that the `try_merge` operation in `merge_partitions` handles schema conflicts gracefully. # Test: Search for `try_merge` usage. Expect: Proper error handling after `try_merge`. rg --type rust --context 5 'Schema::try_merge'Length of output: 3096
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (9)
- src/handler/grpc/request/ingest.rs (2 hunks)
- src/service/alerts/scheduler.rs (1 hunks)
- src/service/enrichment_table/mod.rs (2 hunks)
- src/service/search/cluster/mod.rs (3 hunks)
- src/service/search/datafusion/exec.rs (4 hunks)
- src/service/search/grpc/mod.rs (2 hunks)
- src/service/search/grpc/storage.rs (2 hunks)
- src/service/search/grpc/wal.rs (3 hunks)
- src/service/search/mod.rs (3 hunks)
Additional context used
Path-based instructions (9)
src/handler/grpc/request/ingest.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/search/grpc/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/enrichment_table/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/search/grpc/storage.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/search/grpc/wal.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/alerts/scheduler.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/search/datafusion/exec.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/search/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/search/cluster/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
Additional comments not posted (13)
src/handler/grpc/request/ingest.rs (2)
16-16: Good use ofStatusCodefor HTTP response handling.The addition of
StatusCodeimproves the clarity and maintainability of the code by making HTTP response handling more explicit.
68-96: Enhanced error logging is beneficial.The detailed error logging with status codes and response bodies provides better insights during debugging. This is a positive change that enhances the robustness of the error handling mechanism.
src/service/enrichment_table/mod.rs (1)
175-178: Verify the new behavior for emptyrecords.The change from a
BadRequestto anOkresponse for emptyrecordssuggests a shift in logic. Ensure that this new behavior aligns with the overall system requirements and does not introduce unintended side effects.src/service/search/grpc/storage.rs (2)
231-231: Good use ofArcfor concurrent access.The wrapping of
schemain anArcensures safe shared ownership across threads, which is crucial for concurrent environments. This change is well-implemented.
249-254: Ensure consistent use ofArcwhen passingschema.The
schemais cloned and passed to functions, which is the correct approach when usingArc. Ensure that this pattern is consistently applied whereverschemais used in similar contexts.Also applies to: 266-271
Verification successful
Consistent use of
Arcwhen passingschemaverified.The
schemais consistently passed usingschema.clone()in function calls togenerate_search_schemaandgenerate_select_start_search_schema, which aligns with the expected usage ofArc. No issues found.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify consistent use of `Arc` when passing `schema`. # Test: Search for `generate_select_start_search_schema` and `generate_search_schema` calls. # Expect: `schema` should be passed as `Arc::clone(&schema)`. rg --type rust 'generate_(select_start_search_schema|search_schema)\s*\(' -A 5Length of output: 4110
src/service/search/grpc/wal.rs (3)
248-248: Good use ofArcfor concurrent access insearch_parquet.The wrapping of
schemain anArcensures safe shared ownership across threads, which is crucial for concurrent environments. This change is well-implemented.
266-271: Ensure consistent use ofArcwhen passingschemainsearch_parquet.The
schemais cloned and passed to functions, which is the correct approach when usingArc. Ensure that this pattern is consistently applied whereverschemais used in similar contexts.
473-478: Good use ofArcfor concurrent access insearch_memtable.The wrapping of
schemain anArcensures safe shared ownership across threads, which is crucial for concurrent environments. This change is well-implemented.src/service/alerts/scheduler.rs (1)
568-576: Efficient handling of emptylocal_val.The added check for an empty
local_valbefore attempting ingestion enhances efficiency by avoiding unnecessary operations. The logging and status update are appropriately handled.src/service/search/datafusion/exec.rs (1)
304-316: Repartitioning logic is clear and efficient.The repartitioning logic is straightforward and efficiently manages batch distribution based on chunk size.
src/service/search/mod.rs (2)
743-744: AdoptArc<Schema>for better memory management.The change to use
Arc<Schema>enhances memory management by allowing shared ownership, which is beneficial in concurrent contexts.
791-792: UtilizeArc<Schema>for improved concurrency.The change to use
Arc<Schema>is appropriate and improves efficiency in concurrent scenarios by enabling shared ownership of the schema.src/service/search/cluster/mod.rs (1)
713-731: Refactor schema generation for modularity.The refactoring to use
generate_select_start_search_schemaandgenerate_search_schemaimproves modularity and maintainability by encapsulating schema logic.
merge plan panic
The reason is in some case the plan is difference. if there is only one partition and that is a simple query then there is no
sortand simple add a newProjectionplan, and it is difference with other nodes which have multiple partitions, then the merge got RecordBatch mismatch.The solution have two:
EmptyTableto generate plan then we can guarantee the plan is same, then replace theEmptyTabletoMemoryTableorParquetTablein different node based on the data.By the way:
Also updated a logic of derived schema