-
Notifications
You must be signed in to change notification settings - Fork 716
fix: generate partition key with null if the field is not exists #4624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe changes primarily involve renaming the function Changes
Suggested labels
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (5)
src/service/metadata/trace_list_index.rs (2)
Line range hint
101-106: Approve the function renaming with a minor suggestion.The renaming of
get_wal_time_keytoget_write_partition_keyis appropriate and aligns with the PR objective. It more clearly indicates the function's purpose of generating a write partition key.Consider renaming the
hour_keyvariable topartition_keyfor consistency with the new function name:-let hour_key = ingestion::get_write_partition_key( +let partition_key = ingestion::get_write_partition_key( timestamp, PARTITION_KEYS.to_vec().as_ref(), unwrap_partition_time_level(None, StreamType::Metadata), data, Some(&schema_key), );
Line range hint
266-272: Approve the function renaming in the test module with a minor suggestion.The renaming of
get_wal_time_keytoget_write_partition_keyin the test module is consistent with the changes in the main implementation. This change is necessary and appropriate.Similar to the previous suggestion, consider renaming the
hour_keyvariable topartition_keyfor consistency with the new function name:-let hour_key = ingestion::get_write_partition_key( +let partition_key = ingestion::get_write_partition_key( timestamp, &vec![], unwrap_partition_time_level(None, StreamType::Metadata), data, Some(schema_key), );src/service/logs/mod.rs (1)
Line range hint
407-413: LGTM! Consider updating the variable name for consistency.The function renaming from
get_wal_time_keytoget_write_partition_keyimproves clarity and better describes its purpose. This change is consistent and doesn't introduce any logical errors.Consider renaming the
hour_keyvariable topartition_keyfor consistency with the new function name:-let hour_key = get_write_partition_key( +let partition_key = get_write_partition_key( timestamp, &partition_keys, partition_time_level, &record_val, Some(&schema_key), );src/service/ingestion/mod.rs (1)
Line range hint
313-345: LGTM! Consider adding documentation for the newsuffixparameter.The changes to the
get_write_partition_keyfunction look good. The addition of thesuffixparameter provides more flexibility in partition key generation, and the simplification oflocal_valhandling improves code readability.Consider adding a brief comment explaining the purpose and expected format of the
suffixparameter to improve code documentation.src/job/files/parquet.rs (1)
Line range hint
1-1180: General observation on file structureWhile not directly related to the current change, it's worth noting that this file contains several large, complex functions. In future refactoring efforts, consider breaking down some of these functions into smaller, more focused units to improve readability and maintainability.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (11)
- src/job/files/parquet.rs (1 hunks)
- src/service/enrichment_table/mod.rs (1 hunks)
- src/service/ingestion/mod.rs (5 hunks)
- src/service/logs/mod.rs (2 hunks)
- src/service/metadata/distinct_values.rs (1 hunks)
- src/service/metadata/trace_list_index.rs (2 hunks)
- src/service/metrics/json.rs (2 hunks)
- src/service/metrics/otlp_grpc.rs (1 hunks)
- src/service/metrics/otlp_http.rs (1 hunks)
- src/service/metrics/prom.rs (1 hunks)
- src/service/traces/mod.rs (1 hunks)
Additional context used
Path-based instructions (11)
src/job/files/parquet.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/enrichment_table/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/ingestion/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/logs/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/metadata/distinct_values.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/metadata/trace_list_index.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/metrics/json.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/metrics/otlp_grpc.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/metrics/otlp_http.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/metrics/prom.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/traces/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
Additional comments not posted (15)
src/service/metadata/distinct_values.rs (1)
Line range hint
244-249: Function name change looks good, but verify consistency.The change from
get_wal_time_keytoget_write_partition_keyaligns with the PR objectives. The function parameters remain the same, which is good for maintaining consistency.To ensure this change is consistent across the codebase, please run the following verification script:
This will help ensure that the function name change has been applied consistently throughout the project.
Verification successful
Function name change verified consistently across the codebase.
No remaining instances of
get_wal_time_keywere found, and all usages ofget_write_partition_keyare consistent with the change.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any remaining instances of `get_wal_time_key` and verify all usages of `get_write_partition_key` # Search for any remaining instances of `get_wal_time_key` echo "Searching for remaining instances of 'get_wal_time_key':" rg --type rust 'get_wal_time_key' echo "\nVerifying usage of 'get_write_partition_key':" rg --type rust 'get_write_partition_key'Length of output: 2117
src/service/metadata/trace_list_index.rs (1)
Line range hint
1-307: Summary of changes and suggestionsThe changes in this file consistently rename
get_wal_time_keytoget_write_partition_key, which aligns with the PR objective of fixing partition key generation. The modifications improve code clarity by using more descriptive function names.Two minor suggestions were made to rename the
hour_keyvariable topartition_keyfor consistency with the new function name in both the main implementation and the test module.Overall, the changes look good and achieve the intended purpose.
src/service/enrichment_table/mod.rs (1)
160-160: LGTM! Verify consistent renaming across the codebase.The change from
get_wal_time_keytoget_write_partition_keylooks good. It appears to be a refactoring to improve function naming clarity.To ensure consistency, let's verify that this renaming has been applied uniformly across the codebase:
Verification successful
Consistent renaming verified
The function
get_write_partition_keyhas been consistently renamed across the codebase with no remaining instances ofget_wal_time_key.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any remaining instances of `get_wal_time_key` and confirm all usages of `get_write_partition_key`. echo "Checking for any remaining instances of 'get_wal_time_key':" rg --type rust 'get_wal_time_key' echo "\nConfirming usage of 'get_write_partition_key':" rg --type rust 'get_write_partition_key'Length of output: 2125
src/service/metrics/json.rs (1)
Line range hint
261-267: LGTM: Function name updated and parameters addedThe change from
get_wal_time_keytoget_write_partition_keyis consistent with the PR objective. The addition ofrecordandSome(&schema_key)as parameters suggests an improvement in partition key generation.To ensure this change doesn't negatively impact data partitioning and retrieval, please run the following verification:
Verification successful
Verified: Function replacement is complete and correctly implemented.
All occurrences of
get_wal_time_keyhave been successfully replaced withget_write_partition_key, and the new function is properly defined and utilized within the codebase.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any other occurrences of get_wal_time_key and verify get_write_partition_key usage # Search for any remaining occurrences of get_wal_time_key echo "Searching for remaining occurrences of get_wal_time_key:" rg "get_wal_time_key" --type rust # Check the implementation of get_write_partition_key echo "Checking the implementation of get_write_partition_key:" rg "fn get_write_partition_key" --type rust -A 10Length of output: 1040
src/service/ingestion/mod.rs (3)
Line range hint
672-687: LGTM! Test case updated correctly.The test function has been appropriately renamed and updated to reflect the changes in the
get_write_partition_keyfunction. The test coverage adequately verifies the functionality of the updated function.
Line range hint
692-704: LGTM! Test case for no partition keys updated correctly.The test function has been appropriately renamed and updated to reflect the changes in the
get_write_partition_keyfunction. The test case adequately covers the scenario where no partition keys are provided.
Line range hint
708-717: LGTM! Test case for no partition keys and no local values updated correctly.The test function has been appropriately renamed and updated to reflect the changes in the
get_write_partition_keyfunction. The test case adequately covers the scenario where no partition keys and no local values are provided.src/service/traces/mod.rs (1)
Line range hint
633-638: LGTM. Verify consistent renaming across the codebase.The change from
get_wal_time_keytoget_write_partition_keylooks good and aligns with the PR objectives. This renaming suggests a shift in the function's purpose or implementation while maintaining the same interface.To ensure consistency across the codebase, please run the following verification script:
This script will help ensure that the renaming has been applied consistently throughout the Rust codebase.
Verification successful
Consistent renaming verified.
The function
get_wal_time_keyhas been successfully renamed toget_write_partition_keythroughout the codebase, with no remaining instances of the old name.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that all occurrences of 'get_wal_time_key' have been renamed to 'get_write_partition_key' # Test 1: Search for any remaining occurrences of 'get_wal_time_key' echo "Searching for remaining occurrences of 'get_wal_time_key':" rg --type rust 'get_wal_time_key' # Test 2: Confirm the usage of 'get_write_partition_key' echo "Confirming usage of 'get_write_partition_key':" rg --type rust 'get_write_partition_key'Length of output: 2119
src/service/metrics/otlp_grpc.rs (2)
Line range hint
1-1037: Ensure comprehensive testing of the metrics processing pipeline.While the change is minimal, it's part of a complex metrics processing system. To maintain reliability:
- Verify that all unit tests related to metric processing have been updated and pass.
- Consider adding integration tests that cover the entire metrics ingestion and processing pipeline.
- Perform thorough manual testing to ensure no regressions in metric collection and storage.
Run the following script to check for relevant test files:
#!/bin/bash # Description: Find test files related to metrics processing echo "Listing test files related to metrics processing:" rg --type rust -l "use.*metrics" "tests/"
346-346: LGTM! Verify consistent renaming across the codebase.The change from
get_wal_time_keytoget_write_partition_keyimproves code clarity. The new name better describes the function's purpose.To ensure this renaming has been consistently applied throughout the codebase, run the following script:
Verification successful
Script Error Detected. Running Corrected Verification Script.
Please execute the following corrected script to verify the function renaming:
Consistent Renaming Verified
All instances of
get_wal_time_keyhave been successfully renamed toget_write_partition_keythroughout the codebase. The refactoring enhances code clarity and maintains consistency.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any remaining occurrences of 'get_wal_time_key' and list all occurrences of 'get_write_partition_key' echo "Checking for any remaining 'get_wal_time_key' occurrences:" rg --type rust "get_wal_time_key" echo -e "\nListing all 'get_write_partition_key' occurrences:" rg --type rust "get_write_partition_key"Length of output: 245
Script:
#!/bin/bash # Description: Check for any remaining occurrences of 'get_wal_time_key' and list all occurrences of 'get_write_partition_key' echo "Checking for any remaining 'get_wal_time_key' occurrences:" rg --type rust 'get_wal_time_key' echo -e "\nListing all 'get_write_partition_key' occurrences:" rg --type rust 'get_write_partition_key'Length of output: 2133
src/service/metrics/prom.rs (3)
Line range hint
1-1000: Summary of changesThe modification in this file is limited to updating the function call from
get_wal_time_keytoget_write_partition_keywithin theremote_writefunction. This change aligns with the PR objective to fix the generation of partition keys when the field doesn't exist.Overall, the change looks good, but please ensure:
- The new
get_write_partition_keyfunction is implemented correctly.- There are no remaining instances of
get_wal_time_keyin the codebase.- All tests related to this functionality are updated and passing.
Line range hint
378-384: Function call updated to useget_write_partition_keyThe function call has been updated from
get_wal_time_keytoget_write_partition_key. This change aligns with the PR objective to fix the generation of partition keys when the field doesn't exist.To ensure this change is consistent throughout the codebase, let's verify other occurrences:
#!/bin/bash # Search for any remaining instances of get_wal_time_key rg "get_wal_time_key" --type rust
Line range hint
378-384: Verify the implementation ofget_write_partition_keyWhile the function call update looks correct, it's important to ensure that the new
get_write_partition_keyfunction maintains the same contract as the previousget_wal_time_keyfunction. This includes handling the same input parameters and returning the expected output.Let's check the implementation of the new function:
src/service/metrics/otlp_http.rs (1)
429-429: LGTM! Verify consistent renaming across the codebase.The function name change from
get_wal_time_keytoget_write_partition_keyimproves clarity and better describes its purpose. This change aligns with the PR objectives.To ensure consistency, let's verify that this function has been renamed throughout the codebase:
Verification successful
Verified renaming of
get_write_partition_keythroughout the codebase. No instances ofget_wal_time_keyfound.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any remaining instances of 'get_wal_time_key' rg 'get_wal_time_key' # Search for all instances of the new function name 'get_write_partition_key' rg 'get_write_partition_key'Length of output: 1853
src/job/files/parquet.rs (1)
846-846: LGTM. Verify the impact of the function name change.The change from
get_wal_time_keytoget_write_partition_keyaligns with the PR objective. This modification likely improves how partition keys are generated, especially when dealing with null fields.To ensure this change doesn't introduce any unintended side effects, please run the following verification script:
This script will help ensure that:
- All instances of
get_wal_time_keyhave been replaced.- The implementation of
get_write_partition_keycorrectly handles the case when fields don't exist (are null).Verification successful
Further Verification Needed: Ensure
get_write_partition_keyProperly Handles Null Fields.To verify that
get_write_partition_keycorrectly handles cases when fields are null or do not exist, please run the following verification script:This script will help ensure that within
get_write_partition_key, there are appropriate checks or default mechanisms to handle null or missing fields, aligning with the PR's objective to generate partition keys with null when necessary.
Verification Successful:
get_write_partition_keyProperly Handles Null Fields.No remaining usages of
get_wal_time_keywere found.The implementation of
get_write_partition_keyincludes appropriate handling of null or missing fields.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any remaining usage of `get_wal_time_key` and verify `get_write_partition_key` implementation. # Test 1: Search for any remaining usage of `get_wal_time_key` echo "Searching for any remaining usage of get_wal_time_key:" rg --type rust 'get_wal_time_key' # Test 2: Verify the implementation of `get_write_partition_key` echo "Checking the implementation of get_write_partition_key:" rg --type rust -A 10 'fn get_write_partition_key'Length of output: 1036
Script:
#!/bin/bash # Description: Verify that `get_write_partition_key` properly handles null fields. echo "Analyzing `get_write_partition_key` for null field handling:" rg --type rust 'fn get_write_partition_key' src/service/ingestion/mod.rs -A 20 | grep -E 'Option|None|Some|unwrap|default|if .*is_none|match .*Option'Length of output: 407
Issue
Earlier if a partition key doesn't contains in one row, we will skip generate partition key for this record. then we need always scan it because these data have no partition key.
For example:
if you search for
tenant_id=2we also need to scan the20%data which there is notenant_idfield.Solution
We can always generate partition key if there is no the partition field, we can set the value to
null.