-
Notifications
You must be signed in to change notification settings - Fork 711
perf: use buffer write for wal #4538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe changes involve modifications to various files, primarily focusing on the configuration of write-ahead logging (WAL) settings, enhancements to the Changes
Possibly related PRs
Suggested labels
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (1)
src/service/ingestion/mod.rs (1)
441-447: Effective error handling for asynchronous processing!The changes introduce error handling for the asynchronous processing by using
try_join_allto wait for all tasks to complete and collect their results.If all tasks complete successfully, their results (tuples of record count and size) are returned. In case of an error in any task, the error is logged using
log::error!, and an empty vector is returned instead of the task results.Consider using a more specific error message in the
log::error!statement to indicate that the error occurred during the collection of task results. For example:log::error!("Error collecting task results: {}", e);
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (9)
- src/config/src/config.rs (2 hunks)
- src/ingester/src/partition.rs (1 hunks)
- src/ingester/src/writer.rs (4 hunks)
- src/service/ingestion/mod.rs (2 hunks)
- src/service/logs/otlp_grpc.rs (1 hunks)
- src/service/logs/otlp_http.rs (2 hunks)
- src/wal/benches/write.rs (1 hunks)
- src/wal/src/writer.rs (7 hunks)
- src/wal/tests/wal.rs (1 hunks)
Files skipped from review due to trivial changes (2)
- src/service/logs/otlp_grpc.rs
- src/service/logs/otlp_http.rs
Additional context used
Path-based instructions (7)
src/wal/tests/wal.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/wal/benches/write.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/wal/src/writer.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/ingester/src/partition.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/ingester/src/writer.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/service/ingestion/mod.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
src/config/src/config.rs (1)
Pattern
**/*.rs: You are a smart rustlang pull request reviewer.
You are going to review all the rustlang files.
Be concise, and add a brief explanation to your suggestionsMake sure the responses are not too verbose and keep the suggestions to the point i.e. actionable.
Additional comments not posted (20)
src/wal/tests/wal.rs (1)
24-24: LGTM!The addition of the buffer size parameter to the
Writerconstructor is a good change that can potentially improve the performance of write operations by optimizing memory usage and handling larger data chunks.The chosen buffer size of 8 KB seems reasonable for most use cases.
src/wal/benches/write.rs (4)
24-32: LGTM!The nested loop structure is a good approach to benchmark the
Writerwith different buffer sizes and entry sizes. This will provide more comprehensive performance insights.The
BenchmarkIdis also updated correctly to reflect both parameters.
27-27: LGTM!The
Writer::newfunction call is updated correctly to include thebuf_sizeparameter.
28-28: LGTM!The
datavector is created correctly with the size specified byentry_size.
30-30: LGTM!The
writer.writefunction is called correctly with thedatavector andtrueas arguments.src/wal/src/writer.rs (6)
18-18: LGTM!The addition of
BufWriterto the list of imported items is necessary to support the changes made to theWriterstruct.
30-30: LGTM!The modification of the
Writerstruct to useBufWriter<File>instead ofFileis in line with the objective of using a buffered writer to improve write performance.
43-43: LGTM!The addition of the
buffer_sizeparameter to thenewmethod allows the user to specify the capacity of theBufWriter, which is a good addition.
74-74: LGTM!The initialization of the
ffield with aBufWriterusing the providedbuffer_sizeis in line with the objective of using a buffered writer to improve write performance.
128-128: LGTM!The changes made to the
syncandclosemethods are necessary to accommodate the new buffered writer and ensure the integrity of the data being written. The use ofstd::io::Cursorinstead ofio::Cursorclarifies the source of theCursortype and does not affect functionality.Also applies to: 153-157, 163-165
192-199: LGTM!The implementation of the
Writetrait for theHasherWrapperstruct is necessary for it to be used as a decorator for theFrameEncoder. Thewriteandflushmethods are implemented correctly.src/ingester/src/partition.rs (1)
211-211: LGTM!The explicit
drop(f);statement is a good addition to ensure proper resource management. It releases the file handle immediately after writing the data, preventing potential resource leaks and file locking issues.src/ingester/src/writer.rs (3)
161-161: LGTM!Adding the
wal_write_buffer_sizeparameter enhances the configurability of the WAL and potentially improves write performance by allowing the buffer size to be tuned.
Line range hint
219-220: Changes look good!Making the
walvariable mutable is necessary if the state of the WAL needs to be modified in theclosemethod.Adding the
wal_write_buffer_sizeparameter here maintains consistency with the change made in theWriter::newmethod.
275-275: LGTM!Making the
walvariable mutable is consistent with the change made in theclosemethod and suggests that some state of the WAL needs to be modified in thesyncmethod as well.src/service/ingestion/mod.rs (3)
407-409: LGTM!The changes introduce the necessary variables for the asynchronous processing:
cfgto store the configuration.tasksto store the spawned tasks.semaphoreto limit concurrent writes based on thecpu_numconfiguration.
414-438: Excellent use of asynchronous processing and semaphore!The changes introduce asynchronous processing by spawning a new task for each non-empty entry in
buf. The tasks are stored in thetasksvector for later collection.Each task acquires a permit from the semaphore before writing the entry, which effectively limits the number of concurrent writes based on the
cpu_numconfiguration. The permit is dropped after the write operation to allow other tasks to acquire it.The task returns a tuple of the number of records and the size of the records for statistics.
448-456: LGTM!The changes process the task results returned by
try_join_alland update the request statistics (req_stats) accordingly.For each successful task result (tuple of record count and size):
- The
sizefield ofreq_statsis incremented by the size of the records divided bySIZE_IN_MB.- The
recordsfield ofreq_statsis incremented by the number of records.If a task returned an error, the error is logged using
log::error!.src/config/src/config.rs (2)
826-827: LGTM!The new configuration field
wal_write_buffer_sizewith a default value of 16 KB looks good.
1628-1632: Good validation check!The validation to ensure
wal_write_buffer_sizeis not set below 4 KB is a good safeguard to prevent performance issues due to a very small write buffer.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Writerstruct to utilize a buffered writer for more efficient file operations.Tests
Writerunder various configurations.