-
Notifications
You must be signed in to change notification settings - Fork 715
chore: Update canonnicalization step for scan_files_filtered to be more async #8364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…c instead of blocking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR improves async performance by replacing synchronous Path::canonicalize() with asynchronous tokio::fs::canonicalize() in the scan_files_filtered function within src/config/src/utils/async_file.rs. The change addresses a performance bottleneck where blocking path canonicalization operations were negating the benefits of async file scanning.
The refactor separates the file collection phase from the canonicalization phase. Instead of canonicalizing paths synchronously during iteration, the function now:
- Collects all uncanonicalized paths first into a
Vec<PathBuf> - Performs all canonicalization operations concurrently using
futures::future::join_all()withtokio::fs::canonicalize() - Processes the results, filtering out any canonicalization failures
This approach maintains the same functionality while being more efficient in async contexts by avoiding thread pool blocking operations and leveraging concurrent processing of canonicalization tasks. The change fits well with the codebase's async-first architecture and improves resource utilization during file system operations.
Confidence score: 3/5
- This PR introduces a potential issue with silent error handling that could lead to incomplete results without indication
- Score reflects the performance improvement benefits but concerns about error handling robustness in a critical file operation
- Pay close attention to the error handling logic in
async_file.rsaround lines 213-217
1 file reviewed, no comments
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
|
@yaga-simha Please also move |
PR Code Suggestions ✨Explore these optional code suggestions:
|
I thought about this before sending the PR. I agree with you. But this helper should belong in a general purpose file_path utility as this is not an async specific helper. In fact, it has more to do with wal paths than even utils/files too. Please suggest a better place. |
503d07c to
c99240b
Compare
c99240b to
fb169f8
Compare
User description
PR type
Chore / Code improvement
Description
We are addressing Greptile's complaint about Path::canonnicalize negating the async benefits and suggested tokio::fs::cannonicalize. Now all path canonnicalizations are performed asynchronously.
create_wal_dir_datetime_filter is now refactored into async_file.rs where its companion function scan_file_filtered() exists.
PR Type
Enhancement
Description
Switch to async canonicalization with Tokio
Defer canonicalization after filtering files
Batch resolve paths via join_all
Preserve limit before async canonicalization
Diagram Walkthrough
File Walkthrough
async_file.rs
Replace blocking canonicalize with async batched canonicalizesrc/config/src/utils/async_file.rs
PathBuf::canonicalizein walker.futures::future::join_allwithtokio::fs::canonicalize.