improved file indexing logic for faster results #30
Pull request overview
This PR updates the file indexing implementation to traverse directories in parallel (with inline pattern matching) to improve indexing speed, and adds a test that exercises parallel traversal behavior.
Changes:
- Reworked `BuildIndex` to remove the worker/channel pipeline and perform inline matching during traversal.
- Added parallel subdirectory traversal using goroutines gated by a semaphore.
- Added a new test to validate results on a wide directory tree.
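The traversal strategy described above can be sketched as follows. This is a minimal illustration rather than the PR's code: `walk`, `findMatches`, `result`, and `demo` are hypothetical names, and the non-blocking acquire with a fallback to inline recursion is one common way to keep a semaphore-gated recursive walk from deadlocking; the PR may handle this differently.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"sync"
)

// result collects matched paths; the mutex guards appends from many goroutines.
type result struct {
	mu    sync.Mutex
	paths []string
}

func (r *result) add(p string) {
	r.mu.Lock()
	r.paths = append(r.paths, p)
	r.mu.Unlock()
}

// walk is a hypothetical stand-in for the PR's walkDirectory.
func walk(ctx context.Context, dir, pattern string, sem chan struct{}, wg *sync.WaitGroup, res *result) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return // this sketch silently skips unreadable directories
	}
	for _, e := range entries {
		if ctx.Err() != nil {
			return // stop promptly once the context is cancelled
		}
		full := filepath.Join(dir, e.Name())
		if !e.IsDir() {
			// Inline matching: no hand-off to a worker pool.
			if ok, _ := filepath.Match(pattern, e.Name()); ok {
				res.add(full)
			}
			continue
		}
		select {
		case sem <- struct{}{}: // a slot is free: walk the subdirectory concurrently
			wg.Add(1)
			go func() {
				defer wg.Done()
				defer func() { <-sem }()
				walk(ctx, full, pattern, sem, wg, res)
			}()
		default: // no slot: recurse in this goroutine instead of blocking
			walk(ctx, full, pattern, sem, wg, res)
		}
	}
}

// findMatches walks root with at most limit extra goroutines and returns sorted matches.
func findMatches(ctx context.Context, root, pattern string, limit int) []string {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	res := &result{}
	walk(ctx, root, pattern, sem, &wg, res)
	wg.Wait()
	sort.Strings(res.paths)
	return res.paths
}

// demo builds a small temporary tree and returns how many .env files were found.
func demo() (int, error) {
	root, err := os.MkdirTemp("", "walk-sketch-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	for i := 0; i < 8; i++ {
		dir := filepath.Join(root, fmt.Sprintf("d%d", i), "nested")
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return 0, err
		}
		if err := os.WriteFile(filepath.Join(dir, ".env"), []byte("K=V\n"), 0o600); err != nil {
			return 0, err
		}
		if err := os.WriteFile(filepath.Join(dir, "README.md"), nil, 0o600); err != nil {
			return 0, err
		}
	}
	return len(findMatches(context.Background(), root, ".env", 4)), nil
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println("matched files:", n)
}
```

Because a goroutine that cannot get a slot keeps walking on its own stack, no goroutine ever waits on the semaphore while holding a slot, which is the situation that would otherwise stall a deep tree.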
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/fileindex/fileindex.go | Replaces worker-based processing with inline matching and parallel directory walkers. |
| pkg/fileindex/fileindex_test.go | Adds a test that builds a wide directory tree and asserts all expected .env files are indexed. |
Pull request overview
This PR refactors the file indexing implementation to improve indexing speed by parallelizing directory traversal and doing inline pattern matching during the walk, and adds a test to exercise the new traversal strategy.
Changes:
- Replace the worker-pool + channel pipeline with goroutine-per-subdirectory traversal gated by a semaphore.
- Perform pattern matching inline during traversal rather than dispatching discovered files to workers.
- Add a new test that builds a wide directory tree and asserts the index finds all expected `.env` files.
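For orientation, a wide-tree fixture of the kind this test relies on might look like the sketch below. All names here (`makeWideTree`, `findEnvFiles`, `checkWideTree`) are hypothetical and not taken from the PR; the finder is a plain sequential `filepath.WalkDir` used as a reference, so any indexer with the same function shape could be swapped in.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// makeWideTree creates width sibling directories under root, each holding one
// .env file and one unrelated file, and returns the sorted expected .env paths.
func makeWideTree(root string, width int) ([]string, error) {
	want := make([]string, 0, width)
	for i := 0; i < width; i++ {
		dir := filepath.Join(root, fmt.Sprintf("project-%03d", i))
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, err
		}
		env := filepath.Join(dir, ".env")
		if err := os.WriteFile(env, []byte("KEY=value\n"), 0o600); err != nil {
			return nil, err
		}
		if err := os.WriteFile(filepath.Join(dir, "notes.txt"), nil, 0o600); err != nil {
			return nil, err
		}
		want = append(want, env)
	}
	sort.Strings(want)
	return want, nil
}

// findEnvFiles is a sequential reference finder (an assumption for this sketch,
// standing in for whatever indexer is under test).
func findEnvFiles(root string) ([]string, error) {
	var got []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && d.Name() == ".env" {
			got = append(got, p)
		}
		return nil
	})
	sort.Strings(got)
	return got, err
}

// checkWideTree builds the fixture, runs find, and compares sorted results.
func checkWideTree(width int, find func(root string) ([]string, error)) error {
	root, err := os.MkdirTemp("", "wide-tree-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	want, err := makeWideTree(root, width)
	if err != nil {
		return err
	}
	got, err := find(root)
	if err != nil {
		return err
	}
	if len(got) != len(want) {
		return fmt.Errorf("found %d files, want %d", len(got), len(want))
	}
	for i := range want {
		if got[i] != want[i] {
			return fmt.Errorf("entry %d: got %q, want %q", i, got[i], want[i])
		}
	}
	return nil
}

func main() {
	if err := checkWideTree(200, findEnvFiles); err != nil {
		fmt.Println("FAIL:", err)
		os.Exit(1)
	}
	fmt.Println("ok")
}
```

Sorting both slices before comparing matters for a parallel walker, since the order in which goroutines record matches is not deterministic.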
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/fileindex/fileindex.go | Refactors BuildIndex to parallelize traversal with a semaphore and match files inline (removes errgroup/worker pipeline). |
| pkg/fileindex/fileindex_test.go | Adds a parallel traversal test that builds a wide directory structure and validates expected matches. |
Pull request overview
This PR updates the file indexing implementation to perform inline matching during directory traversal and introduces a new test aimed at exercising parallel traversal behavior.
Changes:
- Replaced the worker/channel + `errgroup` indexing pipeline with inline matching during recursive traversal.
- Added goroutine-per-subdirectory traversal gated by a semaphore, plus a new test for parallel traversal.
- Updated the context-cancellation test expectations to align with the new return behavior (index returned even on cancellation).
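The "index returned even on cancellation" behavior could look roughly like the sketch below. The `Index` type, `buildIndex` signature, and `cancelledRun` helper are assumptions made for illustration; in particular, whether the real `BuildIndex` also returns an error on cancellation is not stated here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Index is a hypothetical, simplified result type.
type Index struct {
	Paths []string
}

// buildIndex records candidates until the context is cancelled. It always
// returns a non-nil index, so callers can use whatever was collected so far.
func buildIndex(ctx context.Context, candidates []string) (*Index, error) {
	idx := &Index{}
	for _, c := range candidates {
		if err := ctx.Err(); err != nil {
			return idx, err // partial (possibly empty) index plus the cancellation cause
		}
		idx.Paths = append(idx.Paths, c)
	}
	return idx, nil
}

// cancelledRun calls buildIndex with a context that is already cancelled.
func cancelledRun() (*Index, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return buildIndex(ctx, []string{"a/.env", "b/.env"})
}

func main() {
	idx, err := cancelledRun()
	fmt.Println("index is non-nil:", idx != nil)
	fmt.Println("cancelled:", errors.Is(err, context.Canceled))
}
```

A test written against this shape would assert on the returned index rather than expecting `nil`, which matches the adjusted expectations described in the bullet above.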
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/fileindex/fileindex.go | Switches indexing to inline matching and introduces semaphore-gated parallel directory walking; updates cancellation/error behavior. |
| pkg/fileindex/fileindex_test.go | Adjusts cancellation test assertions and adds a new parallel traversal test case. |
Pull request overview
This PR updates the fileindex package to speed up indexing by removing the worker/channel pipeline and instead performing pattern matching inline while traversing directories in parallel (goroutine-per-subdirectory with a semaphore). It also expands test coverage to exercise the new traversal strategy.
Changes:
- Replace errgroup + worker pool indexing with inline matching during directory traversal.
- Add semaphore-gated parallel subdirectory traversal (including for symlinked directories when enabled).
- Update/add tests for cancellation behavior and parallel traversal coverage.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/fileindex/fileindex.go | Reworks indexing to inline matching and introduces semaphore-gated parallel directory walking. |
| pkg/fileindex/fileindex_test.go | Adds a parallel traversal test and adjusts cancellation assertions to align with new behavior. |
This pull request refactors the file indexing logic in `pkg/fileindex/fileindex.go` to improve concurrency, simplify the code, and enable inline file matching during directory traversal. The worker-based pattern is replaced with a goroutine-per-subdirectory traversal (GDU-style), resulting in more efficient and scalable directory scanning. Several related tests have also been updated and expanded.

Concurrency and traversal refactor:

- The `runWorker` and `fileEntry` structures were removed, and file matching is now performed directly within the traversal logic. ([[1]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL101-L106),[[2]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL163-R166),[[3]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL205-R234),[[4]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL301-R326))
- The `walkDirectory` function was rewritten to traverse directories in parallel, match files inline, and handle cancellation responsively. Subdirectories are now processed by spawning goroutines gated by a semaphore, avoiding deadlocks and maximizing CPU utilization. ([[1]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL205-R234),[[2]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL272-R256),[[3]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL301-R326))

Simplification and cleanup:

- Removed `errgroup` and simplified error handling and progress reporting, making cancellation and completion logic more straightforward. ([[1]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL18),[[2]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL127-L138),[[3]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL163-R166))
- `BuildIndex` and `runDiscovery` now perform inline matching. ([[1]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL116-R109),[[2]](https://github.com/boostsecurityio/bagel/pull/30/files#diff-e6bfd96a585bbbdffc4b4e1d19ead84777166e2e2a29d4b888cb35293dcdb59eL205-R234))

Testing improvements:

- Added `TestBuildIndex_ParallelTraversal` to verify that the new goroutine-per-subdirectory traversal correctly finds all files in a wide directory tree, and updated cancellation tests to ensure proper handling of context cancellation. ([pkg/fileindex/fileindex_test.goL565-R615](https://github.com/boostsecurityio/bagel/pull/30/files#diff-5c2276b5dbb6e0f6c0161f7738778a610d4c46c60467efa6c97f30ccbdde79b1L565-R615))

These changes result in a more efficient, scalable, and maintainable file indexing implementation.