Add support for multiple document paths with glob patterns #2
This enhancement allows users to specify multiple document directories
and glob patterns in config.json, providing more flexibility in
organizing and indexing markdown files.
Features:
- Config: Changed from single 'documents_dir' to 'document_patterns' array
- Supports both directory paths and glob patterns (e.g., "./docs/**/*.md")
- Backwards compatible: old 'documents_dir' automatically migrated
- Sync: Uses pattern matching to find files instead of single directory walk
- MCP tools: Updated path validation to work with multiple base directories
- Tests: Added comprehensive tests for pattern expansion and migration
- Documentation: Updated README with examples and pattern syntax
Example config.json:
{
  "document_patterns": [
    "./documents",
    "./notes/**/*.md",
    "./projects/backend/**/*.md"
  ],
  ...
}
Pattern examples:
- "./documents" - All .md files in directory
- "./docs/**/*.md" - Recursive search with **
- "./projects/*/docs/*.md" - Wildcard patterns
- "/path/to/external/docs" - Absolute paths
tomohiro-owada
left a comment
Hi, thank you so much for this PR! The implementation of multiple document paths with glob patterns is really well done.
I made a small follow-up commit to fix a couple of minor issues:
- Removed unused import: path/filepath was imported but not used in internal/indexer/sync.go
- Fixed integration tests: TestEndToEnd_Sync and TestEndToEnd_EmptyDirectory were still using the deprecated cfg.DocumentsDir instead of the new
cfg.DocumentPatterns, which caused them to read from the actual ./documents directory rather than the test's temp directory
With these fixes, all tests are now passing. Great work on the backward compatibility migration and the glob pattern expansion logic!
PR Type
Enhancement
Description
- Replace single documents_dir with document_patterns array for flexible multi-location indexing
- Support glob patterns including ** for recursive directory matching
- Implement backwards compatibility with automatic migration from old config format
- Add comprehensive pattern expansion logic with deduplication across multiple patterns
- Update path validation in MCP tools to work with multiple base directories
- Enhance test coverage with pattern expansion and migration scenarios
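To make the path validation point concrete, the following Go sketch shows one way a path could be checked against several base directories. The function name withinAnyBase is hypothetical and the PR's validatePath() may work differently. The sketch compares cleaned absolute paths and does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// withinAnyBase reports whether target lies inside at least one of baseDirs.
// Hypothetical helper; not taken from the PR.
func withinAnyBase(target string, baseDirs []string) (bool, error) {
	absTarget, err := filepath.Abs(target)
	if err != nil {
		return false, err
	}
	for _, base := range baseDirs {
		absBase, err := filepath.Abs(base)
		if err != nil {
			return false, err
		}
		rel, err := filepath.Rel(absBase, absTarget)
		if err != nil {
			continue // no relative path exists, e.g. different volumes on Windows
		}
		// A relative path that climbs out of base is ".." or starts with "../".
		if rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	bases := []string{"./documents", "./notes"}
	for _, p := range []string{"./notes/todo.md", "./documents/../secrets.md"} {
		ok, err := withinAnyBase(p, bases)
		fmt.Println(p, ok, err)
	}
}
```

Using filepath.Rel instead of a string prefix check avoids accepting a sibling such as "./notes-private" when "./notes" is a base directory.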
File Walkthrough
main.go
Update main to support multiple document directories (cmd/main.go)
- DocumentPatterns array instead of single DocumentsDir
- ... directories from GetBaseDirectories()
- ... failures
config.go
Add glob pattern support with backwards compatibility (internal/config/config.go)
- DocumentPatterns field as array and deprecated DocumentsDir field
- GetDocumentFiles() to expand all patterns and return deduplicated markdown files
- expandPattern() to handle both directory paths and glob patterns
- expandDoubleStarPattern() for recursive ** pattern matching
- GetBaseDirectories() to extract base directories from patterns for path validation
- ... Load() function
- Validate() to ensure at least one document pattern is configured
sync.go
Use pattern-based file discovery instead of directory walk (internal/indexer/sync.go)
- filepath.Walk() on single directory with GetDocumentFiles() call
- ... config
- ... access errors
tools.go
Update path validation for multiple base directories (internal/mcp/tools.go)
- validatePath() to accept multiple base directories array
- ... base directory
- handleDeleteDocument() and handleReindexDocument() to use full file paths
- handleIndexMarkdown(), handleAddFrontmatter(), and handleUpdateFrontmatter() to use GetBaseDirectories()
config_test.go
Add comprehensive tests for pattern expansion and migration (internal/config/config_test.go)
- DocumentPatterns array instead of DocumentsDir
- TestLoadConfig_BackwardsCompatibility() to verify migration from old format
- TestGetDocumentFiles() with multiple test cases for pattern expansion
- TestGetBaseDirectories() to validate base directory extraction
- contains() and containsMiddle() for flexible string matching
README.md
Document glob pattern support and configuration examples (README.md)
- document_patterns array with multiple patterns
- ... glob patterns
- ... documents_dir field