-
Notifications
You must be signed in to change notification settings - Fork 715
feat: adding built-in regex patterns #8827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PR Reviewer Guide 🔍(Review updated until commit 1a868b7)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 1a868b7
Previous suggestionsSuggestions up to commit 96f7e5f
|
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 364 | 344 | 0 | 19 | 1 | 95% | 4m 38s |
|
Persistent review updated to latest commit 1a868b7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Summary
This PR introduces a built-in regex patterns feature that fetches curated patterns from GitHub (PyWhat project by default) with caching, search, and filtering capabilities.
Key Changes
- GitHub Data Service: New generic service (
src/service/github/) for fetching data from GitHub with caching (1-hour TTL, 100MB max), retry logic (3 attempts with exponential backoff), and rate-limit handling - Pattern Adapter: Flexible
PyWhatAdaptersupports multiple JSON formats (PyWhat, custom, minimal) with field aliases for compatibility - Built-in Patterns API: New
GET /api/{org}/re_patterns/built-inendpoint with search, tag filtering, and force refresh options - Duplicate Name Handling: Pattern creation now auto-resolves duplicate names by appending numeric suffixes ("Pattern (2)", "Pattern (3)", etc.) with max 100 attempts
- UI Components: New
BuiltInPatternsTab.vuewith search, tag filtering, preview dialog, and batch import capabilities
Architecture Integration
The implementation follows a clean layered architecture: UI → API → Adapter → Service → Cache → External Source. The GitHub service is generic and reusable for future GitHub-based data fetching needs. The caching layer prevents excessive API calls and respects GitHub rate limits.
Issues Found
- Critical:
client.rs:47usesexpect()which can panic during HTTP client creation (violates custom rule b18dc58c)
Confidence Score: 4/5
- This PR is safe to merge after fixing the expect() panic issue
- The implementation is well-structured with comprehensive error handling, caching, and tests. The only critical issue is the use of
expect()in client.rs line 47 which violates custom rules and can cause panics. The duplicate name handling is clever, the caching layer is robust, and the frontend integration is clean. All copyright headers are present. - src/service/github/client.rs requires fixing the expect() call on line 47
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| src/service/github/client.rs | 4/5 | New GitHub data service with caching and retry logic; contains an expect call that violates custom rules |
| src/handler/http/request/re_pattern/mod.rs | 5/5 | New built-in patterns API endpoint with search/tag filtering and improved duplicate name handling in save endpoint |
| src/service/db/re_pattern.rs | 5/5 | Auto-resolves duplicate pattern names with numeric suffixes (e.g., "Pattern (2)"); clean retry logic with max 100 attempts |
| web/src/components/settings/BuiltInPatternsTab.vue | 5/5 | New Vue component for browsing and importing built-in patterns; includes search, tag filtering, preview dialog |
Sequence Diagram
sequenceDiagram
participant UI as BuiltInPatternsTab.vue
participant API as GET /api/{org}/re_patterns/built-in
participant Adapter as PyWhatAdapter
participant Service as GitHubDataService
participant Cache as CacheManager
participant GitHub as GitHub Raw URL
UI->>API: Request patterns (search, tags, force_refresh)
API->>Service: Create GitHubDataService
alt force_refresh = true
API->>Service: invalidate_cache(url)
Service->>Cache: Remove cached entry
end
API->>Adapter: fetch_built_in_patterns()
Adapter->>Service: fetch_json(config.regex_patterns_source_url)
Service->>Cache: get(cache_key)
alt Cache hit and not expired
Cache-->>Service: Return cached data
else Cache miss or expired
Service->>Service: fetch_with_retry(url)
loop Retry logic (max 3 attempts)
Service->>GitHub: HTTP GET request
GitHub-->>Service: JSON response
end
Service->>Cache: set(cache_key, data, TTL=3600s)
end
Service-->>Adapter: Parsed GenericPattern[]
Adapter->>Adapter: Transform to BuiltInPatternResponse[]
Adapter-->>API: Return patterns
API->>Adapter: filter_by_search(patterns, query)
API->>Adapter: filter_by_tags(patterns, tags)
API-->>UI: Return filtered patterns with metadata
UI->>UI: Display patterns with search/filter UI
alt User imports pattern
UI->>UI: Emit import-patterns event
UI->>API: POST /api/{org}/re_patterns
API->>DB: add(PatternEntry)
alt Duplicate name detected
DB-->>API: UniqueViolation error
API->>API: Append suffix (e.g., "Name (2)")
API->>DB: Retry with new name
end
DB-->>API: Created pattern with final name
API-->>UI: Success with final name
end
16 files reviewed, 1 comment
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 364 | 344 | 0 | 19 | 1 | 95% | 4m 45s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 364 | 344 | 0 | 19 | 1 | 95% | 4m 41s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 364 | 343 | 0 | 19 | 2 | 94% | 4m 38s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 364 | 342 | 0 | 19 | 3 | 94% | 4m 39s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 364 | 343 | 0 | 19 | 2 | 94% | 4m 37s |
a4ce7d8 to
ccbedbf
Compare
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 5m 17s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 342 | 0 | 19 | 4 | 94% | 5m 12s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 5m 12s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 5m 14s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 342 | 0 | 19 | 4 | 94% | 4m 37s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 342 | 0 | 19 | 4 | 94% | 4m 38s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 341 | 0 | 19 | 5 | 93% | 4m 40s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 345 | 0 | 19 | 1 | 95% | 4m 38s |
| log::warn!("Cache full, clearing oldest entries"); | ||
| cache.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not relevant now, but this will flush all the cache not just the oldest entries till we meet size requirements.
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 4m 38s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 4m 39s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 342 | 0 | 19 | 4 | 94% | 4m 38s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 366 | 344 | 0 | 19 | 3 | 94% | 4m 40s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 376 | 353 | 0 | 21 | 2 | 94% | 4m 31s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 376 | 353 | 0 | 21 | 2 | 94% | 4m 31s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 376 | 354 | 0 | 21 | 1 | 94% | 4m 32s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 376 | 352 | 0 | 21 | 3 | 94% | 4m 32s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 376 | 354 | 0 | 21 | 1 | 94% | 4m 33s |
PR Type
Enhancement, Documentation
Description
Add built-in regex patterns fetch API
Introduce GitHub data service with caching
Handle duplicate pattern names gracefully
UI to browse and import built-ins
Diagram Walkthrough
flowchart LR cfg["Config: regex_patterns_source_url"] -- used by --> ghsvc["GitHubDataService"] ghsvc -- fetch_json/cache --> adapter["PyWhatAdapter.fetch_built_in_patterns"] adapter -- returns --> api["GET /api/{org}/re_patterns/built-in"] api -- consumed by --> websvc["web regexPatterns.getBuiltInPatterns()"] websvc -- powers --> ui["BuiltInPatternsTab.vue"] ui -- import selected --> importFlow["ImportRegexPattern.vue"] importFlow -- create --> dbsvc["db::re_pattern::add (duplicate-safe)"]File Walkthrough
3 files
Add source URL for built-in regex patternsRegister built-in patterns and reorder routesExport github service module10 files
New endpoint to fetch built-in patterns and improved save responseAdd duplicate-safe creation with auto-suffix namesAdd adapters module export for pattern formatsImplement generic pattern adapter and filtersIntroduce in-memory cache with eviction and statsAdd GitHub data client with retry and cachingDefine service config, errors, and cache typesAdd client method to fetch built-in patternsNew UI to search, filter, preview, import patternsIntegrate built-in tab and import flow updates3 files
Expose GitHub service modules and typesDocument regex patterns source URL env varAdd i18n strings for built-in patterns UI