Update WM archiver script: add optional target parameter#3221
Update WM archiver script: add optional target parameter#3221arkid15r merged 5 commits intovacanza:devfrom
Conversation
|
Caution Review failedFailed to post review comments Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. Walkthroughscripts/archive_links.py now accepts an optional positional Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This pull request adds an optional path argument to the archive_links.py script, enabling users to target specific files or directories for link archiving instead of scanning the entire holidays library. This addresses issue #3220 which requested the ability to run the link archiver on a single file for faster verification during country-specific updates.
Changes:
- Added optional positional
pathargument to the CLI - Implemented logic to handle single file, directory, or default (all holidays) scanning modes
- Maintained backward compatibility by defaulting to full library scan when no path is provided
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
scripts/archive_links.py
Outdated
| if os.path.isfile(target_path): | ||
| print(f"Targeting single file: {target_path}") | ||
| urls = find_hyperlinks_in_file(target_path) | ||
| if urls: | ||
| files_to_urls_data = {target_path: urls} |
There was a problem hiding this comment.
When targeting a single file, the code should validate that the file has one of the allowed extensions defined in EXTENSIONS_TO_SCAN. Currently, scan_directory_for_links filters files by extension, but this single-file path bypasses that validation. This could lead to processing files with unsupported extensions or inconsistent behavior compared to directory scanning. Consider adding a check to ensure the file ends with one of the allowed extensions before processing.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev #3221 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 307 307
Lines 18309 18309
Branches 2330 2330
=========================================
Hits 18309 18309 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Actually, this script needs more refactoring later (use |
|
@KJhellico sir, I’ve removed the extra comments and print statements ! |
PPsyrius
left a comment
There was a problem hiding this comment.
LGTM 🛠️
I tried this script earlier and it works perfectly
8412b26
|



Proposed change
Closes #3220
I've updated
scripts/archive_links.pyto accept an optionalpathargument.Right now, the script scans the whole
holidaysfolder by default, which can take a while if we just want to verify references for a specific country file. With this update, you can point the archiver to a single file or sub-directory (e.g.,python scripts/archive_links.py holidays/countries/estonia.py). If no path is provided, it still defaults to scanning everything, so existing workflows won't break.Note on rate limits: I skipped adding a sleep timer to prioritize speed. If you'd prefer I add a delay to avoid potential API 429 errors, just let me know and I'll add it.
Type of change
holidaysfunctionality in general)Checklist
make checklocally; all checks and tests passed.