fix: Prevent broken report_href links by using deterministic report IDs with fixed prefix length#3889
Conversation
with fixed prefix length
📝 WalkthroughWalkthroughReplaces adaptive ID-shortening with a fixed 16-character truncation for report file IDs, applies those shortened IDs to FileRecord entries, adds collision detection that raises a WorkflowError with guidance, and updates ReportHref URL generation and tests to use 16-char IDs. (≤50 words) Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/snakemake/report/__init__.py (1)
458-473: Simplify redundant ID slicing.Since
get_report_idreturns exactly 16 characters, the[:16]slice on line 465 is redundant. Simplify to[rec.id for rec in file_records].The
strict=Trueparameter forzip()requires Python 3.10+, but the project's minimum version is Python 3.7, so it cannot be used.Apply this diff:
- shortened_ids = [rec.id[:16] for rec in file_records] + shortened_ids = [rec.id for rec in file_records]
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/snakemake/common/__init__.py(1 hunks)src/snakemake/report/__init__.py(1 hunks)tests/test_report_href/expected-results/test2.html(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/common/__init__.pysrc/snakemake/report/__init__.py
🪛 Ruff (0.14.8)
src/snakemake/report/__init__.py
467-470: Avoid specifying long messages outside the exception class
(TRY003)
472-472: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (9, macos-latest, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (8, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: apidocs
🔇 Additional comments (2)
src/snakemake/common/__init__.py (1)
62-66: LGTM! Hash truncation aligns with report generation.The 16-character prefix provides 64 bits of entropy, which should be sufficient for most workflows. This change ensures consistency with the
shorten_idsfunction in the report module.tests/test_report_href/expected-results/test2.html (1)
7-8: Test expectations correctly reflect the 16-character hash fix.The shortened hash paths in the expected HTML output validate that
report_hrefnow generates links consistent with Snakemake's report storage paths.
collisions (rather than any duplicate ids).
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/snakemake/report/__init__.py(1 hunks)src/snakemake/script/__init__.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/script/__init__.pysrc/snakemake/report/__init__.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3159
File: docs/tutorial/interaction_visualization_reporting/workdir/workflow/Snakefile:1-1
Timestamp: 2024-10-20T19:42:16.694Z
Learning: The `report` directive at the workflow level is still valid in Snakemake and is not deprecated.
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
src/snakemake/script/__init__.py
🪛 Ruff (0.14.8)
src/snakemake/report/__init__.py
471-474: Avoid specifying long messages outside the exception class
(TRY003)
476-476: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (9, macos-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py313)
🔇 Additional comments (1)
src/snakemake/script/__init__.py (1)
89-103: LGTM!The 16-character truncation of
self._idcorrectly alignsReportHref.__str__with theshorten_idsfunction inreport/__init__.py, ensuring generated links match the actual stored report paths. This directly addresses the brokenreport_hreflinks described in issue #3867.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/snakemake/report/__init__.py (1)
465-475: Collision detection logic is now correct.The fix properly detects collisions caused by shortening by comparing the number of unique shortened IDs against unique full IDs. This addresses the issue from the previous review.
One minor lint suggestion per static analysis (Ruff B905): consider adding
strict=Trueto thezipcall for defensive coding, though both lists are guaranteed to have the same length here.- for rec, short_id in zip(file_records, shortened_ids): + for rec, short_id in zip(file_records, shortened_ids, strict=True): rec.id = short_id
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/report/__init__.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/report/__init__.py
🧠 Learnings (2)
📚 Learning: 2024-11-12T12:08:20.342Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.
Applied to files:
src/snakemake/report/__init__.py
📚 Learning: 2024-10-14T09:42:11.571Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3140
File: snakemake/dag.py:1308-1308
Timestamp: 2024-10-14T09:42:11.571Z
Learning: In `snakemake/dag.py`, avoid flagging missing lines or indentation issues when there is no clear syntax or logical error to prevent false positives.
Applied to files:
src/snakemake/report/__init__.py
🪛 Ruff (0.14.8)
src/snakemake/report/__init__.py
469-472: Avoid specifying long messages outside the exception class
(TRY003)
474-474: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (9, macos-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: apidocs
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/snakemake/report/__init__.py (1)
465-475: LGTM! Fixed-length shortening correctly implemented.The function now uses a deterministic 16-character prefix for all report IDs and includes collision detection with clear error guidance. The logic correctly identifies collisions introduced by shortening.
Optional refinements for clarity and defensive programming
Consider these minor improvements:
full_ids = [rec.id for rec in file_records] - shortened_ids = [rec.id[:FILE_HASH_PREFIX_LEN] for rec in file_records] + shortened_ids = [rec.id[:FILE_HASH_PREFIX_LEN] for rec in file_records] # We only need to check for collisions that appear because of the shortening - if len(set(shortened_ids)) != len(set(full_ids)): + if len(set(shortened_ids)) < len(set(full_ids)): raise WorkflowError( "Collision detected when shortening report file hashes to 16 characters. " "Please open an issue at https://github.com/snakemake/snakemake/issues/new to request a greater hash length." ) - for rec, short_id in zip(file_records, shortened_ids): + for rec, short_id in zip(file_records, shortened_ids, strict=True): rec.id = short_idChanges:
- Use
<instead of!=for more explicit intent (collision means fewer unique shortened IDs)- Add
strict=Truetozip()for defensive programming (satisfies B905 linter)Both changes are optional refinements—the current implementation is functionally correct.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/snakemake/report/__init__.py(2 hunks)src/snakemake/script/__init__.py(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/snakemake/script/init.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/report/__init__.py
🧠 Learnings (4)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
src/snakemake/report/__init__.py
📚 Learning: 2024-11-12T12:08:20.342Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.
Applied to files:
src/snakemake/report/__init__.py
📚 Learning: 2024-10-14T09:42:11.571Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3140
File: snakemake/dag.py:1308-1308
Timestamp: 2024-10-14T09:42:11.571Z
Learning: In `snakemake/dag.py`, avoid flagging missing lines or indentation issues when there is no clear syntax or logical error to prevent false positives.
Applied to files:
src/snakemake/report/__init__.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: Avoid adding input validation or error handling that unnecessarily complicates the code in the `snakemake` codebase, especially when the cases handled don't make sense.
Applied to files:
src/snakemake/report/__init__.py
🪛 Ruff (0.14.8)
src/snakemake/report/__init__.py
469-472: Avoid specifying long messages outside the exception class
(TRY003)
474-474: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (10, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: apidocs
🔇 Additional comments (1)
src/snakemake/report/__init__.py (1)
48-48: FILE_HASH_PREFIX_LEN import is valid.
FILE_HASH_PREFIX_LENis correctly defined insnakemake.scriptas16, and the import at line 48 is properly used in theshorten_idsfunction.
🤖 I have created a release *beep* *boop* --- ## [9.14.6](v9.14.5...v9.14.6) (2026-01-08) ### Bug Fixes * create local clone of git repos for source files from hosting providers ([#3643](#3643)) ([d2f8aba](d2f8aba)) * create potentially missing .snakemake folder in case of very long command lines for spawned jobs ([#3894](#3894)) ([4b431dd](4b431dd)) * make ilp solver enumeration lazy ([#3900](#3900)) ([30e1509](30e1509)) * Prevent broken report_href links by using deterministic report IDs with fixed prefix length ([#3889](#3889)) ([6d8f4d8](6d8f4d8)) * refactor LoggerManager setup and scope ([#3851](#3851)) ([f46d904](f46d904)) * yield proper error message in case a local git source file is not retrievable ([#3892](#3892)) ([ed79cae](ed79cae)) ### Documentation * explain how to pass nested config via CLI ([#3885](#3885)) ([9d8c539](9d8c539)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
This PR makes snakemake shorten report hashes to 16 characters and also makes
report_href()use the 16 character long prefix which was not done before. Collisions are detected when shortening IDs and an error is raised directing users to open an issue to request a longer hash length. Fixes #3867.Summary by CodeRabbit
New Features
Bug Fixes
Style
✏️ Tip: You can customize this high-level summary in your review settings.