fix: create local clone of git repos for source files from hosting providers#3643
fix: create local clone of git repos for source files from hosting providers#3643johanneskoester merged 27 commits intomainfrom
Conversation
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the 📝 WalkthroughWalkthroughAdds git-backed hosting support and extends SourceFile APIs (check, format, endswith, replace_suffix); introduces HostedGitRepo and HostingProviderFile with auth/cache_path/checkout/open; propagates cache_path through SourceCache; updates conda env spec APIs and callers; centralizes I/O path checks. Changes
Sequence DiagramsequenceDiagram
participant Workflow
participant SourceCache
participant HostingProviderFile
participant HostedGitRepo
participant RemoteGit
Workflow->>SourceCache: request cache entry for source_file
SourceCache->>HostingProviderFile: set cache_path (if hosted)
Workflow->>HostingProviderFile: open() / get_path_or_uri()
HostingProviderFile->>HostedGitRepo: ensure repo checkout / fetch(ref)
HostedGitRepo->>RemoteGit: clone or fetch from remote host
RemoteGit-->>HostedGitRepo: repository data
HostedGitRepo-->>HostingProviderFile: provide checked-out file path
HostingProviderFile-->>Workflow: return file handle or path
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)
211-218: Consider cleanup on move failureIf
shutil.movefails, the temporary directory might not be cleaned up properly since it was created with a custom prefix.Consider wrapping the move operation in a try-except to ensure cleanup:
with tempfile.TemporaryDirectory(prefix=f"{self.repo_clone}.") as tmpdir: # the clone is not atomic, hence we do that in a temporary directory Repo.clone_from(repo_url, to_path=tmpdir) - # move is atomic, so we can safely move the directory to the final - # location - shutil.move(tmpdir, self.repo_clone) + # move is atomic, so we can safely move the directory to the final + # location + try: + shutil.move(tmpdir, self.repo_clone) + except Exception: + # Ensure cleanup even if move fails + raise
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/snakemake/sourcecache.py(10 hunks)src/snakemake/workflow.py(3 hunks)tests/tests.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
tests/tests.pysrc/snakemake/workflow.pysrc/snakemake/sourcecache.py
🪛 Ruff (0.11.9)
src/snakemake/sourcecache.py
31-31: Undefined name TYPE_CHECKING
(F821)
381-385: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🪛 Pylint (3.3.7)
src/snakemake/sourcecache.py
[error] 31-31: Undefined variable 'TYPE_CHECKING'
(E0602)
[refactor] 232-238: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[refactor] 279-279: Too many instance attributes (8/7)
(R0902)
[refactor] 284-284: Too many arguments (8/5)
(R0913)
[refactor] 284-284: Too many positional arguments (8/5)
(R0917)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: tests (10, macos-latest, py312)
- GitHub Check: tests (10, windows-latest, py312)
- GitHub Check: tests (9, macos-latest, py312)
- GitHub Check: tests (6, windows-latest, py312)
- GitHub Check: tests (9, windows-latest, py312)
- GitHub Check: tests (8, windows-latest, py312)
- GitHub Check: tests (7, windows-latest, py312)
- GitHub Check: tests (4, windows-latest, py312)
- GitHub Check: tests (4, macos-latest, py312)
- GitHub Check: tests (3, windows-latest, py312)
- GitHub Check: tests (5, windows-latest, py312)
- GitHub Check: tests (1, windows-latest, py312)
- GitHub Check: tests (2, windows-latest, py312)
🔇 Additional comments (8)
src/snakemake/sourcecache.py (4)
242-278: Well-structured checkout managementThe implementation properly handles different checkout scenarios and includes good error recovery with fetch retry. The staleness tracking and assertion in
source_pathhelp catch programming errors early.
327-333: Good validation for host parameterThe validation ensures the host is provided as a domain name without protocol prefix, which prevents URL construction errors later. The error message provides clear context about what went wrong.
342-350: Proper cache path managementThe implementation correctly manages cache paths with appropriate assertions and directory structure. The integration with SourceCache ensures the cache path is set before use.
Also applies to: 524-526
430-434: Clean authentication implementationThe auth properties provide a clean abstraction for authentication tokens, correctly formatting them with "@" suffix for URL construction.
Also applies to: 443-447
tests/tests.py (1)
2328-2331: New test for GitHub module importsThe test follows the established pattern and appropriately tests the new functionality for importing Python modules from GitHub-hosted sources.
src/snakemake/workflow.py (3)
124-124: LGTM: Import addition supports new functionality.The
HostingProviderFileimport is correctly added to support the hosted git repository feature being introduced.
582-584: LGTM: Proper cache path setup for hosted git repositories.The cache path assignment ensures that
HostingProviderFileinstances use the workflow's source cache, enabling proper caching of hosted git repositories.
610-615: LGTM: Extends module import support to hosted git repositories.The modification correctly extends the existing
sys.pathinsertion logic to includeHostingProviderFileinstances, enabling Python module imports from hosted git repositories while maintaining consistency withLocalSourceFilehandling.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/snakemake/sourcecache.py (1)
377-386: Preserve exception chain for better debugging.When re-raising exceptions, use
raise ... from eto preserve the original exception chain.def _checkout(self): try: return self.hosted_repo.checkout(self.branch, self.commit or self.tag) except Exception as e: raise WorkflowError( "Failed to clone/checkout git repository " f"{self.host}/{self.repo} at {self.branch or self.commit or self.tag}.", - e, - ) + ) from e
🧹 Nitpick comments (2)
src/snakemake/sourcecache.py (2)
190-240: Well-designed git repository management class.The
HostedGitRepoclass provides solid functionality with atomic cloning operations and proper caching. The design prevents repository corruption through temporary directory usage and implements efficient checkout management.Consider these improvements:
- More specific exception handling in clone operation - The current broad exception handling in checkout could mask specific git errors
- Explicit cleanup documentation - While
tempfile.TemporaryDirectoryhandles cleanup automatically, documenting the atomic operation guarantees would help maintainability
242-277: Solid checkout management implementation.The
HostedGitRepoCheckoutclass effectively manages git checkouts with proper branch updating logic and stale state tracking.Consider improving exception handling specificity.
The broad exception handling at line 261 could mask important git errors. Consider catching specific git exceptions instead:
- except Exception: + except git.exc.GitCommandError:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/sourcecache.py(10 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
src/snakemake/sourcecache.py
🪛 Pylint (3.3.7)
src/snakemake/sourcecache.py
[refactor] 232-238: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[refactor] 279-279: Too many instance attributes (8/7)
(R0902)
[refactor] 284-284: Too many arguments (8/5)
(R0913)
[refactor] 284-284: Too many positional arguments (8/5)
(R0917)
🪛 Ruff (0.11.9)
src/snakemake/sourcecache.py
381-385: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
⏰ Context from checks skipped due to timeout of 90000ms (35)
- GitHub Check: tests (7, windows-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (10, windows-latest, py312)
- GitHub Check: tests (9, windows-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (5, windows-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (9, macos-latest, py312)
- GitHub Check: tests (8, windows-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (2, windows-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (4, macos-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (5, macos-latest, py312)
- GitHub Check: tests (4, windows-latest, py312)
- GitHub Check: tests (3, windows-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (3, macos-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py312)
- GitHub Check: tests (1, windows-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py311)
🔇 Additional comments (11)
src/snakemake/sourcecache.py (11)
12-12: Import addition looks good.This addresses the missing
TYPE_CHECKINGimport flagged in previous reviews.
31-34: TYPE_CHECKING block properly implemented.The conditional import block for git module is correctly structured to avoid runtime import costs while providing type hints.
286-293: Constructor enhancement looks good.The addition of
cache_pathparameter and improved type hints enhance the class functionality. The optional parameters maintain backward compatibility while enabling better dependency injection.The pylint warnings about argument count are acceptable given the functional requirements.
327-333: Excellent input validation for host format.This validation prevents common user errors and provides clear guidance on the expected host format. The early validation approach follows fail-fast principles.
337-361: Well-designed property implementations.The properties provide clean abstractions for authentication, cache management, and lazy repository initialization. The cache path setter automatically handling the subdirectory is a nice touch for encapsulation.
369-376: Good delegation to checkout system.The methods properly delegate path and modification time queries to the checkout system, maintaining clean separation of concerns.
399-400: Constructor calls properly updated.The addition of
cache_pathparameter maintains consistency with the updated constructor signature.Also applies to: 415-416
425-434: Clean refactoring of GitHub authentication.The refactoring properly delegates authentication to the base class while maintaining GitHub-specific token handling. The removal of redundant methods improves maintainability.
442-446: Consistent GitLab authentication refactoring.The refactoring mirrors the GitHub implementation, maintaining consistency while providing GitLab-specific token handling.
491-491: Improved cache directory naming.The change to "snakemake-runtime-cache" provides clearer identification and reduces potential conflicts with other tools.
524-526: Proper cache path initialization.Setting the cache_path on HostingProviderFile instances ensures proper initialization before caching operations. This enables the dependency injection pattern used by the new git repository management system.
|
Please format your code with |
until we get something better, e.g. from snakemake/snakemake#3643
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/snakemake/sourcecache.pysrc/snakemake/workflow.pytests/tests.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/snakemake/workflow.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/tests.pysrc/snakemake/sourcecache.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
tests/tests.py
📚 Learning: 2024-11-07T00:32:44.137Z
Learnt from: mbhall88
Repo: snakemake/snakemake PR: 3188
File: tests/test_script/scripts/test.sh:4-4
Timestamp: 2024-11-07T00:32:44.137Z
Learning: In test scripts within the Snakemake project, concise code is preferred over verbose error handling when simplicity suffices.
Applied to files:
tests/tests.py
📚 Learning: 2024-11-12T20:22:54.184Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3204
File: tests/test_cores_cluster/qsub:1-6
Timestamp: 2024-11-12T20:22:54.184Z
Learning: In the Snakemake codebase, the `tests/test_cores_cluster/qsub` script is a dummy script for testing, and input validation and error handling are not required in such scripts.
Applied to files:
tests/tests.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-06T14:09:26.494Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-06T14:09:26.494Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.py
🧬 Code graph analysis (1)
src/snakemake/sourcecache.py (1)
src/snakemake/workflow.py (1)
source_path(1551-1573)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
266-266: Do not catch blind exception: Exception
(BLE001)
333-337: Avoid specifying long messages outside the exception class
(TRY003)
385-385: Do not catch blind exception: Exception
(BLE001)
386-390: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
386-390: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (1, macos-latest, py313)
🔇 Additional comments (11)
tests/tests.py (2)
2495-2497: LGTM! Clean test addition.The new test follows the established pattern and correctly tests the GitHub module import functionality introduced in this PR.
2710-2710: LGTM! Appropriate platform skip.The decorator correctly skips this test on Windows due to filename length limitations, with a clear explanatory comment.
src/snakemake/sourcecache.py (9)
12-12: LGTM! Proper type checking import.The
TYPE_CHECKINGimport is correctly used to conditionally import thegitmodule for type hints without adding a runtime dependency.Also applies to: 32-34
195-245: LGTM! Solid git repository management implementation.The
HostedGitRepoclass properly handles:
- Lock-free cloning using temporary directories with atomic moves
- Checkout caching to avoid redundant operations
- Authentication via URL tokens (acceptable per coding guidelines)
Based on learnings, embedding authentication tokens in Git repository URLs is accepted in the Snakemake codebase.
247-282: LGTM! Well-designed checkout management.The
HostedGitRepoCheckoutclass handles:
- Branch updates when repository existed previously
- Fallback logic with fetch when initial checkout fails
- Staleness tracking to prevent using outdated checkouts
- Correct mtime implementation based on local clone
The bare
Exceptioncatch at line 266 provides robustness for the fallback fetch-and-checkout logic, which is acceptable here.
332-337: LGTM! Good validation for common user error.The host validation correctly prevents users from including protocol prefixes, with a clear error message explaining the expected format.
346-366: LGTM! Clean property implementation.The lazy initialization of
hosted_repois appropriate for expensive git operations, and thecache_pathproperty correctly enforces that it's set before use (bySourceCache).
382-390: Exception handling follows project conventions.The exception handling correctly wraps git errors in
WorkflowErrorwith context about which repository and ref failed. The absence of exception chaining (raise ... from e) is intentional, as per Snakemake's backward compatibility guidelines.
428-439: LGTM! Clean GitHub-specific implementation.The
authproperty properly returns the token with "@" suffix when available, and the default host assignment simplifies usage for standard GitHub.
447-458: LGTM! Consistent GitLab implementation.The
authproperty follows the same pattern asGithubFile, providing consistent authentication handling across hosting providers.
515-515: LGTM! Proper cache path propagation.The runtime cache directory rename to "snakemake-runtime-cache" reduces naming collision risk, and setting
cache_pathforHostingProviderFileinstances ensures the property is properly initialized before use.Also applies to: 548-549
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/snakemake/sourcecache.py (1)
264-409: Critical: Missing abstract method implementation.
HostingProviderFileextendsSourceFilewhich declaresget_path_or_uri(self, secret_free: bool)as abstract (line 50), but this class does not implement it. WhileGithubFileandGitlabFilesubclasses provide implementations, any code working directly withHostingProviderFileinstances will fail withTypeError.🔎 Proposed fix
If
HostingProviderFileis intended to be abstract (only instantiated via subclasses), consider making it explicitly abstract by inheriting fromABCand decoratingget_path_or_uriwith@abstractmethod.Alternatively, if it should be instantiable, implement the method:
+ def get_path_or_uri(self, secret_free: bool = True) -> str: + checkout = self._checkout() + return str(checkout.source_path(self.path))However, based on the code structure, it appears this class is meant to be abstract, so the first approach is recommended.
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)
229-246: Optional: Move success returns to else blocks for clarity.Static analysis suggests moving the
return Truestatements toelseblocks to make the success path more explicit.🔎 Proposed refactor
def ref_exists(self, ref: str): import git try: self.repo.git.rev_parse("--verify", ref) - return True except git.GitCommandError: return False + else: + return True def file_exists(self, path: str, ref: str): try: # Get the tree for the specific revision tree = self.repo.commit(ref).tree # Attempt to access the file by path tree[path] - return True except KeyError: return False + else: + return True
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/snakemake/sourcecache.pytests/tests.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/tests.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/sourcecache.py
🧠 Learnings (3)
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-06T14:09:26.494Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-06T14:09:26.494Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.py
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
234-234: Consider moving this statement to an else block
(TRY300)
244-244: Consider moving this statement to an else block
(TRY300)
313-317: Avoid specifying long messages outside the exception class
(TRY003)
372-374: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
372-374: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (8, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (1, macos-latest, py313)
🔇 Additional comments (6)
src/snakemake/sourcecache.py (6)
195-224: LGTM: Robust lock-free cloning design.The lock-free cloning pattern using a temporary directory with atomic move is well-designed and prevents race conditions during concurrent access.
312-317: Good defensive validation for host parameter.The check prevents users from accidentally including protocol prefixes in the host parameter, which would break git URL construction.
414-422: LGTM: Clean auth property implementation.The
authproperty cleanly provides authentication credentials in the format expected byHostedGitRepo, with proper handling of missing tokens.
437-441: LGTM: Consistent auth implementation.The
authproperty implementation mirrorsGithubFile, providing consistency across hosting providers.
531-533: LGTM: Proper cache_path propagation.Setting
cache_pathonHostingProviderFileinstances before computing their cache path ensures the git repository cache is properly initialized.
604-605: LGTM: Clean delegation for git-hosted sources.Delegating to
source_file.open()forHostingProviderFileinstances keeps the caching logic clean and leverages the new git-backed file access.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
src/snakemake/sourcecache.py (1)
415-427: Missing runtime import forgitin exception handler.
git.GitCommandErroron line 424 requiresgitto be imported at runtime, but it's only imported underTYPE_CHECKING. This will raise aNameErrorwhen the exception occurs.🔎 Proposed fix
def open(self) -> io.BytesIO: + import git + if not self.hosted_repo.ref_exists( self.ref ) or not self.hosted_repo.file_exists(path=self.path, ref=self.ref): self.hosted_repo.fetch() try: return io.BytesIO( self.hosted_repo.repo.git.show(f"{self.ref}:{self.path}").encode() ) except git.GitCommandError as e: raise WorkflowError( f"Failed to get cached git source file {self.repo}:{self.path}: {e}. " )
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)
304-304: Consider annotating shared cache withClassVar.The
_hosted_reposdictionary is shared across all instances. UsingClassVarmakes this intent explicit and satisfies type checkers.🔎 Proposed fix
+from typing import ClassVar ... - _hosted_repos: Dict[str, HostedGitRepo] = {} + _hosted_repos: ClassVar[Dict[str, HostedGitRepo]] = {}
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/snakemake/common/__init__.pysrc/snakemake/deployment/conda.pysrc/snakemake/sourcecache.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/common/__init__.pysrc/snakemake/sourcecache.pysrc/snakemake/deployment/conda.py
🧠 Learnings (7)
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2985
File: tests/tests.py:2051-2051
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the Snakemake repository, avoid suggesting return type annotations for test functions.
Applied to files:
src/snakemake/common/__init__.py
📚 Learning: 2024-08-13T16:22:09.641Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3014
File: snakemake/workflow.py:1147-1147
Timestamp: 2024-08-13T16:22:09.641Z
Learning: Avoid suggesting type annotations for functions that are inside methods in the Snakemake codebase.
Applied to files:
src/snakemake/common/__init__.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3014
File: snakemake/workflow.py:0-0
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the Snakemake project, type annotations are not necessary for internal helper functions.
Applied to files:
src/snakemake/common/__init__.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-06T14:09:26.494Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-06T14:09:26.494Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-11T13:12:35.827Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/deployment/conda.py:85-88
Timestamp: 2024-10-11T13:12:35.827Z
Learning: In the `snakemake/deployment/conda.py` file, within the `Env` class, validation of `env_dir` occurs later in the code, so it's unnecessary to validate `env_dir` in the `__init__` method.
Applied to files:
src/snakemake/deployment/conda.py
🧬 Code graph analysis (3)
src/snakemake/common/__init__.py (1)
src/snakemake/sourcecache.py (1)
abspath(143-144)
src/snakemake/sourcecache.py (2)
src/snakemake/common/__init__.py (4)
LockFreeWritableFile(391-432)is_local_file(131-132)parse_uri(135-150)smart_join(153-170)src/snakemake/common/git.py (1)
split_git_path(7-14)
src/snakemake/deployment/conda.py (2)
src/snakemake/io/__init__.py (1)
file(459-466)src/snakemake/sourcecache.py (5)
apply_wildcards(69-69)apply_wildcards(110-111)apply_wildcards(131-132)apply_wildcards(179-186)apply_wildcards(351-360)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
37-37: Undefined name Mapping
(F821)
267-267: Consider moving this statement to an else block
(TRY300)
277-277: Consider moving this statement to an else block
(TRY300)
304-304: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
385-389: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
385-389: Avoid specifying long messages outside the exception class
(TRY003)
425-427: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
425-427: Avoid specifying long messages outside the exception class
(TRY003)
src/snakemake/deployment/conda.py
893-893: Unused method argument: rule
(ARG002)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: tests (10, macos-latest, py313)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (8, macos-latest, py313)
- GitHub Check: tests (5, macos-latest, py313)
🔇 Additional comments (8)
src/snakemake/common/__init__.py (1)
153-170: LGTM!The return type annotation
-> straccurately reflects the function's behavior across all code paths.src/snakemake/sourcecache.py (5)
68-79: LGTM!The abstract
apply_wildcardsmethod enforces implementation across allSourceFilesubclasses, and thejoinmethod correctly uses the base class's path for joining operations.
378-399: LGTM!The
hosted_repoproperty correctly caches instances and validates the host format. The exception handling follows the project's convention. Based on learnings, exception chaining is intentionally avoided in the Snakemake codebase.
473-496: LGTM!The
authproperty implementations correctly format tokens for HTTPS Git URL authentication. Based on learnings, including authentication tokens in Git repository URLs is an accepted pattern in this codebase.
554-554: Runtime cache directory renamed.The rename from
"runtime-cache"to"snakemake-runtime-cache"adds a namespace prefix. Since the runtime cache is ephemeral (created as aTemporaryDirectory), this change should have no impact on existing workflows.
586-598: LGTM!The
_cache_entrymethod correctly propagatescache_pathtoHostingProviderFileinstances, enabling git-backed source resolution.src/snakemake/deployment/conda.py (2)
893-898: LGTM with note on unused parameter.The refactored
CondaEnvFileSpeccorrectly delegates to theSourceFileabstraction. Theruleparameter in__init__is unused but retained for API consistency with sibling classes (CondaEnvDirSpec,CondaEnvNameSpec), which is acceptable.
916-918: No action needed. Thecontains_wildcardfunction explicitly converts its parameter to a string usingstr(path), andSourceFilehas a proper__str__implementation that returns the path. The code works correctly withSourceFileobjects.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/snakemake/deployment/conda.py (2)
926-931: Critical:CondaEnvDirSpec.apply_wildcardssignature doesn't match abstract method.This method has signature
(self, wildcards, rule)but the abstract method at line 857 declares(self, wildcards) -> Self. This signature mismatch causes aTypeErrorat runtime when called fromrules.pyline 1238.🐛 Proposed fix
- def apply_wildcards(self, wildcards, rule): + def apply_wildcards(self, wildcards) -> Self: filepath = self.path.apply_wildcards(wildcards) if is_local_file(filepath): # Normalize 'file:///my/path.yml' to '/my/path.yml' filepath = parse_uri(filepath).uri_path - return CondaEnvDirSpec(filepath, rule) + return CondaEnvDirSpec(filepath, rule=self.path.rule)
964-965: Critical:CondaEnvNameSpec.apply_wildcardssignature doesn't match abstract method.This method has signature
(self, wildcards, _)but the abstract method at line 857 declares(self, wildcards) -> Self. The unused_parameter must be removed, and the return type annotation added.🐛 Proposed fix
- def apply_wildcards(self, wildcards, _): + def apply_wildcards(self, wildcards) -> Self: return CondaEnvNameSpec(apply_wildcards(self.name, wildcards))
🤖 Fix all issues with AI agents
In @src/snakemake/deployment/conda.py:
- Line 9: The code imports Self from typing which only exists on Python 3.11+;
add a conditional fallback to import Self from typing_extensions when
typing.Self isn't available so the module works on Python 3.7–3.10; update the
import block in src/snakemake/deployment/conda.py (and do the same in
src/snakemake/sourcecache.py) to try importing Self from typing and on
ImportError import Self from typing_extensions.
In @src/snakemake/sourcecache.py:
- Around line 464-466: The hosted_repo property is inconsistent: the getter
returns from the instance cache self._hosted_repos[self.repo] while the setter
writes to a different attribute self._hosted_repo, so setting has no effect; fix
by either (A) removing the setter if unused, or (B) make the setter update the
same cache key (assign hosted_repo to self._hosted_repos[self.repo] or
initialize self._hosted_repos if missing) so that property getter and setter
operate on the same storage; update or remove the hosted_repo.setter
accordingly.
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)
353-353: Annotate mutable class attribute withClassVar.This mutable class attribute is intentionally shared across instances as a cache. Adding
ClassVarannotation makes this intent explicit and silences static analysis warnings.♻️ Suggested fix
+from typing import ClassVar ... - _hosted_repos: Dict[str, HostedGitRepo] = {} + _hosted_repos: ClassVar[Dict[str, HostedGitRepo]] = {}
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/snakemake/deployment/conda.pysrc/snakemake/io/__init__.pysrc/snakemake/rules.pysrc/snakemake/sourcecache.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/io/__init__.pysrc/snakemake/rules.pysrc/snakemake/deployment/conda.pysrc/snakemake/sourcecache.py
🧠 Learnings (5)
📚 Learning: 2024-10-11T12:49:08.705Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/rules.py:1105-1105
Timestamp: 2024-10-11T12:49:08.705Z
Learning: In `snakemake/rules.py`, within the `Rule` class, `self.basedir` is a custom class (not a `pathlib.Path` object) that has a `join()` method. Therefore, using `self.basedir.join(conda_env)` is appropriate in this context.
Applied to files:
src/snakemake/rules.py
📚 Learning: 2024-10-11T13:12:35.827Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/deployment/conda.py:85-88
Timestamp: 2024-10-11T13:12:35.827Z
Learning: In the `snakemake/deployment/conda.py` file, within the `Env` class, validation of `env_dir` occurs later in the code, so it's unnecessary to validate `env_dir` in the `__init__` method.
Applied to files:
src/snakemake/rules.pysrc/snakemake/deployment/conda.py
📚 Learning: 2024-10-06T14:09:26.494Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-06T14:09:26.494Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/deployment/conda.pysrc/snakemake/sourcecache.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
🧬 Code graph analysis (3)
src/snakemake/rules.py (1)
src/snakemake/deployment/conda.py (6)
CondaEnvFileSpec(882-914)CondaEnvDirSpec(917-957)apply_wildcards(857-857)apply_wildcards(886-888)apply_wildcards(926-931)apply_wildcards(964-965)
src/snakemake/deployment/conda.py (1)
src/snakemake/sourcecache.py (10)
replace_suffix(85-85)replace_suffix(123-128)replace_suffix(154-159)replace_suffix(212-223)replace_suffix(400-413)apply_wildcards(79-79)apply_wildcards(133-134)apply_wildcards(164-165)apply_wildcards(228-235)apply_wildcards(418-427)
src/snakemake/sourcecache.py (3)
src/snakemake/common/__init__.py (4)
LockFreeWritableFile(391-432)is_local_file(131-132)parse_uri(135-150)smart_join(153-170)src/snakemake/common/git.py (1)
split_git_path(7-14)src/snakemake/io/__init__.py (4)
apply_wildcards(799-839)apply_wildcards(1175-1184)open(397-416)get(1913-1918)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
316-316: Consider moving this statement to an else block
(TRY300)
326-326: Consider moving this statement to an else block
(TRY300)
353-353: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
452-456: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
452-456: Avoid specifying long messages outside the exception class
(TRY003)
492-494: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
492-494: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (45)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py311)
🔇 Additional comments (15)
src/snakemake/io/__init__.py (2)
468-471: LGTM!Clean refactoring that delegates to the new standalone
check()function while maintaining the same interface.
953-1001: LGTM!The new standalone
check()function properly centralizes all file path validation logic. The implementation correctly handles the optionalruleparameter for context and appropriately skips the double-slash check for storage files.src/snakemake/sourcecache.py (9)
12-38: LGTM!The imports are correctly organized, with the
TYPE_CHECKINGguard properly protecting thegitimport for type hints only.
60-116: LGTM!The abstract method additions properly define the interface for all
SourceFilesubclasses. UsingSelfreturn type correctly enables proper type inference for subclass method chaining.
537-554: LGTM!The
GithubFileclass correctly defaults the host and provides theauthproperty for git URL authentication.
556-578: LGTM!The
GitlabFileclass correctly mirrors theGithubFilepattern with appropriate defaults and authentication handling.
656-658: LGTM!Properly propagates
cache_pathtoHostingProviderFileinstances before computing their cache path, enabling the git-hosted file functionality.
729-730: LGTM!Cleanly integrates git-hosted file support by delegating to
HostingProviderFile.open()method.
196-272: LGTM!The
LocalGitFileclass correctly implements all new abstract methods following the established patterns.
474-480: LGTM!The
mtime()implementation correctly accesses the underlyinggit.Repoviaself.hosted_repo.repoto get the last commit date for the file path.
336-342: Undefined attributeself.repo- should beself.repo_name.Line 287 stores the repo as
self.repo_name, but line 338 referencesself.repowhich doesn't exist. This will cause anAttributeErrorat runtime.🐛 Proposed fix
with self._tmpdir() as tmpdir: logger.info( - f"Fetching latest changes of {self.host}/{self.repo} to {self.repo_clone}" + f"Fetching latest changes of {self.host}/{self.repo_name} to {self.repo_clone}" ) shutil.copytree(self.repo_clone, tmpdir) Repo(tmpdir).remotes.origin.fetch() shutil.move(tmpdir, self.repo_clone)Likely an incorrect or invalid review comment.
src/snakemake/deployment/conda.py (3)
857-857: LGTM: Abstract method return type annotation is correct.The addition of
-> Selfto the abstract method signature is appropriate and helps ensure type safety across implementations.However, note that concrete implementations at lines 926 (
CondaEnvDirSpec) and 964 (CondaEnvNameSpec) have signature mismatches with extra parameters—this is flagged separately as a critical issue.
883-888: LGTM: Clean refactor to simplified API.The updated
CondaEnvFileSpecimplementation:
- Simplifies the constructor to accept
source_filedirectly- Correctly implements the abstract method signature
- Uses
self.__class__(...)for proper subclass support- Returns
Selffor fluent chainingThis aligns well with the
SourceFileabstraction pattern.
138-145: LGTM: Cleaner delegation to SourceFile API.The refactor to use
self.file.replace_suffix([".yaml", ".yml"], suffix)is cleaner than manual prefix handling. TheNonecheck appropriately handles cases where the conda env file doesn't have the expected extension.src/snakemake/rules.py (1)
1232-1238: Critical: Signature mismatch will cause runtime failures for DIR and NAME env specs.The changes at lines 1232 and 1238 update the API calls to match the new
CondaEnvFileSpecsignature. However, at line 1238,conda_env.apply_wildcards(wildcards)is called polymorphically on three different spec types:
CondaEnvFileSpec(line 1232) - hasapply_wildcards(self, wildcards) -> Self✓CondaEnvNameSpec(line 1234) - hasapply_wildcards(self, wildcards, _)in conda.py line 964 ✗CondaEnvDirSpec(line 1236) - hasapply_wildcards(self, wildcards, rule)in conda.py line 926 ✗The call at line 1238 will raise
TypeError: missing required positional argumentwhenconda_envis aCondaEnvDirSpecorCondaEnvNameSpec.The signatures in conda.py need to be fixed:
CondaEnvDirSpec.apply_wildcardsshould extractrulefromself.path.ruleinstead of taking it as a parameterCondaEnvNameSpec.apply_wildcardsshould remove the unused_parameterBoth must match the abstract method signature declared at conda.py line 857.
⛔ Skipped due to learnings
Learnt from: johanneskoester Repo: snakemake/snakemake PR: 3132 File: snakemake/deployment/conda.py:85-88 Timestamp: 2024-10-11T13:12:35.827Z Learning: In the `snakemake/deployment/conda.py` file, within the `Env` class, validation of `env_dir` occurs later in the code, so it's unnecessary to validate `env_dir` in the `__init__` method.Learnt from: johanneskoester Repo: snakemake/snakemake PR: 3132 File: snakemake/rules.py:1105-1105 Timestamp: 2024-10-11T12:49:08.705Z Learning: In `snakemake/rules.py`, within the `Rule` class, `self.basedir` is a custom class (not a `pathlib.Path` object) that has a `join()` method. Therefore, using `self.basedir.join(conda_env)` is appropriate in this context.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/snakemake/deployment/conda.py (3)
921-926: Add return type annotation for consistency.The
apply_wildcardsmethod should include a-> Selfreturn type annotation to match the abstract method signature on line 857 and be consistent with other implementations likeCondaEnvFileSpec.♻️ Proposed fix
- def apply_wildcards(self, wildcards): + def apply_wildcards(self, wildcards) -> Self: filepath = apply_wildcards(self.path, wildcards)
959-960: Add return type annotation for consistency.The
apply_wildcardsmethod should include a-> Selfreturn type annotation to match the abstract method signature on line 857 and be consistent with other implementations.♻️ Proposed fix
- def apply_wildcards(self, wildcards): + def apply_wildcards(self, wildcards) -> Self: return CondaEnvNameSpec(apply_wildcards(self.name, wildcards))
136-151: Improve error message readability for consistency with other log statements.The refactored
_get_aux_filemethod callsself.file.replace_suffix([".yaml", ".yml"], suffix)on line 138, which is properly implemented across allSourceFilesubclasses.However, the error message on line 142 directly interpolates
self.file, which produces a less user-friendly representation. Useself.file.get_path_or_uri(secret_free=True)instead to match the pattern already established elsewhere in this file (lines 324, 615, 638):if aux_file is None: logger.warning( - f"Conda environment file {self.file} does not end " + f"Conda environment file {self.file.get_path_or_uri(secret_free=True)} does not end " f"on .yaml or .yml. {omit_msg}" )
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/deployment/conda.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/deployment/conda.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
📚 Learning: 2024-10-11T13:12:35.827Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/deployment/conda.py:85-88
Timestamp: 2024-10-11T13:12:35.827Z
Learning: In the `snakemake/deployment/conda.py` file, within the `Env` class, validation of `env_dir` occurs later in the code, so it's unnecessary to validate `env_dir` in the `__init__` method.
Applied to files:
src/snakemake/deployment/conda.py
📚 Learning: 2024-10-06T14:09:26.494Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-06T14:09:26.494Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/deployment/conda.py
📚 Learning: 2025-09-05T16:17:21.298Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3726
File: src/snakemake/common/__init__.py:413-414
Timestamp: 2025-09-05T16:17:21.298Z
Learning: Snakemake requires Python 3.11 at runtime nowadays. Only imports need to work with Python 3.7 for compatibility reasons.
Applied to files:
src/snakemake/deployment/conda.py
📚 Learning: 2024-10-11T12:49:08.705Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/rules.py:1105-1105
Timestamp: 2024-10-11T12:49:08.705Z
Learning: In `snakemake/rules.py`, within the `Rule` class, `self.basedir` is a custom class (not a `pathlib.Path` object) that has a `join()` method. Therefore, using `self.basedir.join(conda_env)` is appropriate in this context.
Applied to files:
src/snakemake/deployment/conda.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (45)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: apidocs
🔇 Additional comments (2)
src/snakemake/deployment/conda.py (2)
857-857: LGTM! Good addition of return type annotation.Adding
-> Selfto the abstract method signature clarifies the fluent interface pattern and ensures consistency across implementations.
883-888: LGTM! Clean simplification of the implementation.The refactored
CondaEnvFileSpecproperly:
- Removes the unused
ruleparameter- Delegates wildcard expansion to the underlying
source_fileobject- Returns the correct type with
-> Selfannotation- Maintains consistency with the abstract method signature
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/snakemake/workflow.py (1)
1616-1657: Align linemap/included keys with the compile filename to avoid failures for hosted snakefiles with tokensThe new
include()behavior has two important aspects:
Good changes:
- Initializing
snakefile.cache_pathwhensnakefileis aHostingProviderFileensures that git-hosted sources have a well-defined cache root in the workflow’sSourceCache, which is required for the new HostedGitRepo-based cloning.- Treating
HostingProviderFilelikeLocalSourceFileforsys.pathinsertion is consistent with the intention that hosted snakefiles behave like local modules once cloned.Problematic interaction with
linemapsand__code__.co_filename:
_includedandlinemapsare now keyed usingsnakefile.get_path_or_uri(secret_free=True).The compiled code still uses
snakefile.get_path_or_uri(secret_free=False)as the filename:exec( compile(code, snakefile.get_path_or_uri(secret_free=False), "exec"), self.globals, )For local snakefiles these two strings are identical, but for hosted ones (e.g. GitHub/GitLab)
secret_free=Falsemay embed credentials (e.g.https://:[email protected]/...), whilesecret_free=Trueomits them.
Workflow.get_rule_source()relies on exact key equality:sourcefile = func.__code__.co_filename linemap = self.linemaps[sourcefile]With the current change, for token-bearing hosted snakefiles
sourcefilewill be the non-secret-free string, whileself.linemapsis keyed by the secret-free variant, leading to aKeyErrorwhen attempting to populaterule.run_func_src.Similarly,
source_path()checksif calling_file in self._includedusingframe.f_back.f_code.co_filename, which will now miss for such hosted snakefiles and fall back to the less precise heuristic path calculation.To keep secrets out of user-visible paths and maintain correct behavior, the filename passed to
compile()should match the key used in_includedandlinemaps. A minimal fix is to compute a single safe filename once and use it consistently:Suggested fix to keep filenames secret-free and consistent
snakefile = infer_source_file(snakefile, basedir) if not self.modifier.is_module and snakefile in self.included: logger.info(f"Multiple includes of {snakefile} ignored") return - self._included[snakefile.get_path_or_uri(secret_free=True)] = snakefile + safe_snakefile_name = snakefile.get_path_or_uri(secret_free=True) + self._included[safe_snakefile_name] = snakefile self.included_stack.append(snakefile) default_target = self.default_target linemap: Dict[int, int] = dict() - self.linemaps[snakefile.get_path_or_uri(secret_free=True)] = linemap + self.linemaps[safe_snakefile_name] = linemap @@ - exec( - compile(code, snakefile.get_path_or_uri(secret_free=False), "exec"), - self.globals, - ) + exec( + compile(code, safe_snakefile_name, "exec"), + self.globals, + )This preserves the goal of not leaking tokens (because
safe_snakefile_nameis secret-free) and restores the invariant thatfunc.__code__.co_filenamematches the keys used in_includedandlinemaps, soget_rule_source()andsource_path()continue to work for both local and hosted snakefiles, including those that require tokens.
🤖 Fix all issues with AI agents
In @src/snakemake/exceptions.py:
- Line 17: Remove the debugging call breakpoint() from
src/snakemake/exceptions.py (it’s the stray breakpoint() introduced in the
exception formatting/path) so execution is not paused in production; simply
delete the breakpoint() invocation and run the test suite to confirm no runtime
side-effects remain.
In @src/snakemake/sourcecache.py:
- Line 359: The log f-string in the HostedGitRepo fetch/update code uses a
nonexistent attribute self.repo; replace it with the correct attribute
self.repo_name (e.g., update the f-string that currently references self.repo to
use self.repo_name) so the log message reads the correct repository name and
avoids AttributeError.
- Around line 500-520: The except clause in the open method references
git.GitCommandError but git is only imported under TYPE_CHECKING, causing a
NameError at runtime; fix by importing the exception at runtime (e.g., add "from
git import GitCommandError" or "import git" at module scope) and update the
except to catch GitCommandError (or keep git.GitCommandError if you imported
git), ensuring the open method's except block correctly references a real
symbol.
- Around line 295-368: The lock-free clone and fetch can raise shutil.Error if
another process created/moved the destination concurrently; wrap the shutil.move
call in both __init__ (clone path where Repo.clone_from -> shutil.move(tmpdir,
self.repo_clone)) and fetch (shutil.move(tmpdir, self.repo_clone)) in try/except
handling shutil.Error: if the destination already exists, swallow the error and
refresh self._repo (e.g., set self._repo = Repo(self.repo_clone) or reopen after
a short retry), otherwise re-raise; ensure you import or reference shutil.Error
and keep the tmpdir cleanup path intact (methods to touch: __init__, fetch,
using _tmpdir, repo_clone, and _repo).
- Around line 471-492: The hosted_repo property accesses and mutates the shared
class-level dictionary _hosted_repos without synchronization and the hosted_repo
setter mistakenly assigns to self._hosted_repo instead of updating the shared
dict; fix by introducing a class-level threading.Lock (e.g., _hosted_repos_lock)
and wrap the check-create-assign sequence in the hosted_repo getter in a with
_hosted_repos_lock: block (or use _hosted_repos.setdefault(...) under the lock)
so creating/setting HostedGitRepo(self.repo, ...) is atomic, and change the
hosted_repo.setter to update self._hosted_repos[self.repo] = hosted_repo (under
the same lock) instead of assigning self._hosted_repo.
🧹 Nitpick comments (4)
tests/tests.py (1)
2495-2504: Re‑check whether the GitHub module import test should still be globally skippedThis new test exercises Python imports from a GitHub-hosted module but is unconditionally skipped with a reason stating that maintaining a full local GitHub repo is “unsupported for now”. Given this PR introduces git-hosted clones, please confirm whether:
- the scenario in
test_python_import_from_github_moduleis now expected to work, in which case the skip can likely be dropped (or turned into a conditional xfail), or- the test still targets functionality beyond the current implementation, in which case updating the skip reason to be more specific (e.g. referencing the remaining limitation) would help future maintainers.
src/snakemake/script/__init__.py (1)
960-965: Prefer explicitif/elseoverifelsewith...in R preamblesThe new R and RMarkdown preambles correctly distinguish between local and
https?://scriptdirs, but using:is_url <- grepl("^https?://", snakemake@scriptdir) file <- ifelse(is_url, file.path(snakemake@scriptdir, ...), ...) if (!is_url) setwd(snakemake@scriptdir) source(file)is awkward because
ifelse()eagerly evaluates both branches and you’re passing...in two different positions. That’s harder to reason about and could behave unexpectedly if multiple path components are supplied.A clearer and more robust pattern would be an explicit branch, e.g.:
Suggested R/RMarkdown preamble adjustment
- is_url <- grepl("^https?://", snakemake@scriptdir) - file <- ifelse(is_url, file.path(snakemake@scriptdir, ...), ...) - if (!is_url) setwd(snakemake@scriptdir) - source(file) + is_url <- grepl("^https?://", snakemake@scriptdir) + if (is_url) { + file <- file.path(snakemake@scriptdir, ...) + } else { + file <- ... + setwd(snakemake@scriptdir) + } + source(file)This keeps the existing behavior for local scripts, cleanly supports URL-based scriptdirs, and ensures
...is only expanded in the path construction relevant to the actual case.Also applies to: 1073-1078
src/snakemake/sourcecache.py (2)
374-374: Mutable class attribute withoutClassVarannotation.The static analysis correctly identifies that
_hosted_reposis a mutable class attribute that should be annotated withtyping.ClassVarto clarify it's shared across all instances.♻️ Suggested fix
+from typing import TYPE_CHECKING, Any, ClassVar, Dict, List, Optional ... - _hosted_repos: Dict[str, HostedGitRepo] = {} + _hosted_repos: ClassVar[Dict[str, HostedGitRepo]] = {}
460-469: Assert in property getter may be problematic for debugging.The
cache_pathproperty uses an assertion that will fail with a genericAssertionErrorif_cache_pathis not set. This could be confusing when debugging. Consider raising a more descriptive error:♻️ Optional improvement
@property def cache_path(self) -> Path: - assert ( - self._cache_path - ), "bug: cache_path not set, should be done by SourceCache" - + if not self._cache_path: + raise RuntimeError( + "bug: cache_path not set, should be done by SourceCache" + ) return self._cache_path
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
src/snakemake/deployment/conda.pysrc/snakemake/deployment/containerize.pysrc/snakemake/exceptions.pysrc/snakemake/parser.pysrc/snakemake/rules.pysrc/snakemake/script/__init__.pysrc/snakemake/sourcecache.pysrc/snakemake/workflow.pytests/tests.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/deployment/containerize.pysrc/snakemake/workflow.pysrc/snakemake/parser.pysrc/snakemake/exceptions.pysrc/snakemake/deployment/conda.pysrc/snakemake/script/__init__.pysrc/snakemake/rules.pytests/tests.pysrc/snakemake/sourcecache.py
🧠 Learnings (12)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
📚 Learning: 2024-10-11T13:12:35.827Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/deployment/conda.py:85-88
Timestamp: 2024-10-11T13:12:35.827Z
Learning: In the `snakemake/deployment/conda.py` file, within the `Env` class, validation of `env_dir` occurs later in the code, so it's unnecessary to validate `env_dir` in the `__init__` method.
Applied to files:
src/snakemake/deployment/containerize.pysrc/snakemake/deployment/conda.pysrc/snakemake/rules.py
📚 Learning: 2024-10-06T14:09:26.494Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-06T14:09:26.494Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/deployment/containerize.pysrc/snakemake/deployment/conda.pysrc/snakemake/sourcecache.py
📚 Learning: 2024-10-11T12:49:08.705Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/rules.py:1105-1105
Timestamp: 2024-10-11T12:49:08.705Z
Learning: In `snakemake/rules.py`, within the `Rule` class, `self.basedir` is a custom class (not a `pathlib.Path` object) that has a `join()` method. Therefore, using `self.basedir.join(conda_env)` is appropriate in this context.
Applied to files:
src/snakemake/deployment/containerize.pysrc/snakemake/deployment/conda.pysrc/snakemake/rules.py
📚 Learning: 2025-08-28T13:17:37.693Z
Learnt from: Hocnonsense
Repo: snakemake/snakemake PR: 3714
File: src/snakemake/workflow.py:637-665
Timestamp: 2025-08-28T13:17:37.693Z
Learning: In Snakemake, rule.module_globals["__name__"] always exists and defaults to "snakemake.workflow" when not modified, so direct dictionary access is safe without needing .get().
Applied to files:
src/snakemake/workflow.py
📚 Learning: 2025-09-17T04:03:59.943Z
Learnt from: Hocnonsense
Repo: snakemake/snakemake PR: 3714
File: src/snakemake/modules.py:236-241
Timestamp: 2025-09-17T04:03:59.943Z
Learning: In Snakemake's WorkflowModifier, rule filtering via rule_whitelist and rule_exclude_list happens upstream during rule creation in the rule() decorator. The avail_rulename() method returns None for filtered-out rules, causing the entire rule creation to be skipped. This means child_modifier.rule_proxies only contains rules that passed the filtering, so inherit_rule_proxies doesn't need additional filtering logic.
Applied to files:
src/snakemake/workflow.py
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
src/snakemake/parser.pytests/tests.py
📚 Learning: 2025-01-14T14:04:30.554Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3107
File: snakemake/logging.py:510-515
Timestamp: 2025-01-14T14:04:30.554Z
Learning: In Snakemake's logging handlers, KeyboardInterrupt and SystemExit exceptions should be ignored (pass) rather than re-raised, as these are control flow signals that should not disrupt the logging process. The exception variable in the final except block should be kept for potential debugging purposes.
Applied to files:
src/snakemake/exceptions.py
📚 Learning: 2025-09-05T16:17:21.298Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3726
File: src/snakemake/common/__init__.py:413-414
Timestamp: 2025-09-05T16:17:21.298Z
Learning: Snakemake requires Python 3.11 at runtime nowadays. Only imports need to work with Python 3.7 for compatibility reasons.
Applied to files:
src/snakemake/deployment/conda.py
📚 Learning: 2024-11-07T00:32:44.137Z
Learnt from: mbhall88
Repo: snakemake/snakemake PR: 3188
File: tests/test_script/scripts/test.sh:4-4
Timestamp: 2024-11-07T00:32:44.137Z
Learning: In test scripts within the Snakemake project, concise code is preferred over verbose error handling when simplicity suffices.
Applied to files:
tests/tests.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
🧬 Code graph analysis (8)
src/snakemake/deployment/containerize.py (2)
src/snakemake/deployment/conda.py (2)
CondaEnvSpecType(986-1009)from_spec(992-1009)src/snakemake/sourcecache.py (1)
infer_source_file(607-637)
src/snakemake/workflow.py (1)
src/snakemake/sourcecache.py (11)
HostingProviderFile(370-560)cache_path(460-465)cache_path(468-469)infer_source_file(607-637)get_path_or_uri(66-66)get_path_or_uri(147-148)get_path_or_uri(183-184)get_path_or_uri(258-259)get_path_or_uri(575-579)get_path_or_uri(594-604)LocalSourceFile(161-209)
src/snakemake/parser.py (1)
src/snakemake/sourcecache.py (6)
get_path_or_uri(66-66)get_path_or_uri(147-148)get_path_or_uri(183-184)get_path_or_uri(258-259)get_path_or_uri(575-579)get_path_or_uri(594-604)
src/snakemake/deployment/conda.py (2)
src/snakemake/sourcecache.py (21)
replace_suffix(89-91)replace_suffix(132-139)replace_suffix(168-175)replace_suffix(231-244)replace_suffix(424-439)get_path_or_uri(66-66)get_path_or_uri(147-148)get_path_or_uri(183-184)get_path_or_uri(258-259)get_path_or_uri(575-579)get_path_or_uri(594-604)format(80-80)format(144-145)format(180-181)format(249-256)endswith(86-86)endswith(129-130)endswith(165-166)endswith(228-229)endswith(421-422)LocalSourceFile(161-209)src/snakemake/common/__init__.py (2)
is_local_file(131-132)parse_uri(135-150)
src/snakemake/script/__init__.py (1)
src/snakemake/sourcecache.py (11)
get_path_or_uri(66-66)get_path_or_uri(147-148)get_path_or_uri(183-184)get_path_or_uri(258-259)get_path_or_uri(575-579)get_path_or_uri(594-604)format(80-80)format(144-145)format(180-181)format(249-256)infer_source_file(607-637)
src/snakemake/rules.py (1)
src/snakemake/deployment/conda.py (8)
CondaEnvFileSpec(882-914)CondaEnvSpecType(986-1009)CondaEnvNameSpec(955-983)CondaEnvDirSpec(917-952)apply_wildcards(857-857)apply_wildcards(886-888)apply_wildcards(921-926)apply_wildcards(959-960)
tests/tests.py (1)
tests/common.py (2)
run(160-546)dpath(38-41)
src/snakemake/sourcecache.py (5)
src/snakemake/utils.py (1)
format(523-567)src/snakemake/common/__init__.py (3)
is_local_file(131-132)parse_uri(135-150)smart_join(153-170)src/snakemake/exceptions.py (1)
SourceFileError(201-203)src/snakemake/common/git.py (1)
split_git_path(7-14)src/snakemake/io/__init__.py (4)
apply_wildcards(799-839)apply_wildcards(1175-1184)open(397-416)get(1913-1918)
🪛 Ruff (0.14.10)
src/snakemake/deployment/containerize.py
43-47: Avoid specifying long messages outside the exception class
(TRY003)
src/snakemake/exceptions.py
17-17: Trace found: breakpoint used
(T100)
src/snakemake/rules.py
1240-1240: Avoid specifying long messages outside the exception class
(TRY003)
src/snakemake/sourcecache.py
337-337: Consider moving this statement to an else block
(TRY300)
347-347: Consider moving this statement to an else block
(TRY300)
374-374: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
478-482: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
478-482: Avoid specifying long messages outside the exception class
(TRY003)
518-520: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
518-520: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: apidocs
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (10, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (8, macos-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py313)
🔇 Additional comments (23)
src/snakemake/parser.py (1)
1356-1364: Usingsecret_free=TrueforSnakefile.pathis appropriateThis keeps error messages and rule metadata free of embedded tokens while leaving actual file access via
workflow.sourcecache.open(path)unchanged. No functional issues spotted.tests/tests.py (1)
1709-1715: Windows skip ontest_module_with_scriptlooks reasonableMarking this test with
@skip_on_windowsto avoid long-filename issues is consistent with other tests and avoids known Windows-specific path problems.src/snakemake/deployment/containerize.py (1)
32-51: Conda env spec classification refactor is soundComputing
spec_type = CondaEnvSpecType.from_spec(rule.conda_env)beforeinfer_source_file(...)keeps the FILE/NAME/DIR detection tied to the original spec (string/SourceFile/Path) and plays nicely with the newerSourceFilesubclasses. The subsequent FILE-only check and error path remain correct.src/snakemake/script/__init__.py (1)
660-664: Secret‑free script paths andget_sourcewildcard handling look correct
ScriptBase.local_pathnow usingself.path.get_path_or_uri(secret_free=True)is safe underassert self.is_localand avoids ever embedding credentials in path strings.- All language preambles (Python/R/RMarkdown/Julia/Rust/Bash) now derive
scriptdirviaget_path_or_uri(secret_free=True), which is exactly what you want for hosted/git-backed sources: actual I/O still goes throughSourceCacheand theSourceFileobjects, while human-visible paths are sanitized.get_source()’s new logic:
- Calls
path.format(wildcards=..., params=...)whenpathis aSourceFile, lettingLocalSourceFile,HostingProviderFile, etc. control how wildcards are applied.- Otherwise formats the string and then wraps it once via
infer_source_file.- Finally resolves
basedirand callsinfer_source_file(path, basedir)a single time.
This is a clean way to support the new hosted/git-backed types without double-wrapping or losing type information.No functional issues spotted in these areas.
Also applies to: 988-989, 1100-1102, 1186-1188, 1256-1257, 1554-1557, 1648-1659
src/snakemake/rules.py (3)
1219-1219: LGTM!The type assertion at line 1219 ensures
conda_envis of valid type before proceeding with spec type determination. This is good defensive programming that will catch issues early.
1233-1238: LGTM!The simplified constructor calls for
CondaEnvFileSpecandCondaEnvDirSpecalign well with the updated signatures inconda.py. The type assertion at line 1235 ensuresconda_envis a string before constructing aCondaEnvNameSpec.
1239-1242: LGTM!The explicit error handling for unsupported spec types (lines 1239-1240) is a good defensive measure. The updated
apply_wildcardscall (line 1242) correctly uses the simplified single-argument signature.src/snakemake/deployment/conda.py (6)
136-151: LGTM!The
_get_aux_filemethod now correctly uses thereplace_suffixmethod from theSourceFileinterface. The logic properly handles the case where the suffix replacement returnsNone(when the file doesn't end with.yamlor.yml), and the existence check viaself.workflow.sourcecache.exists(aux_file)is appropriate.
437-437: LGTM!Using
secret_free=Truewhen obtaining the path for local source files is the correct approach here, as this path is used for creating conda environments and should not contain any secrets.
883-888: LGTM!The simplified
CondaEnvFileSpecconstructor now directly accepts asource_fileparameter, and theapply_wildcardsmethod correctly uses theformatmethod of the source file with keyword argument unpacking.
918-926: LGTM!The
CondaEnvDirSpecconstructor andapply_wildcardsmethod are correctly updated. The path is stored as a string, and wildcard application properly handles local file URI normalization.
959-960: LGTM!The
apply_wildcardsmethod signature is correctly simplified to take onlywildcardsas an argument, consistent with the other spec types.
991-1009: LGTM!The
from_specmethod now properly distinguishes between FILE, DIR, and NAME spec types. The logic correctly:
- Checks for
.yaml/.ymlextensions to identify FILE specs- Uses
LocalSourceFilepath check to identify DIR specs- Falls back to NAME for other cases
src/snakemake/sourcecache.py (10)
38-58: LGTM!The helper functions
_format_or_noneand_replace_suffixare clean utilities that handle None values appropriately and support the new suffix replacement functionality across SourceFile subclasses.
61-122: LGTM!The
SourceFileABC is properly extended with new abstract methods (format,endswith,replace_suffix) that enable consistent suffix-based operations across all source file types. Thesimplify_pathmethod now has a sensible default implementation.
125-159: LGTM!Both
GenericSourceFileandLocalSourceFilecorrectly implement the new abstract methods (endswith,replace_suffix,format) using the shared_replace_suffixhelper. The implementations are consistent and properly handle None returns when suffix doesn't match.Also applies to: 161-210
212-256: LGTM!
LocalGitFilecorrectly implements the new abstract methods. Theformatmethod properly handles wildcards for all fields including the optionaltag,ref, andcommitusing_format_or_none.
421-443: LGTM!The
endswith,replace_suffix,check, andapply_wildcardsmethods forHostingProviderFileare properly implemented and maintain consistency with otherSourceFilesubclasses.
563-573: LGTM!The
GithubFileclass correctly sets a default host and implements theauthproperty to provide token-based authentication for Git operations.
582-592: LGTM!The
GitlabFileclass mirrors theGithubFilepattern with its ownauthproperty implementation.
650-650: LGTM!The runtime cache directory is appropriately renamed to
snakemake-runtime-cache.
682-685: LGTM!Setting
cache_pathonHostingProviderFileinstances before computing cache paths ensures the git-backed infrastructure is properly configured.
755-756: LGTM!The
_openmethod correctly routesHostingProviderFileinstances to use their ownopen()method for git-backed file access.
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In @src/snakemake/sourcecache.py:
- Around line 472-488: The shared registry _hosted_repos in the hosted_repo
property is currently keyed only by self.repo, which can cause wrong
HostedGitRepo instances when different HostingProviderFile objects use the same
repo but different cache_path; change the registry key to include both repo and
cache_path (e.g., tuple of self.repo and str(self.cache_path)) or alternatively
validate that an existing entry for self.repo has the same cache_path before
returning it; update all accesses to _hosted_repos in the hosted_repo method
(lookup, assignment, and KeyError handling) to use the combined key or to
perform the cache_path consistency check and raise/replace accordingly so each
unique repo+cache_path gets its own HostedGitRepo.
- Line 374: The _hosted_repos registry is a class-level dictionary and should be
annotated as a typing.ClassVar to indicate it is shared across instances; update
the declaration of _hosted_repos to use ClassVar[Dict[str, HostedGitRepo]] and
add ClassVar to the imports from typing (or typing_extensions if used), ensuring
the symbol name _hosted_repos and the HostedGitRepo type are preserved.
- Around line 508-520: The open method in src/snakemake/sourcecache.py
references git.GitCommandError at runtime but git is only imported inside the
TYPE_CHECKING block, causing a NameError; fix by importing git for runtime
use—either add a top-level "import git" to the module imports or (preferred to
avoid startup cost) add a local "import git" at the start of the open method
(which references hosted_repo, ref, path, and git.GitCommandError) so the
exception class and git.show call are available at runtime.
- Around line 351-363: The current fetch method using shutil.copytree and
shutil.move around Repo(tmpdir).remotes.origin.fetch() is broken (tmpdir is
precreated, destination exists) and inefficient; replace the copy-fetch-move
approach with a direct git fetch against the existing working clone: open the
repository at self.repo_clone via Repo(self.repo_clone) and call
remotes.origin.fetch() with proper try/except to log failures via logger, and
add a fallback path that reclones into a temporary directory and atomically
swaps in only if self.repo_clone is missing or corrupted (use self._tmpdir for
the clone target and shutil.replace or os.replace to swap), removing
shutil.copytree/shutil.move usage to prevent failures and corruption.
- Around line 317-327: The clone block using self._tmpdir(), Repo.clone_from and
shutil.move is unsafe: moving a TemporaryDirectory-managed path prevents
cleanup, and shutil.move can fail if self.repo_clone is created concurrently;
replace the pattern by creating a distinct temporary directory path (not the
TemporaryDirectory's root) for the bare clone, clone into that path, then use
os.rename (or atomic os.replace/rename) to move the temp path into
self.repo_clone while catching FileExistsError to handle the race (if
destination exists, delete/cleanup the temp path and proceed to open existing
repo), and ensure the TemporaryDirectory/context is not moved so it can always
be cleaned up; update the logic around self._existed_before and setting
self._repo = Repo(self.repo_clone) accordingly.
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)
650-650: Runtime cache directory renamed.The runtime cache directory name changed from
"runtime-cache"to"snakemake-runtime-cache". This is a minor breaking change if users or scripts reference the old directory name directly.Consider documenting this change in migration notes or deprecation warnings if backward compatibility is important.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/sourcecache.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/sourcecache.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
🧬 Code graph analysis (1)
src/snakemake/sourcecache.py (7)
src/snakemake/parser.py (1)
parse(1391-1414)src/snakemake/utils.py (1)
format(523-567)src/snakemake/common/__init__.py (4)
LockFreeWritableFile(391-432)is_local_file(131-132)parse_uri(135-150)smart_join(153-170)src/snakemake/exceptions.py (1)
SourceFileError(200-202)src/snakemake/common/git.py (1)
split_git_path(7-14)src/snakemake/workflow.py (2)
check(645-653)is_local(714-718)src/snakemake/io/__init__.py (8)
check(468-471)check(953-1000)exists(473-477)parents(479-484)mtime(161-162)mtime(505-521)open(397-416)get(1913-1918)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
337-337: Consider moving this statement to an else block
(TRY300)
347-347: Consider moving this statement to an else block
(TRY300)
374-374: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
478-482: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
478-482: Avoid specifying long messages outside the exception class
(TRY003)
518-520: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
518-520: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: apidocs
🔇 Additional comments (2)
src/snakemake/sourcecache.py (2)
444-453: Clean implementation of the format method.The format method correctly creates a new instance with formatted parameters and properly handles optional fields using
_format_or_none. The pattern is consistently applied across all SourceFile subclasses.
80-80: No action needed. The code correctly uses"typing.Self"as a string forward reference, which is compatible with any Python version and does not require importingtyping.Self. Type hints in string form are evaluated only by static type checkers, not at runtime, so there is no compatibility concern here. Additionally, the project requires Python 3.11 at runtime per the coding guidelines.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @src/snakemake/sourcecache.py:
- Around line 296-328: HostedGitRepo's lock-free cloning can race when multiple
threads clone the same repo: after Repo.clone_from into tmpdir, wrap
shutil.move(tmpdir, self.repo_clone) in a try/except to handle race conditions
(e.g., FileExistsError/OSError/shutil.Error); if the move fails because
self.repo_clone already exists, discard tmpdir (cleanup) and open the existing
repository with Repo(self.repo_clone) instead of failing, and ensure self._repo
is set to that Repo instance (and update _existed_before if appropriate);
perform these adjustments around the move and Repo(self.repo_clone) calls inside
the __init__ (use the same _tmpdir context and repo_clone symbol to locate the
code).
- Around line 375-376: The class-level mutable `_hosted_repos` dict should be
annotated as a ClassVar to signal it's shared across instances and satisfy
static type checkers: update the annotation of `_hosted_repos` to
`ClassVar[Dict[str, HostedGitRepo]] = {}` and add/import `ClassVar` from
`typing` if not already present; keep the existing variable name `_hosted_repos`
and the `HostedGitRepo` type reference unchanged so static analyzers understand
it's a class-level mutable.
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)
352-364: Consider optimizing concurrent fetch operations.Multiple threads calling
fetch()simultaneously will each copy the repository, perform a fetch, and move it back. While the finalshutil.move()will overwrite and leave the repository in a consistent state, this approach wastes resources with redundant fetch operations.Consider adding a flag or lock to prevent concurrent fetches of the same repository.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/sourcecache.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/sourcecache.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
🧬 Code graph analysis (1)
src/snakemake/sourcecache.py (5)
src/snakemake/utils.py (1)
format(523-567)src/snakemake/common/__init__.py (1)
LockFreeWritableFile(391-432)src/snakemake/exceptions.py (1)
SourceFileError(200-202)src/snakemake/common/git.py (1)
split_git_path(7-14)src/snakemake/io/__init__.py (4)
check(468-471)check(953-1000)exists(473-477)get(1913-1918)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
338-338: Consider moving this statement to an else block
(TRY300)
348-348: Consider moving this statement to an else block
(TRY300)
375-375: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
480-484: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
480-484: Avoid specifying long messages outside the exception class
(TRY003)
520-522: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
520-522: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (45)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: apidocs
🔇 Additional comments (6)
src/snakemake/sourcecache.py (6)
12-14: LGTM! Clean additions to imports and helper utilities.The new imports and helper functions are well-structured:
TYPE_CHECKINGis used correctly for conditional imports_format_or_noneand_replace_suffixare useful utility functionsAlso applies to: 21-21, 32-32, 35-40, 55-59
130-146: LGTM! Consistent implementation across concrete SourceFile classes.The new methods are properly implemented across
GenericSourceFile,LocalSourceFile, andLocalGitFile:
- Path manipulation methods use the
_replace_suffixhelper correctlycheck()appropriately delegates tosnakemake.io.checkwhere applicableformat()properly handles all relevant fieldsAlso applies to: 166-182, 214-257
423-455: LGTM! Well-implemented git-backed hosting features.The additions to
HostingProviderFileare well-structured:
- Path manipulation methods properly propagate all required fields including
cache_pathmtime()efficiently retrieves commit timestampopen()method with fetch-on-demand is a good approach__str__()provides useful representationAlso applies to: 500-522, 525-528, 538-538, 554-554, 561-562
567-575: LGTM! Auth property implementations are correct.The
authproperty implementations in bothGithubFileandGitlabFileproperly handle authentication tokens for git operations.Also applies to: 590-594
652-652: LGTM! Clean integration of git-backed hosting in SourceCache.The changes properly integrate the new git-backed hosting features:
- Renamed runtime cache directory for clarity
- Cache path propagation to
HostingProviderFileinstances- Routing of
HostingProviderFileto its specializedopen()methodAlso applies to: 685-686, 757-758
80-81: No action needed. The use oftyping.Selfin string-literal form (forward references) is fully compatible with Snakemake's minimum Python 3.11+ runtime requirement and requires no changes.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @src/snakemake/sourcecache.py:
- Around line 511-525: The open() method assigns fetch_error from
hosted_repo.fetch() but never uses it, so if fetching fails the subsequent
hosted_repo.repo.git.show(...) will raise an uninformative error; update open()
to check the result of hosted_repo.fetch() (the fetch_error variable returned by
hosted_repo.fetch()) and if it indicates failure include its details in the
WorkflowError message raised when git.GitCommandError occurs (or raise a
WorkflowError immediately when fetch_error shows failure) so that failures from
hosted_repo.fetch() are surfaced; reference the open() method,
hosted_repo.ref_exists/ref_exists, hosted_repo.file_exists/file_exists,
hosted_repo.fetch()/fetch_error, hosted_repo.repo.git.show(...) and the
WorkflowError/generic exception handling to locate and change the logic.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/sourcecache.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/sourcecache.py
🧠 Learnings (5)
📚 Learning: 2026-01-08T11:13:20.523Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3643
File: src/snakemake/sourcecache.py:474-492
Timestamp: 2026-01-08T11:13:20.523Z
Learning: In Python files under src/snakemake, ensure that the per-run cache_path for source files does not change within a single workflow run, so that HostingProviderFile._hosted_repos can be keyed reliably by repository name. This invariant should be documented and enforced (e.g., by freezing cache_path after initialization, avoiding mutations during a run, and ensuring any caching uses a run-scoped key). This pattern applies to all modules in the Snakemake source cache area where per-run caching and repository hosting are implemented.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-08-27T20:15:58.699Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3026
File: snakemake/assets/__init__.py:0-0
Timestamp: 2024-08-27T20:15:58.699Z
Learning: Do not suggest type annotations for `cls` arguments in class methods when reviewing code for the Snakemake project.
Applied to files:
src/snakemake/sourcecache.py
🧬 Code graph analysis (1)
src/snakemake/sourcecache.py (4)
src/snakemake/utils.py (1)
format(523-567)src/snakemake/common/__init__.py (1)
LockFreeWritableFile(391-432)src/snakemake/common/git.py (1)
split_git_path(7-14)src/snakemake/io/__init__.py (3)
check(468-471)check(953-1000)get(1913-1918)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
339-339: Consider moving this statement to an else block
(TRY300)
349-349: Consider moving this statement to an else block
(TRY300)
483-487: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
483-487: Avoid specifying long messages outside the exception class
(TRY003)
517-517: Local variable fetch_error is assigned to but never used
Remove assignment to unused variable fetch_error
(F841)
523-525: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
523-525: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (9, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: apidocs
🔇 Additional comments (7)
src/snakemake/sourcecache.py (7)
40-41: LGTM! Helper functions are well-designed.The
_format_or_noneand_replace_suffixutility functions correctly handle optional parameters and suffix replacement logic.Also applies to: 56-61
64-93: LGTM! Abstract method declarations enhance the SourceFile API.The new abstract methods (
check,format,endswith,replace_suffix) provide a consistent interface across all SourceFile implementations.
131-184: LGTM! Method implementations are correct and consistent.All SourceFile subclasses correctly implement the new abstract methods. The use of
_format_or_noneinLocalGitFile.format()properly handles optional parameters.Also applies to: 215-258
297-372: LGTM! Lock-free cloning strategy is sound.The
HostedGitRepoclass correctly implements lock-free repository cloning via temporary directory + atomic move. The bare clone approach is appropriate for read-only operations. Thefetch()method returns an error message string on failure andNoneon success, providing clear semantics for callers.
426-458: LGTM! Path manipulation and authentication methods are correctly implemented.The implementations of
endswith,replace_suffix,check,format,mtime,__str__, and theauthproperties inGithubFileandGitlabFileare all correct. Theformatmethod correctly propagatescache_paththrough to new instances.Also applies to: 503-509, 564-565, 574-597
687-689: LGTM! SourceCache correctly integrates with hosting provider files.The cache path propagation in
_cache_entry()(lines 688-689) and the routing ofHostingProviderFileto its ownopen()method (lines 760-761) correctly integrate the new git-backed hosting support.Also applies to: 760-761
476-495: Class-level caching relies on cache_path stability—confirmed by design.The
_hosted_reposclass-level cache is keyed by repository name alone, which safely relies oncache_pathbeing set once per workflow run and remaining stable thereafter. This invariant is maintained bySourceCache._cache_entry(line 689), which assignscache_pathexactly once when creating cache entries forHostingProviderFileinstances, and subsequent accesses only read the stored value.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/snakemake/sourcecache.py (1)
522-540: Consider consistent exception handling approach.The
open()method uses exception chaining (raise ... from e) at line 540, which differs from the exception handling pattern used elsewhere in this file (e.g., lines 771-774 for LocalGitFile). Based on learnings, exception chaining has traditionally been avoided in the Snakemake codebase for backward compatibility reasons.♻️ For consistency with existing patterns
- raise WorkflowError(msg) from e + raise WorkflowError(msg)tests/tests_using_conda.py (1)
6-6: Remove unused imports.The imports
partial(line 6) andHostingProviderFile(line 16) are not used anywhere in this test file and should be removed.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/snakemake/deployment/containerize.pysrc/snakemake/sourcecache.pytests/test_sourcecache.pytests/tests_using_conda.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/tests_using_conda.pytests/test_sourcecache.pysrc/snakemake/sourcecache.pysrc/snakemake/deployment/containerize.py
🧠 Learnings (11)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3643
File: src/snakemake/sourcecache.py:474-492
Timestamp: 2026-01-08T11:13:20.523Z
Learning: In Snakemake, the `cache_path` for source files does not change within a single workflow run, ensuring that all `HostingProviderFile` instances in a run share the same cache path. This makes it safe for the `_hosted_repos` class variable in `HostingProviderFile` to be keyed only by repository name.
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
tests/tests_using_conda.pytests/test_sourcecache.py
📚 Learning: 2026-01-08T11:13:20.523Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3643
File: src/snakemake/sourcecache.py:474-492
Timestamp: 2026-01-08T11:13:20.523Z
Learning: In Snakemake, the `cache_path` for source files does not change within a single workflow run, ensuring that all `HostingProviderFile` instances in a run share the same cache path. This makes it safe for the `_hosted_repos` class variable in `HostingProviderFile` to be keyed only by repository name.
Applied to files:
tests/tests_using_conda.pytests/test_sourcecache.py
📚 Learning: 2025-01-13T18:50:40.791Z
Learnt from: cademirch
Repo: snakemake/snakemake PR: 3107
File: snakemake/logging.py:20-20
Timestamp: 2025-01-13T18:50:40.791Z
Learning: Dependencies for Snakemake are managed through test-environment.yml file, which includes snakemake-interface-logger-plugins (>= 0.1.0) and other required packages.
Applied to files:
tests/tests_using_conda.py
📚 Learning: 2026-01-08T11:13:20.523Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3643
File: src/snakemake/sourcecache.py:474-492
Timestamp: 2026-01-08T11:13:20.523Z
Learning: In Python files under src/snakemake, ensure that the per-run cache_path for source files does not change within a single workflow run, so that HostingProviderFile._hosted_repos can be keyed reliably by repository name. This invariant should be documented and enforced (e.g., by freezing cache_path after initialization, avoiding mutations during a run, and ensuring any caching uses a run-scoped key). This pattern applies to all modules in the Snakemake source cache area where per-run caching and repository hosting are implemented.
Applied to files:
src/snakemake/sourcecache.pysrc/snakemake/deployment/containerize.py
📚 Learning: 2024-10-29T09:26:26.636Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: snakemake/deployment/conda.py:0-0
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `snakemake/deployment/conda.py`, the exception handling in the `_get_version` method is intentional. The method raises a `WorkflowError` when the version cannot be determined, and this behavior is desired. Do not suggest modifying this exception handling.
Applied to files:
src/snakemake/sourcecache.pysrc/snakemake/deployment/containerize.py
📚 Learning: 2025-01-23T08:17:08.873Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3263
File: snakemake/sourcecache.py:195-196
Timestamp: 2025-01-23T08:17:08.873Z
Learning: Including authentication tokens in Git repository URLs is accepted in the Snakemake codebase when using the HostedGitRepo class, as secure authentication handling through GitPython's credential helpers is not feasible in this context.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-08-27T20:15:58.699Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3026
File: snakemake/assets/__init__.py:0-0
Timestamp: 2024-08-27T20:15:58.699Z
Learning: Do not suggest type annotations for `cls` arguments in class methods when reviewing code for the Snakemake project.
Applied to files:
src/snakemake/sourcecache.py
📚 Learning: 2024-10-11T13:12:35.827Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/deployment/conda.py:85-88
Timestamp: 2024-10-11T13:12:35.827Z
Learning: In the `snakemake/deployment/conda.py` file, within the `Env` class, validation of `env_dir` occurs later in the code, so it's unnecessary to validate `env_dir` in the `__init__` method.
Applied to files:
src/snakemake/deployment/containerize.py
📚 Learning: 2024-10-11T12:49:08.705Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3132
File: snakemake/rules.py:1105-1105
Timestamp: 2024-10-11T12:49:08.705Z
Learning: In `snakemake/rules.py`, within the `Rule` class, `self.basedir` is a custom class (not a `pathlib.Path` object) that has a `join()` method. Therefore, using `self.basedir.join(conda_env)` is appropriate in this context.
Applied to files:
src/snakemake/deployment/containerize.py
🧬 Code graph analysis (3)
tests/tests_using_conda.py (2)
src/snakemake/workflow.py (1)
sourcecache(502-503)src/snakemake/sourcecache.py (1)
HostingProviderFile(385-580)
tests/test_sourcecache.py (1)
src/snakemake/sourcecache.py (5)
GithubFile(583-599)GitlabFile(602-624)repo(381-382)open(522-540)open(685-689)
src/snakemake/deployment/containerize.py (2)
src/snakemake/deployment/conda.py (3)
CondaEnvSpecType(986-1009)from_spec(992-1009)CondaEnvFileSpec(882-914)src/snakemake/sourcecache.py (1)
infer_source_file(627-657)
🪛 Ruff (0.14.10)
src/snakemake/sourcecache.py
347-347: Consider moving this statement to an else block
(TRY300)
357-357: Consider moving this statement to an else block
(TRY300)
494-498: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
494-498: Avoid specifying long messages outside the exception class
(TRY003)
src/snakemake/deployment/containerize.py
43-47: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: apidocs
🔇 Additional comments (11)
src/snakemake/deployment/containerize.py (1)
41-50: LGTM! Improved control flow.The refactoring correctly determines the conda env spec type before inferring the source file, which is more efficient than the previous approach. The changes properly:
- Check the spec type first (line 41) to validate it's FILE-based
- Only call
infer_source_fileafter validation (line 49), avoiding unnecessary work for non-FILE specs- Align with the updated
CondaEnvFileSpecconstructor that takes only the source file parameter (line 50)The logic is sound and the API usage matches the updated signatures from
conda.py.src/snakemake/sourcecache.py (8)
6-6: LGTM!The new imports and helper functions are well-structured:
- TYPE_CHECKING pattern correctly avoids runtime git import
_format_or_noneand_replace_suffixhelpers reduce code duplication- All imports are utilized in the code
Also applies to: 13-15, 22-24, 34-34, 37-39, 41-42, 57-62
65-66: LGTM!The abstract method additions (check, format, endswith, replace_suffix) ensure consistent API across all SourceFile implementations. The use of
smart_joinin thejoinmethod centralizes path joining logic.Also applies to: 82-94, 99-101
132-148: LGTM!The implementations of the new abstract methods across GenericSourceFile, LocalSourceFile, and LocalGitFile are consistent and correct. Each class appropriately implements
endswith,replace_suffix,check, andformataccording to its specific needs.Also applies to: 168-184, 216-259
298-383: LGTM!The HostedGitRepo class implementation is well-designed with robust concurrent access handling:
- Atomic repository cloning using tmpdir and rename for race-free initialization
- Proper cleanup on exceptions during cloning
- Appropriate error handling in ref_exists, file_exists, and fetch methods
- Use of retry logic with exponential backoff for network operations
The static analysis hints (TRY300) about moving statements to else blocks are false positives—the early returns are intentional and improve readability.
437-469: LGTM!The new methods in HostingProviderFile are well-implemented:
endswith,replace_suffix,check, andformatfollow the pattern established in other SourceFile subclassesmtime()correctly retrieves the last commit daterefproperty provides clean access to the active reference__str__provides useful string representationAlso applies to: 514-520, 542-546, 579-580
471-506: LGTM!The properties are well-implemented:
authprovides extensibility for subclassescache_pathgetter/setter with appropriate validation and subdirectory structurehosted_repouses thread-safe lazy initialization with class-level caching- Host validation prevents common misconfiguration
The static analysis hints (B904, TRY003) at lines 494-498 are false positives:
- B904 suggests exception chaining, but this is raising a new exception, not re-raising
- TRY003 suggests shorter messages, but the detailed message aids debugging
Based on learnings, the cache_path design correctly assumes it remains constant during a workflow run.
583-599: LGTM!The
authproperty additions to GithubFile and GitlabFile are well-designed. They provide authentication tokens in the correct format for Git repository URLs while maintaining clean separation of concerns. The token formatting with "@" suffix integrates seamlessly with the HostedGitRepo URL construction.Also applies to: 602-624
670-670: LGTM!The SourceCache updates correctly integrate hosting-backed source files:
- Runtime cache directory naming is clear and consistent
cache_pathassignment in_cache_entryensures HostingProviderFile instances have their cache path set before use (as per learnings)_openmethod properly delegates toHostingProviderFile.open()for hosting-backed sourcesAlso applies to: 703-705, 775-776
tests/test_sourcecache.py (2)
1-4: LGTM!The import additions and test function rename are appropriate:
- New imports (Path, tempfile, patch, GithubFile) are all utilized
- Test function renamed to follow Python naming conventions
Also applies to: 7-12
14-25: Verify test requirements and consider mocking fetch behavior.This test makes actual network requests to GitHub, which could cause issues:
- Requires network connectivity to pass
- May fail if the repository is unavailable or changes
- Depends on external service reliability
Consider either:
- Adding a
@connecteddecorator (similar to other network-dependent tests in the codebase)- Mocking additional methods (fetch, ref_exists) to make this a true unit test
- Explicitly documenting this as an integration test
🔍 Suggested improvements
Option 1: Add @connected decorator
from .common import connected @connected def test_github_file_fetch(): # ... existing test codeOption 2: Mock fetch to make it a unit test
def test_github_file_fetch(): with patch("snakemake.sourcecache.HostedGitRepo.file_exists", return_value=False), \ patch("snakemake.sourcecache.HostedGitRepo.ref_exists", return_value=False), \ patch("snakemake.sourcecache.HostedGitRepo.fetch", return_value=None) as mock_fetch: with tempfile.TemporaryDirectory() as tmpdir: file = GithubFile( repo="snakemake/snakemake", path="README.md", branch="main", cache_path=Path(tmpdir), ) # This would require additional mocking of git.Repo to avoid actual network calls # ...
🤖 I have created a release *beep* *boop* --- ## [9.14.6](v9.14.5...v9.14.6) (2026-01-08) ### Bug Fixes * create local clone of git repos for source files from hosting providers ([#3643](#3643)) ([d2f8aba](d2f8aba)) * create potentially missing .snakemake folder in case of very long command lines for spawned jobs ([#3894](#3894)) ([4b431dd](4b431dd)) * make ilp solver enumeration lazy ([#3900](#3900)) ([30e1509](30e1509)) * Prevent broken report_href links by using deterministic report IDs with fixed prefix length ([#3889](#3889)) ([6d8f4d8](6d8f4d8)) * refactor LoggerManager setup and scope ([#3851](#3851)) ([f46d904](f46d904)) * yield proper error message in case a local git source file is not retrievable ([#3892](#3892)) ([ed79cae](ed79cae)) ### Documentation * explain how to pass nested config via CLI ([#3885](#3885)) ([9d8c539](9d8c539)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
|
Hi @johanneskoester, thank you for implementing this fix. I believe I am unfortunately seeing the error I originally reported, and I was wondering whether you could take a look. After updating Snakemake to 9.14.6 and testing with my repo (ezherman/minimal_example_snakemake_module, branch Inspecting I expected it to point to the local cached clone so Python could resolve imports like To summarize:
Is there a step I might be missing? Thanks in advance. |
Description
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.