Conversation
📝 WalkthroughWalkthroughUpdates CI Python matrix to 3.11–3.13; switch TOML parsing to stdlib tomllib; add split_code_string and rework string-alignment to mask/split/restore strings; update formatter typing and Black exception import; and adapt tests to use CliRunner stdout and new Black imports. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as User/CLI
participant F as Formatter
participant S as split_code_string
participant B as Black (parsing)
U->>F: format(snakefile_text)
F->>S: split_code_string(code)
S-->>F: segments (pre, string, post)
rect rgba(220,235,255,0.35)
F->>F: mask string segments with placeholders
F->>F: indent/format non-string segments (params, parens)
F->>F: restore original string content
end
F->>B: run_black_format_str(python_blocks)
alt Black success
B-->>F: formatted python blocks
F-->>U: combined formatted Snakefile
else black.parsing.InvalidInput
B-->>F: raise parsing error
F-->>U: propagate error
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Possibly related PRs
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
snakefmt/config.py (1)
130-139: Bug: kebab-case skip- options aren’t normalized before slicing.*Keys like
skip-string-normalizationbecome_string_normalizationand get ignored. Normalize hyphens first and invert as bool.Apply this diff:
- for key, val in config.items(): - # this is due to FileMode param being string_normalise, but CLI being - # skip_string_normalise - https://github.com/snakemake/snakefmt/issues/73 - if key.startswith("skip"): - key = key[5:] - val = not val - - key = key.replace("-", "_") + for key, val in config.items(): + # Normalize kebab->snake first + key_norm = key.replace("-", "_") + # Handle skip_* options (invert semantics) + if key_norm.startswith("skip_"): + key = key_norm[5:] + val = not bool(val) + else: + key = key_norm
🧹 Nitpick comments (6)
snakefmt/config.py (1)
128-142: Optional: avoid mutating Black’s Mode in-place.Depending on Black version, Mode may be frozen; safer to accumulate updates and use dataclasses.replace.
- for key, val in config.items(): + updates = {} + for key, val in config.items(): ... - setattr(black_mode, key, val) + updates[key] = val + if updates: + from dataclasses import replace + black_mode = replace(black_mode, **updates)tests/test_snakefmt.py (1)
365-371: Minor: prefer comprehension over map+lambda for clarity.Not required, but reads cleaner.
- smks = get_snakefiles_in_dir( + smks = get_snakefiles_in_dir( path=abs_tmpdir, include=include, exclude=exclude, gitignore=gitignore, ) - snakefiles = list(map(lambda p: str(p.relative_to(abs_tmpdir)), smks)) + snakefiles = [str(p.relative_to(abs_tmpdir)) for p in smks]tests/test_formatter.py (2)
29-31: Nit: variable name “stream” is misleading (it’s a string, not a stream)Rename to snakecode for consistency with the rest of the file.
- stream = "rule a:\n" f'{TAB * 1}input: "foo.txt"' - formatter = setup_formatter(stream) + snakecode = "rule a:\n" f'{TAB * 1}input: "foo.txt"' + formatter = setup_formatter(snakecode)
292-321: Strong regression test for issue #190: very long strings inside nested Python in run:
- Idempotence assertion is the right oracle here.
- Consider parametrizing the long lengths (68, 117) to document why these numbers trigger the prior failure (token/line-length threshold).
- ' "' + ("a" * 68) + '",\n' + # length chosen to exercise prior breaking threshold (see #190) + ' "' + ("a" * 68) + '",\n' @@ - "' + ("b" * 117) + '",\n' + # length chosen to exercise prior breaking threshold (see #190) + "' + ("b" * 117) + '",\n'snakefmt/parser/syntax.py (2)
62-99: split_code_string: robust identification of multi-line string/f-string spans
- Correctly tracks nested FSTRING_START/END.
- Returns alternating [pre, string, post, ...] segments as designed.
Two small improvements:
- Combine nested ifs for STRING to satisfy SIM102.
- Add a short doc note that returned list alternates code and multi-line string blocks, starting with code.
- if token.type == tokenize.STRING: - if token.start[0] != token.end[0]: - string_areas.append((token.start, token.end)) + if token.type == tokenize.STRING and token.start[0] != token.end[0]: + string_areas.append((token.start, token.end))
101-110: _extract_line_mid: cross-line slice logic is subtle—add bounds assertionsTo prevent accidental off-by-one or negative indexing on malformed coordinates, assert start <= end and indices within bounds. Low overhead, safer.
def _extract_line_mid( lines: list[str], start: tuple[int, int], end: tuple[int, int] ) -> str: + # Defensive checks against malformed token coords + assert 1 <= start[0] <= end[0] <= len(lines), (start, end, len(lines)) + assert 0 <= start[1] <= len(lines[start[0] - 1]) + assert 0 <= end[1] <= len(lines[end[0] - 1]) s = "".join(lines[i] for i in range(start[0] - 1, end[0])) t = s[start[1] :] end_trim = end[1] - len(lines[end[0] - 1]) if end_trim != 0: t = t[:end_trim] return t
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (9)
.github/workflows/ci.yaml(1 hunks)snakefmt/__init__.py(2 hunks)snakefmt/config.py(3 hunks)snakefmt/formatter.py(5 hunks)snakefmt/parser/syntax.py(2 hunks)tests/conftest.py(1 hunks)tests/test_config.py(5 hunks)tests/test_formatter.py(6 hunks)tests/test_snakefmt.py(11 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/conftest.pysnakefmt/config.pytests/test_snakefmt.pytests/test_formatter.pysnakefmt/parser/syntax.pytests/test_config.pysnakefmt/__init__.pysnakefmt/formatter.py
🧬 Code graph analysis (4)
tests/test_snakefmt.py (2)
tests/conftest.py (1)
cli_runner(6-7)snakefmt/snakefmt.py (2)
main(165-308)get_snakefiles_in_dir(41-76)
tests/test_formatter.py (2)
tests/__init__.py (1)
setup_formatter(9-13)snakefmt/formatter.py (1)
get_formatted(74-75)
snakefmt/parser/syntax.py (1)
snakefmt/types.py (1)
Token(7-11)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (1)
Snakefile(29-52)snakefmt/parser/syntax.py (1)
split_code_string(62-98)
🪛 Ruff (0.12.2)
snakefmt/parser/syntax.py
74-75: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build (3.13, ubuntu-latest)
🔇 Additional comments (27)
.github/workflows/ci.yaml (1)
17-17: Verify packaging metadata and CI dependencies
Matrix updated to Python 3.11–3.13—great. Please manually confirm the following, as I could not locate anypyproject.toml,setup.py, orsetup.cfgin this branch:
- Packaging metadata enforces Python >= 3.11:
•python_requires = ">=3.11"(or equivalent)
• Trove classifiers includeProgramming Language :: Python :: 3.11,3.12, and3.13- CI/test dependencies support 3.11+:
• Black’s minimum version inpyproject.toml(orrequirements-dev.txt)
• Click’s version constraint aligns with Python 3.11+- Update or add the metadata file at the project root if missing.
tests/conftest.py (1)
7-7: Confirmed: Click’s CliRunner still separates stderr by defaultVerified on Click 8.1.8 that
CliRunner.__init__signature includesmix_stderr: bool = TrueCliRunner.invokecontinues to accept separate stdout/stderr streamsNo changes to the fixture in tests/conftest.py are required.
snakefmt/config.py (3)
82-88: Binary read + tomllib.load — LGTM.Correct file mode and error surfacing via click.FileError.
121-126: Consistent tomllib usage in Black config — LGTM.Matches the stdlib parser and wraps parse errors in MalformattedToml.
5-5: Ensure project metadata is updated for stdlib tomllib and Python ≥3.11I wasn’t able to locate a
pyproject.toml,setup.cfg, orsetup.pyin the repo—please double-check that your project metadata file is present and updated:
- Remove the third-party
tomlentry from the[tool.poetry].dependencies(orinstall_requiresinsetup.cfg/setup.py).- Add or update the Python requirement to
python = ">=3.11"(orpython_requires = ">=3.11").- Verify that any CI configurations (e.g. GitHub Actions, tox.ini, etc.) and documentation are aligned with Python 3.11+ and don’t install the old
tomlpackage.tests/test_config.py (5)
23-23: Defaults aligned with Black — LGTM.
53-71: Stdout assertions updated — LGTM.Matches CliRunner behavior after fixture change.
87-90: Equality/inequality checks against stdin — LGTM.
259-259: Tomllib error text assertion — portable enough.The substring match “Invalid statement” should be stable across 3.11–3.13.
Please confirm CI on 3.11/3.12/3.13 shows identical message; if it diverges, relax to a broader regex.
5-5: Ensure Black’sconstModule Is AvailableThe new import in tests/test_config.py:
import black.constwill fail on Black versions prior to the introduction of the
constmodule. Please:• Verify which Black release first added
black.const(it’s available from Black 24.4.0 onward).
• Confirm your project’s declared Black dependency meets this minimum. For example, in setup.py or pyproject.toml:- install_requires = [ - "black", - … - ] + install_requires = [ + "black>=24.4.0,<25.0.0", + … + ]• If you need to support earlier Black versions, either:
– Add a fallback in tests (e.g., wrap the import intry/exceptand mockblack.const), or
– Update documentation to require Black 24.4.0+.tests/test_snakefmt.py (10)
18-18: Direct invoke with no args — LGTM.
53-53: Assert on stdout for stdin formatting — LGTM.
188-188: Diff stdout assertion — LGTM.
198-198: Empty stdout when error with --check --diff — LGTM.
219-219: Unified diff output assertion — LGTM.
241-241: Compact diff stdout assertion — LGTM.
262-262: Prefers compact diff when both flags set — LGTM.
287-287: No diff on syntax error — LGTM.
382-382: Counter() comparison simplified — LGTM.
9-9: Ensure Black is pinned andget_gitignoreis availableI didn’t find any dependency file (pyproject.toml, setup.cfg, requirements*.txt, or Pipfile) that pins Black in this repo. Please verify that:
- Black is declared as a development dependency in your project’s configuration.
- You’ve pinned to a Black release that includes the helper
black.files.get_gitignore.- Your CI/test environment installs that same Black version so the import
will succeed.from black.files import get_gitignoretests/test_formatter.py (3)
15-15: Good refactor: centralized formatter setup via tests.setup_formatterThis simplifies tests and reduces boilerplate. Nice.
209-229: Test expectation correctly captures expand(...) spacing normalizationThe expected value matches the formatter’s normalization (no spaces around keyword args and trailing commas). Looks good.
262-275: Nice regression guard for lambda with “input” identifier and nested bracketsCovers the tricky tokenizer/parameter split edge case; assertion against exact identity is appropriate.
snakefmt/__init__.py (2)
15-16: Version-gated f-string tokeniser flag is clear and correctClean single source of truth used by parser/syntax; no action needed.
19-23: Confirm Black supports the new PY313 targetI couldn’t locate where Black is pinned in this repo (no pyproject.toml, setup.cfg, requirements.txt or poetry.lock entry) and Black isn’t installed in the checked-out environment, so we can’t automatically verify that
TargetVersion.PY313is available. Please double-check in your dependency configuration (e.g.pyproject.toml, Poetry settings, CI image, etc.) that you’re using a Black release which definesblack.mode.TargetVersion.PY313before merging.snakefmt/parser/syntax.py (1)
160-172: add_token_space: None-safe prev_token handling is correctSignature and guard now match usage sites (first token in a param). Good.
snakefmt/formatter.py (1)
194-217: Exception path updated to black.parsing.InvalidInputGood catch with Black’s API; error mapping to InvalidPython remains intact.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
snakefmt/formatter.py (1)
7-7: Black submodule import is fine; ensure dependency is declared (already raised).Using
import black.parsingmakesblack.format_straccessible via theblackpackage binding. Just ensure Black is a runtime dependency in packaging. This was flagged previously; no further action here.
🧹 Nitpick comments (3)
snakefmt/formatter.py (3)
193-216: Harden Black exception compatibility across versions.Catching
black.parsing.InvalidInputmay break on environments where the class is exposed asblack.InvalidInput. Use a version-agnostic alias and catch that.Apply this diff within this hunk:
- except black.parsing.InvalidInput as e: + except BlackInvalidInput as e:Add this alias near the top-level imports (outside this hunk):
# version-agnostic InvalidInput alias try: from black.parsing import InvalidInput as BlackInvalidInput # old/existing except Exception: # pragma: no cover from black import InvalidInput as BlackInvalidInput # fallback
229-243: String-alignment mask/restore approach addresses the edge case; add a sanity check.The
.strip()on the indented delimiter is necessary given mid-line delimiters infakewrap(per prior discussion/learning). Add a quick assert to fail fast if the split count ever desynchronizes.Apply this diff:
- split_code = fakewrap.split(textwrap.indent('"""\n"""', used_indent).strip()) + split_code = fakewrap.split(textwrap.indent('"""\n"""', used_indent).strip()) + assert len(split_code) == (len(split_string) + 1) // 2, ( + "align_strings: masking split mismatch", + len(split_code), + len(split_string), + )Please confirm we have a regression test that reproduces issue #190 (nested prints, very long strings/f-strings inside a run: block) and passes with this implementation.
327-337: Avoid double-iteratingparameters.all_params; cache once to ensure stability.If
all_paramsis a generator-like iterable, callingiter(...)and then looping again can consume elements. Cache to a tuple, use first element for inline check, and reuse for the loop.Apply this diff:
- inline_fmting = p_class is InlineSingleParam + inline_fmting = p_class is InlineSingleParam + params = tuple(parameters.all_params) ... - param = next(iter(parameters.all_params)) + param = params[0] ... - result = f"{prepended_comments}{result} {param_result}" + result = f"{prepended_comments}{result} {param_result}" else: - result = f"{result}{parameters.comment}\n" - for param in parameters.all_params: + result = f"{result}{parameters.comment}\n" + for param in params: result += self.format_param( param, target_indent, inline_fmting, param_list )Confirm
parameters.all_paramsis re-iterable today; if not, the above change is required to prevent missing params when inlining falls back to multiline.Also applies to: 345-351
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
snakefmt/formatter.py(6 hunks)tests/test_formatter.py(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_formatter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.
Applied to files:
snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (2)
Snakefile(29-52)comment_start(55-56)snakefmt/parser/syntax.py (1)
split_code_string(62-98)
🔇 Additional comments (2)
snakefmt/formatter.py (2)
12-23: Parser/TAB/split_code_string imports look correct and align with new flow.The move to public
Snakefile,Parser,comment_start, plus importingTABandsplit_code_stringmatches the refactor.
59-59: Constructor type update to Snakefile is appropriate.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
snakefmt/formatter.py(7 hunks)tests/test_formatter.py(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_formatter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.
Applied to files:
snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (4)
snakefmt/logging.py (2)
Warnings(34-46)comment_relocation(41-46)snakefmt/parser/parser.py (2)
Snakefile(29-52)comment_start(55-56)snakefmt/parser/syntax.py (4)
split_code_string(62-98)InlineSingleParam(572-574)all_params(441-442)line_nb(303-304)snakefmt/types.py (1)
line_nb(14-15)
🔇 Additional comments (8)
snakefmt/formatter.py (8)
12-12: LGTM! Clean import organizationThe import of
Snakefilefrom the parser module properly reflects the architectural change whereFormatternow accepts aSnakefileobject instead ofTokenIterator.
21-21: Good addition of thesplit_code_stringutilityThe import of
split_code_stringaligns with the refactored string alignment approach that aims to fix the edge cases mentioned in the PR objectives.
59-59: Type annotation improvementThe parameter type change from
TokenIteratortoSnakefileis a good architectural improvement that provides better type safety and clearer API boundaries.
193-193: Error handling correctly updatedThe exception handling has been properly updated to catch
black.parsing.InvalidInputinstead ofblack.InvalidInput, maintaining consistency with the import change at line 7.
229-243: Excellent refactor of string alignment logicThis is a significant improvement that addresses the core issue. The new approach using
split_code_stringis more robust:
- Handles single-piece strings efficiently with direct indentation
- Uses a masking approach for multi-piece strings which should prevent the formatting issues with nested expressions and long strings mentioned in the PR objectives
- The
.strip()call on line 237 is correctly used to match the delimiter pattern in the masked contentThis change directly addresses the issues from #190, #208, and #240 by providing a safer fallback mechanism.
267-267: Good: wrapping in function call for safe parsingWrapping the parameter value in
f({val}\n)provides a safe context for Black to parse complex expressions, which should help prevent the issues with lambda and parenthesized expressions mentioned in #208 and #240.
325-343: Critical improvement for inline formatting logicThe refactored inline formatting logic is a key fix:
- It now pre-formats the parameter to check if it fits on one line (line 331-334)
- Falls back to multi-line formatting when the parameter is too complex
- This directly addresses the lambda and long expression issues from #208 and #240 by providing a proper fallback mechanism
This "try inline, fallback to normal" approach is exactly what was needed to fix the edge cases.
345-349: Verify parameter formatting consistency after refactorI didn’t find any standalone fixtures or snapshot files in
.txt,.smake/.smk, or Snakefile formats that document the old parameter formatting, but I did spot two Python test cases that hard-code example parameters with lambdas:
- A
time_min=lambda wildcards, attempt: (…)case- A
mem_mb=lambda wildcards, attempt, mem=1000: attempt * memcasePlease run the full test suite (e.g.
pytest) and confirm that the formatter’s output for these scenarios remains byte-for-byte identical to what the tests expect. If you see any diffs in those lambda-parameter cases, update the tests’ expected strings or tweak the formatter to preserve the original line breaks and commas exactly.
There was a problem hiding this comment.
Thanks for the PR @Hocnonsense. Please see two comments on the tests you added. As you say this PR will fix #208 too, could you please also add a test using the example from that issue please?
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/test_formatter.py (2)
29-31: Rename variable for clarity: it's code, not a stream.
Minor readability nit—use snakecode consistently across tests.- stream = "rule a:\n" f'{TAB * 1}input: "foo.txt"' - formatter = setup_formatter(stream) + snakecode = "rule a:\n" f'{TAB * 1}input: "foo.txt"' + formatter = setup_formatter(snakecode)
292-343: Comprehensive regression for long/nested calls in run blocks (issue #190).
Expected normalization (indent, trailing commas, quote style) looks right.Consider asserting idempotence on the normalized output as well:
actual = setup_formatter(snakecode).get_formatted() assert actual == expected + assert setup_formatter(expected).get_formatted() == expected
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
tests/test_formatter.py(8 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/test_formatter.py
🧬 Code graph analysis (1)
tests/test_formatter.py (2)
tests/__init__.py (1)
setup_formatter(9-13)snakefmt/formatter.py (1)
get_formatted(74-75)
🔇 Additional comments (6)
tests/test_formatter.py (6)
15-15: Importing setup_formatter from tests is correct and simplifies the harness.
Keeps tests decoupled from internal constructors.
209-216: expand(...) regression looks good.
Covers spacing normalization and trailing comma; matches objectives.
262-275: Lambda with input keyword and nested parentheses — idempotence verified.
Solid coverage for the lambda parsing edge case.
344-362: Inline param formatting with comments/parentheses — good coverage (issues #208, #240, #242).
Idempotence assertion is appropriate here.
1221-1223: Whitespace before comment normalized — expectation is correct.
Confirms comment reflow between repeated keywords.
541-541: All mock_m.call_count assertions updated to 3 — no remaining tests expect 2.
- Verified across all test files with:
(no matches found)rg -nP '\bassert\s+mock_m\.call_count\s*==\s*2\b' -g 'tests/**.py' -C2- No further action needed.
|
Oh, I have miss something, sorry and I'll have a look. Sorry for my misunderstanding. @mbhall88 Now the test is added. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #259 +/- ##
==========================================
- Coverage 97.53% 96.43% -1.10%
==========================================
Files 12 12
Lines 1095 1122 +27
Branches 251 217 -34
==========================================
+ Hits 1068 1082 +14
- Misses 14 22 +8
- Partials 13 18 +5
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/test_formatter.py (1)
208-229: Assertion pattern matches suite convention.Using explicit expected here follows the earlier guidance to assert either identity or a concrete expected string.
🧹 Nitpick comments (2)
tests/test_formatter.py (2)
29-31: Name the variable snakecode for consistency.Most tests now use snakecode; align this one for readability.
- stream = "rule a:\n" f'{TAB * 1}input: "foo.txt"' - formatter = setup_formatter(stream) + snakecode = "rule a:\n" f'{TAB * 1}input: "foo.txt"' + formatter = setup_formatter(snakecode)
571-571: Avoid brittle assertion on exact Black invocation count.Internal segmentation may change; assert semantics instead of an exact count.
- assert mock_m.call_count == 3 + assert mock_m.call_count >= 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
tests/test_formatter.py(8 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/test_formatter.py
🧬 Code graph analysis (1)
tests/test_formatter.py (2)
tests/__init__.py (1)
setup_formatter(9-13)snakefmt/formatter.py (1)
get_formatted(74-75)
🔇 Additional comments (5)
tests/test_formatter.py (5)
15-15: Good move to centralize the test harness with setup_formatter.Reduces boilerplate and keeps formatter config consistent across tests.
262-275: Adds regression for lambda with input keyword and nested parentheses (issue #208).Covers the parser ambiguity and ensures no unintended reflow.
344-371: Inline lambda long-line fallback to multiline looks correct (issue #208).Validates the formatter’s “one-line → multiline” safety net.
372-392: Good coverage for comments inside parenthesized params and nested max(...) (issues #240 and #242).Exercising the Black-handled merge behavior without breaking syntax.
1251-1254: Comment normalization between repeated keywords looks right.Trims stray leading space and preserves intended spacing rules.
|
Thanks @Hocnonsense this looks awesome. |
|
Sure! I've removed it. |
|
I think it is fine to be merged now! |
There was a problem hiding this comment.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
pyproject.tomlis excluded by!pyproject.toml
📒 Files selected for processing (1)
snakefmt/formatter.py(7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.
Applied to files:
snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (2)
Snakefile(29-52)comment_start(55-56)snakefmt/parser/syntax.py (1)
split_code_string(62-98)
🔇 Additional comments (4)
snakefmt/formatter.py (4)
12-23: Imports refactor looks goodBringing
Snakefileinto annotations and importingsplit_code_string/TABhere is appropriate and aligns this module with the new helpers.
59-60: Type annotation update LGTMAnnotating
snakefilewithSnakefileimproves clarity and matches the parser API.
267-273: Trailing whitespace trimming: confirm non-ParamList path
.rstrip()now runs only wheninline_formattingorparam_listis true. For single-parameter, non-list cases withinline_formatting=False, trailing spaces could slip through. Black usually trims them, but please verify.You can add tests that format a non-ParamList parameter (e.g.,
shell:with a long literal/comment) and assert no trailing spaces on any line of the output. If it fails, consider reverting to unconditionalval = val.rstrip()before Black formatting.
347-353: Inline vs multiline flow looks correct
- Relocating pre-comments only when inlining and warning via
Warnings.comment_relocationis consistent with the PR goals.- The multiline fallback loops over all params via
format_param, satisfying the “safe multiline” requirement.
| from typing import Optional | ||
|
|
||
| import black | ||
| import black.parsing |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Make Black import/exception handling version-agnostic and explicit
Relying on import black.parsing to populate the black name is fragile, and InvalidInput’s location has varied across Black versions. Import black explicitly and alias InvalidInput with a compatibility fallback; then catch the alias.
Apply this diff:
@@
-import black.parsing
+import black
+try: # Black >= 24.x
+ from black.parsing import InvalidInput as BlackInvalidInput
+except Exception: # Black < 24.x
+ from black import InvalidInput as BlackInvalidInput
@@
- except black.parsing.InvalidInput as e:
+ except BlackInvalidInput as e:Also applies to: 193-193
🤖 Prompt for AI Agents
In snakefmt/formatter.py at lines 7 and 193, the code imports black.parsing and
catches InvalidInput directly, which is brittle across Black versions; instead
add an explicit import for black and create a compatibility alias for
InvalidInput (try to import InvalidInput from black, fall back to
black.parsing.InvalidInput), then update the exception handling at line 193 to
catch that alias; ensure both the top-of-file import and the except clause
reference the new alias so the import/exception handling is version-agnostic.
| split_string = split_code_string(string) | ||
| if len(split_string) == 1: | ||
| return textwrap.indent(split_string[0], used_indent) | ||
| # First, masks all multi-line strings | ||
| mask_string = "`~!@#$%^&*|?" | ||
| while mask_string in string: | ||
| mask_string += mask_string | ||
| mask_string = f'"""{mask_string}"""' | ||
| fakewrap = textwrap.indent( | ||
| "".join(mask_string if i % 2 else s for i, s in enumerate(split_string)), | ||
| used_indent, | ||
| ) | ||
| split_code = fakewrap.split(mask_string) | ||
| # After indenting, we puts those strings back | ||
| indented = "".join( | ||
| s.replace("\t", TAB) if i % 2 else split_code[i // 2] | ||
| for i, s in enumerate(split_string) | ||
| ) |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Do not mutate string literal contents; strengthen masking
align_strings currently replaces tabs inside multi-line string tokens (s.replace("\t", TAB)), altering user string contents (including f-strings). That’s a semantic change and can break legitimate data. Also, make the mask uniqueness check against the final triple-quoted sentinel and add a sanity check on the split.
Apply this diff:
@@
- # First, masks all multi-line strings
- mask_string = "`~!@#$%^&*|?"
- while mask_string in string:
- mask_string += mask_string
- mask_string = f'"""{mask_string}"""'
+ # First, mask all multi-line strings with a triple-quoted sentinel that cannot collide
+ mask_content = "`~!@#$%^&*|?"
+ while f'"""{mask_content}"""' in string:
+ mask_content += mask_content
+ mask_string = f'"""{mask_content}"""'
@@
- split_code = fakewrap.split(mask_string)
+ split_code = fakewrap.split(mask_string)
+ assert len(split_code) == (len(split_string) + 1) // 2, (
+ "align_strings: masking split mismatch",
+ len(split_code),
+ len(split_string),
+ )
@@
- indented = "".join(
- s.replace("\t", TAB) if i % 2 else split_code[i // 2]
- for i, s in enumerate(split_string)
- )
+ indented = "".join(
+ (s if i % 2 else split_code[i // 2]) for i, s in enumerate(split_string)
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| split_string = split_code_string(string) | |
| if len(split_string) == 1: | |
| return textwrap.indent(split_string[0], used_indent) | |
| # First, masks all multi-line strings | |
| mask_string = "`~!@#$%^&*|?" | |
| while mask_string in string: | |
| mask_string += mask_string | |
| mask_string = f'"""{mask_string}"""' | |
| fakewrap = textwrap.indent( | |
| "".join(mask_string if i % 2 else s for i, s in enumerate(split_string)), | |
| used_indent, | |
| ) | |
| split_code = fakewrap.split(mask_string) | |
| # After indenting, we puts those strings back | |
| indented = "".join( | |
| s.replace("\t", TAB) if i % 2 else split_code[i // 2] | |
| for i, s in enumerate(split_string) | |
| ) | |
| split_string = split_code_string(string) | |
| if len(split_string) == 1: | |
| return textwrap.indent(split_string[0], used_indent) | |
| # First, mask all multi-line strings with a triple-quoted sentinel that cannot collide | |
| mask_content = "`~!@#$%^&*|?" | |
| while f'"""{mask_content}"""' in string: | |
| mask_content += mask_content | |
| mask_string = f'"""{mask_content}"""' | |
| fakewrap = textwrap.indent( | |
| "".join(mask_string if i % 2 else s for i, s in enumerate(split_string)), | |
| used_indent, | |
| ) | |
| split_code = fakewrap.split(mask_string) | |
| assert len(split_code) == (len(split_string) + 1) // 2, ( | |
| "align_strings: masking split mismatch", | |
| len(split_code), | |
| len(split_string), | |
| ) | |
| indented = "".join( | |
| (s if i % 2 else split_code[i // 2]) for i, s in enumerate(split_string) | |
| ) |
suggestion from @coderabbitai[bot] Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (4)
snakefmt/formatter.py (4)
7-7: Make Black import/exception handling version-agnosticCatching black.parsing.InvalidInput is brittle across Black versions. Import Black explicitly and alias the exception with a fallback.
Apply this diff:
-import black.parsing +import black +try: # Black >= 24.x + from black.parsing import InvalidInput as BlackInvalidInput +except Exception: # Older Black + from black import InvalidInput as BlackInvalidInput @@ - except black.parsing.InvalidInput as e: + except BlackInvalidInput as e:Also applies to: 193-193
268-271: Trailing whitespace trimming now conditional—add coverageSince rstrip() no longer runs unconditionally, add tests ensuring no trailing spaces leak when inline_formatting=False and param_list=False, and when wrapping via the temporary f(…) call.
333-344: Inline fallback and empty-params guard: good fixThe StopIteration guard and measuring single-line via count("\n") are pragmatic and solve the lambda/long-value edge cases.
244-246: Do not mutate multi-line string literal contentsReplacing tabs inside user string tokens changes data (including f-strings, embedded scripts). Preserve string segments verbatim.
- indented = "".join( - s.replace("\t", TAB) if i % 2 else split_code[i // 2] - for i, s in enumerate(split_string) - ) + indented = "".join( + (s if i % 2 else split_code[i // 2]) for i, s in enumerate(split_string) + )
🧹 Nitpick comments (1)
snakefmt/formatter.py (1)
229-243: String masking/indent flow is solid; add safety checks and tighter uniquenessGreat switch to split_code_string + masking. Add a sanity check to detect split mismatches and check uniqueness against the final triple-quoted sentinel to avoid unnecessary growth.
- mask_string = "`~!@#$%^&*|?" - while mask_string in string: - mask_string += mask_string - mask_string = f'"""{mask_string}"""' + mask_string = "`~!@#$%^&*|?" + while f'"""{mask_string}"""' in string: + mask_string += mask_string + mask_string = f'"""{mask_string}"""' @@ - split_code = fakewrap.split(mask_string) + split_code = fakewrap.split(mask_string) + assert len(split_code) == (len(split_string) + 1) // 2, ( + "align_strings: masking split mismatch", + len(split_code), + len(split_string), + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
snakefmt/formatter.py(7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.
Applied to files:
snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (2)
Snakefile(29-52)comment_start(55-56)snakefmt/parser/syntax.py (1)
split_code_string(62-98)
🔇 Additional comments (5)
snakefmt/formatter.py (5)
12-12: Imports of Parser/Snakefile/split_code_string/TAB look correctThese align with the new formatter flow and public API exposure.
Also applies to: 21-21, 23-23
59-59: Constructor type annotation: LGTMUsing Snakefile here improves clarity without affecting runtime.
329-329: InlineSingleParam detection is clearDirect class check is appropriate here.
352-352: Comment relocation when inlining: LGTMWarns appropriately and preserves readability.
354-358: Multiline path over all params: LGTMIteration + delegating to format_param keeps behavior consistent.
| Code for searching for and parsing snakefmt configuration files | ||
| """ | ||
|
|
||
| import tomllib |
There was a problem hiding this comment.
This will also require an update in the dependencies in snakefmt's pyproject.toml
will fix #190.
and also fix #208 and fix #240, now the "one-line" format will fall back to normal style,
and also fix #242, just let black to merge lines and handle comments.
and also close #255
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests