Skip to content

fix: edge cases when snakefmt break snakefile#259

Merged
mbhall88 merged 11 commits intomasterfrom
fix-190
Aug 31, 2025
Merged

fix: edge cases when snakefmt break snakefile#259
mbhall88 merged 11 commits intomasterfrom
fix-190

Conversation

@Hocnonsense
Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense commented Aug 27, 2025

will fix #190.

and also fix #208 and fix #240, now the "one-line" format will fall back to normal style,
and also fix #242, just let black to merge lines and handle comments.
and also close #255

Summary by CodeRabbit

  • New Features

    • Safer handling of multi-piece strings and f-strings for more robust formatting.
    • More compact and stable inline parameter formatting.
  • Bug Fixes

    • Improved consistency of indentation and alignment for complex strings and parameters.
    • Updated parsing error messages for clearer diagnostics.
  • Chores

    • Updated supported Python targets to 3.11–3.13; CI matrix adjusted.
    • Switched config parsing to Python's stdlib TOML reader.
  • Tests

    • CLI tests now assert stdout; added regression tests for complex parameter cases.

@Hocnonsense Hocnonsense requested a review from mbhall88 as a code owner August 27, 2025 05:57
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Aug 27, 2025

📝 Walkthrough

Walkthrough

Updates CI Python matrix to 3.11–3.13; switch TOML parsing to stdlib tomllib; add split_code_string and rework string-alignment to mask/split/restore strings; update formatter typing and Black exception import; and adapt tests to use CliRunner stdout and new Black imports.

Changes

Cohort / File(s) Summary
CI matrix update
\.github/workflows/ci.yaml
Update tested Python versions: remove 3.9/3.10, test 3.11,3.12,3.13.
Versioning & targets
snakefmt/__init__.py
Use unconditional importlib.metadata; import TargetVersion from black.mode; update DEFAULT_TARGET_VERSIONS to PY311–PY313; add fstring_tokeniser_in_use flag.
Config parsing via stdlib
snakefmt/config.py
Replace third-party toml with stdlib tomllib; use tomllib.load(open(..., "rb")); switch exception handling to tomllib.TOMLDecodeError.
Formatter refactor
snakefmt/formatter.py
Replace TokenIterator typing with Snakefile; import split_code_string; remove regex-centric full-string matcher; rework align_strings to split/mask/indent/restore string segments; catch black.parsing.InvalidInput; adjust inline-parameter formatting flow.
Parser utilities & spacing
snakefmt/parser/syntax.py
Add split_code_string and helper _extract_line_mid; change add_token_space to accept Optional[Token]; use fstring_tokeniser_in_use for f-string tokenization.
Tests: runner & config
tests/conftest.py, tests/test_config.py
Use default CliRunner() and assert on stdout; import black.const.DEFAULT_LINE_LENGTH; update TOML error message expectations.
Tests: formatter
tests/test_formatter.py
Replace StringIO/Snakefile-based harness with string-based setup_formatter inputs; rename vars to snakecode; add regression/idempotence tests for long args/parentheses and inline param comments.
Tests: CLI & diffs
tests/test_snakefmt.py
Use inline CLI args; assert on stdout; update expected diff quoting/format; import get_gitignore from black.files.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User/CLI
  participant F as Formatter
  participant S as split_code_string
  participant B as Black (parsing)

  U->>F: format(snakefile_text)
  F->>S: split_code_string(code)
  S-->>F: segments (pre, string, post)
  rect rgba(220,235,255,0.35)
    F->>F: mask string segments with placeholders
    F->>F: indent/format non-string segments (params, parens)
    F->>F: restore original string content
  end
  F->>B: run_black_format_str(python_blocks)
  alt Black success
    B-->>F: formatted python blocks
    F-->>U: combined formatted Snakefile
  else black.parsing.InvalidInput
    B-->>F: raise parsing error
    F-->>U: propagate error
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Assessment against linked issues

Objective Addressed Explanation
Fix formatting break with nested control flow and very long strings (#190) No explicit test or example matching the exact nested-if/for + long-string case is present to confirm the regression is resolved.
Fix misformatting of lambda functions in rule settings (#208) Inline parameter rendering and param-result handling were changed to avoid improper breaking of lambda parameter expressions.
Fix long parenthesis within rule settings that breaks code (#240) Parenthesis/parameter rendering logic was adjusted and related tests added, but the exact failing example from the issue was not included to fully verify the fix.
Handle comments in multi-line parenthesis without breaking Black (#242) Token spacing and TOML/spacing handling were modified, but no specific test for comments inside multi-line parentheses was added to confirm behavior.

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
CI Python matrix update (.github/workflows/ci.yaml) CI maintenance change unrelated to the formatting/parsing bug fixes requested in the linked issues.
DEFAULT_TARGET_VERSIONS / fstring flag change (snakefmt/__init__.py) Target-version and feature-flag updates are compatibility/maintenance changes not required by the linked bug reports.

Possibly related PRs

Suggested reviewers

  • mbhall88

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-190

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
snakefmt/config.py (1)

130-139: Bug: kebab-case skip- options aren’t normalized before slicing.*

Keys like skip-string-normalization become _string_normalization and get ignored. Normalize hyphens first and invert as bool.

Apply this diff:

-    for key, val in config.items():
-        # this is due to FileMode param being string_normalise, but CLI being
-        # skip_string_normalise - https://github.com/snakemake/snakefmt/issues/73
-        if key.startswith("skip"):
-            key = key[5:]
-            val = not val
-
-        key = key.replace("-", "_")
+    for key, val in config.items():
+        # Normalize kebab->snake first
+        key_norm = key.replace("-", "_")
+        # Handle skip_* options (invert semantics)
+        if key_norm.startswith("skip_"):
+            key = key_norm[5:]
+            val = not bool(val)
+        else:
+            key = key_norm
🧹 Nitpick comments (6)
snakefmt/config.py (1)

128-142: Optional: avoid mutating Black’s Mode in-place.

Depending on Black version, Mode may be frozen; safer to accumulate updates and use dataclasses.replace.

-    for key, val in config.items():
+    updates = {}
+    for key, val in config.items():
         ...
-        setattr(black_mode, key, val)
+        updates[key] = val
+    if updates:
+        from dataclasses import replace
+        black_mode = replace(black_mode, **updates)
tests/test_snakefmt.py (1)

365-371: Minor: prefer comprehension over map+lambda for clarity.

Not required, but reads cleaner.

-            smks = get_snakefiles_in_dir(
+            smks = get_snakefiles_in_dir(
                 path=abs_tmpdir,
                 include=include,
                 exclude=exclude,
                 gitignore=gitignore,
             )
-            snakefiles = list(map(lambda p: str(p.relative_to(abs_tmpdir)), smks))
+            snakefiles = [str(p.relative_to(abs_tmpdir)) for p in smks]
tests/test_formatter.py (2)

29-31: Nit: variable name “stream” is misleading (it’s a string, not a stream)

Rename to snakecode for consistency with the rest of the file.

-        stream = "rule a:\n" f'{TAB * 1}input: "foo.txt"'
-        formatter = setup_formatter(stream)
+        snakecode = "rule a:\n" f'{TAB * 1}input: "foo.txt"'
+        formatter = setup_formatter(snakecode)

292-321: Strong regression test for issue #190: very long strings inside nested Python in run:

  • Idempotence assertion is the right oracle here.
  • Consider parametrizing the long lengths (68, 117) to document why these numbers trigger the prior failure (token/line-length threshold).
-            '                    "' + ("a" * 68) + '",\n'
+            # length chosen to exercise prior breaking threshold (see #190)
+            '                    "' + ("a" * 68) + '",\n'
@@
-                        "' + ("b" * 117) + '",\n'
+                        # length chosen to exercise prior breaking threshold (see #190)
+                        "' + ("b" * 117) + '",\n'
snakefmt/parser/syntax.py (2)

62-99: split_code_string: robust identification of multi-line string/f-string spans

  • Correctly tracks nested FSTRING_START/END.
  • Returns alternating [pre, string, post, ...] segments as designed.

Two small improvements:

  • Combine nested ifs for STRING to satisfy SIM102.
  • Add a short doc note that returned list alternates code and multi-line string blocks, starting with code.
-        if token.type == tokenize.STRING:
-            if token.start[0] != token.end[0]:
-                string_areas.append((token.start, token.end))
+        if token.type == tokenize.STRING and token.start[0] != token.end[0]:
+            string_areas.append((token.start, token.end))

101-110: _extract_line_mid: cross-line slice logic is subtle—add bounds assertions

To prevent accidental off-by-one or negative indexing on malformed coordinates, assert start <= end and indices within bounds. Low overhead, safer.

 def _extract_line_mid(
     lines: list[str], start: tuple[int, int], end: tuple[int, int]
 ) -> str:
+    # Defensive checks against malformed token coords
+    assert 1 <= start[0] <= end[0] <= len(lines), (start, end, len(lines))
+    assert 0 <= start[1] <= len(lines[start[0] - 1])
+    assert 0 <= end[1] <= len(lines[end[0] - 1])
     s = "".join(lines[i] for i in range(start[0] - 1, end[0]))
     t = s[start[1] :]
     end_trim = end[1] - len(lines[end[0] - 1])
     if end_trim != 0:
         t = t[:end_trim]
     return t
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 059e4ae and 9b87bad.

📒 Files selected for processing (9)
  • .github/workflows/ci.yaml (1 hunks)
  • snakefmt/__init__.py (2 hunks)
  • snakefmt/config.py (3 hunks)
  • snakefmt/formatter.py (5 hunks)
  • snakefmt/parser/syntax.py (2 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_config.py (5 hunks)
  • tests/test_formatter.py (6 hunks)
  • tests/test_snakefmt.py (11 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/conftest.py
  • snakefmt/config.py
  • tests/test_snakefmt.py
  • tests/test_formatter.py
  • snakefmt/parser/syntax.py
  • tests/test_config.py
  • snakefmt/__init__.py
  • snakefmt/formatter.py
🧬 Code graph analysis (4)
tests/test_snakefmt.py (2)
tests/conftest.py (1)
  • cli_runner (6-7)
snakefmt/snakefmt.py (2)
  • main (165-308)
  • get_snakefiles_in_dir (41-76)
tests/test_formatter.py (2)
tests/__init__.py (1)
  • setup_formatter (9-13)
snakefmt/formatter.py (1)
  • get_formatted (74-75)
snakefmt/parser/syntax.py (1)
snakefmt/types.py (1)
  • Token (7-11)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (1)
  • Snakefile (29-52)
snakefmt/parser/syntax.py (1)
  • split_code_string (62-98)
🪛 Ruff (0.12.2)
snakefmt/parser/syntax.py

74-75: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build (3.13, ubuntu-latest)
🔇 Additional comments (27)
.github/workflows/ci.yaml (1)

17-17: Verify packaging metadata and CI dependencies
Matrix updated to Python 3.11–3.13—great. Please manually confirm the following, as I could not locate any pyproject.toml, setup.py, or setup.cfg in this branch:

  • Packaging metadata enforces Python >= 3.11:
    python_requires = ">=3.11" (or equivalent)
    • Trove classifiers include Programming Language :: Python :: 3.11, 3.12, and 3.13
  • CI/test dependencies support 3.11+:
    • Black’s minimum version in pyproject.toml (or requirements-dev.txt)
    • Click’s version constraint aligns with Python 3.11+
  • Update or add the metadata file at the project root if missing.
tests/conftest.py (1)

7-7: Confirmed: Click’s CliRunner still separates stderr by default

Verified on Click 8.1.8 that

  • CliRunner.__init__ signature includes mix_stderr: bool = True
  • CliRunner.invoke continues to accept separate stdout/stderr streams

No changes to the fixture in tests/conftest.py are required.

snakefmt/config.py (3)

82-88: Binary read + tomllib.load — LGTM.

Correct file mode and error surfacing via click.FileError.


121-126: Consistent tomllib usage in Black config — LGTM.

Matches the stdlib parser and wraps parse errors in MalformattedToml.


5-5: Ensure project metadata is updated for stdlib tomllib and Python ≥3.11

I wasn’t able to locate a pyproject.toml, setup.cfg, or setup.py in the repo—please double-check that your project metadata file is present and updated:

  • Remove the third-party toml entry from the [tool.poetry].dependencies (or install_requires in setup.cfg/setup.py).
  • Add or update the Python requirement to python = ">=3.11" (or python_requires = ">=3.11").
  • Verify that any CI configurations (e.g. GitHub Actions, tox.ini, etc.) and documentation are aligned with Python 3.11+ and don’t install the old toml package.
tests/test_config.py (5)

23-23: Defaults aligned with Black — LGTM.


53-71: Stdout assertions updated — LGTM.

Matches CliRunner behavior after fixture change.


87-90: Equality/inequality checks against stdin — LGTM.


259-259: Tomllib error text assertion — portable enough.

The substring match “Invalid statement” should be stable across 3.11–3.13.

Please confirm CI on 3.11/3.12/3.13 shows identical message; if it diverges, relax to a broader regex.


5-5: Ensure Black’s const Module Is Available

The new import in tests/test_config.py:

import black.const

will fail on Black versions prior to the introduction of the const module. Please:

• Verify which Black release first added black.const (it’s available from Black 24.4.0 onward).
• Confirm your project’s declared Black dependency meets this minimum. For example, in setup.py or pyproject.toml:

- install_requires = [
-     "black",
-
- ]
+ install_requires = [
+     "black>=24.4.0,<25.0.0",
+
+ ]

• If you need to support earlier Black versions, either:
– Add a fallback in tests (e.g., wrap the import in try/except and mock black.const), or
– Update documentation to require Black 24.4.0+.

tests/test_snakefmt.py (10)

18-18: Direct invoke with no args — LGTM.


53-53: Assert on stdout for stdin formatting — LGTM.


188-188: Diff stdout assertion — LGTM.


198-198: Empty stdout when error with --check --diff — LGTM.


219-219: Unified diff output assertion — LGTM.


241-241: Compact diff stdout assertion — LGTM.


262-262: Prefers compact diff when both flags set — LGTM.


287-287: No diff on syntax error — LGTM.


382-382: Counter() comparison simplified — LGTM.


9-9: Ensure Black is pinned and get_gitignore is available

I didn’t find any dependency file (pyproject.toml, setup.cfg, requirements*.txt, or Pipfile) that pins Black in this repo. Please verify that:

  • Black is declared as a development dependency in your project’s configuration.
  • You’ve pinned to a Black release that includes the helper black.files.get_gitignore.
  • Your CI/test environment installs that same Black version so the import
    from black.files import get_gitignore
    will succeed.
tests/test_formatter.py (3)

15-15: Good refactor: centralized formatter setup via tests.setup_formatter

This simplifies tests and reduces boilerplate. Nice.


209-229: Test expectation correctly captures expand(...) spacing normalization

The expected value matches the formatter’s normalization (no spaces around keyword args and trailing commas). Looks good.


262-275: Nice regression guard for lambda with “input” identifier and nested brackets

Covers the tricky tokenizer/parameter split edge case; assertion against exact identity is appropriate.

snakefmt/__init__.py (2)

15-16: Version-gated f-string tokeniser flag is clear and correct

Clean single source of truth used by parser/syntax; no action needed.


19-23: Confirm Black supports the new PY313 target

I couldn’t locate where Black is pinned in this repo (no pyproject.toml, setup.cfg, requirements.txt or poetry.lock entry) and Black isn’t installed in the checked-out environment, so we can’t automatically verify that TargetVersion.PY313 is available. Please double-check in your dependency configuration (e.g. pyproject.toml, Poetry settings, CI image, etc.) that you’re using a Black release which defines black.mode.TargetVersion.PY313 before merging.

snakefmt/parser/syntax.py (1)

160-172: add_token_space: None-safe prev_token handling is correct

Signature and guard now match usage sites (first token in a param). Good.

snakefmt/formatter.py (1)

194-217: Exception path updated to black.parsing.InvalidInput

Good catch with Black’s API; error mapping to InvalidPython remains intact.

@Hocnonsense Hocnonsense changed the title fix: edge cases when intending python code with strings fix: edge cases when snakefmt break snakefile Aug 27, 2025
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
snakefmt/formatter.py (1)

7-7: Black submodule import is fine; ensure dependency is declared (already raised).

Using import black.parsing makes black.format_str accessible via the black package binding. Just ensure Black is a runtime dependency in packaging. This was flagged previously; no further action here.

🧹 Nitpick comments (3)
snakefmt/formatter.py (3)

193-216: Harden Black exception compatibility across versions.

Catching black.parsing.InvalidInput may break on environments where the class is exposed as black.InvalidInput. Use a version-agnostic alias and catch that.

Apply this diff within this hunk:

-        except black.parsing.InvalidInput as e:
+        except BlackInvalidInput as e:

Add this alias near the top-level imports (outside this hunk):

# version-agnostic InvalidInput alias
try:
    from black.parsing import InvalidInput as BlackInvalidInput  # old/existing
except Exception:  # pragma: no cover
    from black import InvalidInput as BlackInvalidInput  # fallback

229-243: String-alignment mask/restore approach addresses the edge case; add a sanity check.

The .strip() on the indented delimiter is necessary given mid-line delimiters in fakewrap (per prior discussion/learning). Add a quick assert to fail fast if the split count ever desynchronizes.

Apply this diff:

-        split_code = fakewrap.split(textwrap.indent('"""\n"""', used_indent).strip())
+        split_code = fakewrap.split(textwrap.indent('"""\n"""', used_indent).strip())
+        assert len(split_code) == (len(split_string) + 1) // 2, (
+            "align_strings: masking split mismatch",
+            len(split_code),
+            len(split_string),
+        )

Please confirm we have a regression test that reproduces issue #190 (nested prints, very long strings/f-strings inside a run: block) and passes with this implementation.


327-337: Avoid double-iterating parameters.all_params; cache once to ensure stability.

If all_params is a generator-like iterable, calling iter(...) and then looping again can consume elements. Cache to a tuple, use first element for inline check, and reuse for the loop.

Apply this diff:

-        inline_fmting = p_class is InlineSingleParam
+        inline_fmting = p_class is InlineSingleParam
+        params = tuple(parameters.all_params)
...
-            param = next(iter(parameters.all_params))
+            param = params[0]
...
-            result = f"{prepended_comments}{result} {param_result}"
+            result = f"{prepended_comments}{result} {param_result}"
         else:
-            result = f"{result}{parameters.comment}\n"
-            for param in parameters.all_params:
+            result = f"{result}{parameters.comment}\n"
+            for param in params:
                 result += self.format_param(
                     param, target_indent, inline_fmting, param_list
                 )

Confirm parameters.all_params is re-iterable today; if not, the above change is required to prevent missing params when inlining falls back to multiline.

Also applies to: 345-351

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9b87bad and c9d7ebe.

📒 Files selected for processing (2)
  • snakefmt/formatter.py (6 hunks)
  • tests/test_formatter.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_formatter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.

Applied to files:

  • snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (2)
  • Snakefile (29-52)
  • comment_start (55-56)
snakefmt/parser/syntax.py (1)
  • split_code_string (62-98)
🔇 Additional comments (2)
snakefmt/formatter.py (2)

12-23: Parser/TAB/split_code_string imports look correct and align with new flow.

The move to public Snakefile, Parser, comment_start, plus importing TAB and split_code_string matches the refactor.


59-59: Constructor type update to Snakefile is appropriate.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c9d7ebe and ee4190e.

📒 Files selected for processing (2)
  • snakefmt/formatter.py (7 hunks)
  • tests/test_formatter.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_formatter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.

Applied to files:

  • snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (4)
snakefmt/logging.py (2)
  • Warnings (34-46)
  • comment_relocation (41-46)
snakefmt/parser/parser.py (2)
  • Snakefile (29-52)
  • comment_start (55-56)
snakefmt/parser/syntax.py (4)
  • split_code_string (62-98)
  • InlineSingleParam (572-574)
  • all_params (441-442)
  • line_nb (303-304)
snakefmt/types.py (1)
  • line_nb (14-15)
🔇 Additional comments (8)
snakefmt/formatter.py (8)

12-12: LGTM! Clean import organization

The import of Snakefile from the parser module properly reflects the architectural change where Formatter now accepts a Snakefile object instead of TokenIterator.


21-21: Good addition of the split_code_string utility

The import of split_code_string aligns with the refactored string alignment approach that aims to fix the edge cases mentioned in the PR objectives.


59-59: Type annotation improvement

The parameter type change from TokenIterator to Snakefile is a good architectural improvement that provides better type safety and clearer API boundaries.


193-193: Error handling correctly updated

The exception handling has been properly updated to catch black.parsing.InvalidInput instead of black.InvalidInput, maintaining consistency with the import change at line 7.


229-243: Excellent refactor of string alignment logic

This is a significant improvement that addresses the core issue. The new approach using split_code_string is more robust:

  1. Handles single-piece strings efficiently with direct indentation
  2. Uses a masking approach for multi-piece strings which should prevent the formatting issues with nested expressions and long strings mentioned in the PR objectives
  3. The .strip() call on line 237 is correctly used to match the delimiter pattern in the masked content

This change directly addresses the issues from #190, #208, and #240 by providing a safer fallback mechanism.


267-267: Good: wrapping in function call for safe parsing

Wrapping the parameter value in f({val}\n) provides a safe context for Black to parse complex expressions, which should help prevent the issues with lambda and parenthesized expressions mentioned in #208 and #240.


325-343: Critical improvement for inline formatting logic

The refactored inline formatting logic is a key fix:

  1. It now pre-formats the parameter to check if it fits on one line (line 331-334)
  2. Falls back to multi-line formatting when the parameter is too complex
  3. This directly addresses the lambda and long expression issues from #208 and #240 by providing a proper fallback mechanism

This "try inline, fallback to normal" approach is exactly what was needed to fix the edge cases.


345-349: Verify parameter formatting consistency after refactor

I didn’t find any standalone fixtures or snapshot files in .txt, .smake/.smk, or Snakefile formats that document the old parameter formatting, but I did spot two Python test cases that hard-code example parameters with lambdas:

  • A time_min=lambda wildcards, attempt: (…) case
  • A mem_mb=lambda wildcards, attempt, mem=1000: attempt * mem case

Please run the full test suite (e.g. pytest) and confirm that the formatter’s output for these scenarios remains byte-for-byte identical to what the tests expect. If you see any diffs in those lambda-parameter cases, update the tests’ expected strings or tweak the formatter to preserve the original line breaks and commas exactly.

Copy link
Copy Markdown
Member

@mbhall88 mbhall88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @Hocnonsense. Please see two comments on the tests you added. As you say this PR will fix #208 too, could you please also add a test using the example from that issue please?

@Hocnonsense
Copy link
Copy Markdown
Contributor Author

Hocnonsense commented Aug 28, 2025

Sure, now expected in the new test cases are either formatted or have an explicit expected.
#208 shared a testcase with #208/#240

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/test_formatter.py (2)

29-31: Rename variable for clarity: it's code, not a stream.
Minor readability nit—use snakecode consistently across tests.

-        stream = "rule a:\n" f'{TAB * 1}input: "foo.txt"'
-        formatter = setup_formatter(stream)
+        snakecode = "rule a:\n" f'{TAB * 1}input: "foo.txt"'
+        formatter = setup_formatter(snakecode)

292-343: Comprehensive regression for long/nested calls in run blocks (issue #190).
Expected normalization (indent, trailing commas, quote style) looks right.

Consider asserting idempotence on the normalized output as well:

         actual = setup_formatter(snakecode).get_formatted()
         assert actual == expected
+        assert setup_formatter(expected).get_formatted() == expected
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ee4190e and e872995.

📒 Files selected for processing (1)
  • tests/test_formatter.py (8 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/test_formatter.py
🧬 Code graph analysis (1)
tests/test_formatter.py (2)
tests/__init__.py (1)
  • setup_formatter (9-13)
snakefmt/formatter.py (1)
  • get_formatted (74-75)
🔇 Additional comments (6)
tests/test_formatter.py (6)

15-15: Importing setup_formatter from tests is correct and simplifies the harness.
Keeps tests decoupled from internal constructors.


209-216: expand(...) regression looks good.
Covers spacing normalization and trailing comma; matches objectives.


262-275: Lambda with input keyword and nested parentheses — idempotence verified.
Solid coverage for the lambda parsing edge case.


344-362: Inline param formatting with comments/parentheses — good coverage (issues #208, #240, #242).
Idempotence assertion is appropriate here.


1221-1223: Whitespace before comment normalized — expectation is correct.
Confirms comment reflow between repeated keywords.


541-541: All mock_m.call_count assertions updated to 3 — no remaining tests expect 2.

  • Verified across all test files with:
    rg -nP '\bassert\s+mock_m\.call_count\s*==\s*2\b' -g 'tests/**.py' -C2
    (no matches found)
  • No further action needed.

@mbhall88
Copy link
Copy Markdown
Member

mbhall88 commented Aug 29, 2025

Sure, now expected in the new test cases are either formatted or have an explicit expected. #208 shared a testcase with #208/#240

No, #208 has a complex lambda statement and I don't see a lambda statement in that test. Would you mind adding a separate test for the example in that issue?

@Hocnonsense
Copy link
Copy Markdown
Contributor Author

Hocnonsense commented Aug 29, 2025

Oh, I have miss something, sorry and I'll have a look.


Sorry for my misunderstanding. @mbhall88 Now the test is added.

@codecov
Copy link
Copy Markdown

codecov bot commented Aug 29, 2025

Codecov Report

❌ Patch coverage is 90.78947% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.43%. Comparing base (4a92aea) to head (43d2f1d).
⚠️ Report is 29 commits behind head on master.

Files with missing lines Patch % Lines
snakefmt/formatter.py 86.20% 3 Missing and 1 partial ⚠️
snakefmt/parser/syntax.py 92.10% 1 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #259      +/-   ##
==========================================
- Coverage   97.53%   96.43%   -1.10%     
==========================================
  Files          12       12              
  Lines        1095     1122      +27     
  Branches      251      217      -34     
==========================================
+ Hits         1068     1082      +14     
- Misses         14       22       +8     
- Partials       13       18       +5     
Flag Coverage Δ
unittests 96.25% <90.78%> (-1.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
snakefmt/__init__.py 100.00% <100.00%> (+20.00%) ⬆️
snakefmt/config.py 94.44% <100.00%> (+0.15%) ⬆️
snakefmt/parser/syntax.py 96.84% <92.10%> (-2.15%) ⬇️
snakefmt/formatter.py 95.91% <86.20%> (-2.96%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/test_formatter.py (1)

208-229: Assertion pattern matches suite convention.

Using explicit expected here follows the earlier guidance to assert either identity or a concrete expected string.

🧹 Nitpick comments (2)
tests/test_formatter.py (2)

29-31: Name the variable snakecode for consistency.

Most tests now use snakecode; align this one for readability.

-        stream = "rule a:\n" f'{TAB * 1}input: "foo.txt"'
-        formatter = setup_formatter(stream)
+        snakecode = "rule a:\n" f'{TAB * 1}input: "foo.txt"'
+        formatter = setup_formatter(snakecode)

571-571: Avoid brittle assertion on exact Black invocation count.

Internal segmentation may change; assert semantics instead of an exact count.

-            assert mock_m.call_count == 3
+            assert mock_m.call_count >= 2
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3c68e7b and 0c6552e.

📒 Files selected for processing (1)
  • tests/test_formatter.py (8 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/test_formatter.py
🧬 Code graph analysis (1)
tests/test_formatter.py (2)
tests/__init__.py (1)
  • setup_formatter (9-13)
snakefmt/formatter.py (1)
  • get_formatted (74-75)
🔇 Additional comments (5)
tests/test_formatter.py (5)

15-15: Good move to centralize the test harness with setup_formatter.

Reduces boilerplate and keeps formatter config consistent across tests.


262-275: Adds regression for lambda with input keyword and nested parentheses (issue #208).

Covers the parser ambiguity and ensures no unintended reflow.


344-371: Inline lambda long-line fallback to multiline looks correct (issue #208).

Validates the formatter’s “one-line → multiline” safety net.


372-392: Good coverage for comments inside parenthesized params and nested max(...) (issues #240 and #242).

Exercising the Black-handled merge behavior without breaking syntax.


1251-1254: Comment normalization between repeated keywords looks right.

Trims stray leading space and preserves intended spacing rules.

@mbhall88
Copy link
Copy Markdown
Member

Thanks @Hocnonsense this looks awesome.
could you do me one last favour and remove the toml library from the pyproject.toml now that we’re using the standard library tomllib?

@Hocnonsense
Copy link
Copy Markdown
Contributor Author

Hocnonsense commented Aug 29, 2025

Sure! I've removed it.

@Hocnonsense
Copy link
Copy Markdown
Contributor Author

I think it is fine to be merged now!

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 0c6552e and 844840e.

⛔ Files ignored due to path filters (1)
  • pyproject.toml is excluded by !pyproject.toml
📒 Files selected for processing (1)
  • snakefmt/formatter.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.

Applied to files:

  • snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (2)
  • Snakefile (29-52)
  • comment_start (55-56)
snakefmt/parser/syntax.py (1)
  • split_code_string (62-98)
🔇 Additional comments (4)
snakefmt/formatter.py (4)

12-23: Imports refactor looks good

Bringing Snakefile into annotations and importing split_code_string/TAB here is appropriate and aligns this module with the new helpers.


59-60: Type annotation update LGTM

Annotating snakefile with Snakefile improves clarity and matches the parser API.


267-273: Trailing whitespace trimming: confirm non-ParamList path

.rstrip() now runs only when inline_formatting or param_list is true. For single-parameter, non-list cases with inline_formatting=False, trailing spaces could slip through. Black usually trims them, but please verify.

You can add tests that format a non-ParamList parameter (e.g., shell: with a long literal/comment) and assert no trailing spaces on any line of the output. If it fails, consider reverting to unconditional val = val.rstrip() before Black formatting.


347-353: Inline vs multiline flow looks correct

  • Relocating pre-comments only when inlining and warning via Warnings.comment_relocation is consistent with the PR goals.
  • The multiline fallback loops over all params via format_param, satisfying the “safe multiline” requirement.

from typing import Optional

import black
import black.parsing
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make Black import/exception handling version-agnostic and explicit

Relying on import black.parsing to populate the black name is fragile, and InvalidInput’s location has varied across Black versions. Import black explicitly and alias InvalidInput with a compatibility fallback; then catch the alias.

Apply this diff:

@@
-import black.parsing
+import black
+try:  # Black >= 24.x
+    from black.parsing import InvalidInput as BlackInvalidInput
+except Exception:  # Black < 24.x
+    from black import InvalidInput as BlackInvalidInput
@@
-        except black.parsing.InvalidInput as e:
+        except BlackInvalidInput as e:

Also applies to: 193-193

🤖 Prompt for AI Agents
In snakefmt/formatter.py at lines 7 and 193, the code imports black.parsing and
catches InvalidInput directly, which is brittle across Black versions; instead
add an explicit import for black and create a compatibility alias for
InvalidInput (try to import InvalidInput from black, fall back to
black.parsing.InvalidInput), then update the exception handling at line 193 to
catch that alias; ensure both the top-of-file import and the except clause
reference the new alias so the import/exception handling is version-agnostic.

Comment on lines +229 to +246
split_string = split_code_string(string)
if len(split_string) == 1:
return textwrap.indent(split_string[0], used_indent)
# First, masks all multi-line strings
mask_string = "`~!@#$%^&*|?"
while mask_string in string:
mask_string += mask_string
mask_string = f'"""{mask_string}"""'
fakewrap = textwrap.indent(
"".join(mask_string if i % 2 else s for i, s in enumerate(split_string)),
used_indent,
)
split_code = fakewrap.split(mask_string)
# After indenting, we puts those strings back
indented = "".join(
s.replace("\t", TAB) if i % 2 else split_code[i // 2]
for i, s in enumerate(split_string)
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Do not mutate string literal contents; strengthen masking

align_strings currently replaces tabs inside multi-line string tokens (s.replace("\t", TAB)), altering user string contents (including f-strings). That’s a semantic change and can break legitimate data. Also, make the mask uniqueness check against the final triple-quoted sentinel and add a sanity check on the split.

Apply this diff:

@@
-        # First, masks all multi-line strings
-        mask_string = "`~!@#$%^&*|?"
-        while mask_string in string:
-            mask_string += mask_string
-        mask_string = f'"""{mask_string}"""'
+        # First, mask all multi-line strings with a triple-quoted sentinel that cannot collide
+        mask_content = "`~!@#$%^&*|?"
+        while f'"""{mask_content}"""' in string:
+            mask_content += mask_content
+        mask_string = f'"""{mask_content}"""'
@@
-        split_code = fakewrap.split(mask_string)
+        split_code = fakewrap.split(mask_string)
+        assert len(split_code) == (len(split_string) + 1) // 2, (
+            "align_strings: masking split mismatch",
+            len(split_code),
+            len(split_string),
+        )
@@
-        indented = "".join(
-            s.replace("\t", TAB) if i % 2 else split_code[i // 2]
-            for i, s in enumerate(split_string)
-        )
+        indented = "".join(
+            (s if i % 2 else split_code[i // 2]) for i, s in enumerate(split_string)
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
split_string = split_code_string(string)
if len(split_string) == 1:
return textwrap.indent(split_string[0], used_indent)
# First, masks all multi-line strings
mask_string = "`~!@#$%^&*|?"
while mask_string in string:
mask_string += mask_string
mask_string = f'"""{mask_string}"""'
fakewrap = textwrap.indent(
"".join(mask_string if i % 2 else s for i, s in enumerate(split_string)),
used_indent,
)
split_code = fakewrap.split(mask_string)
# After indenting, we puts those strings back
indented = "".join(
s.replace("\t", TAB) if i % 2 else split_code[i // 2]
for i, s in enumerate(split_string)
)
split_string = split_code_string(string)
if len(split_string) == 1:
return textwrap.indent(split_string[0], used_indent)
# First, mask all multi-line strings with a triple-quoted sentinel that cannot collide
mask_content = "`~!@#$%^&*|?"
while f'"""{mask_content}"""' in string:
mask_content += mask_content
mask_string = f'"""{mask_content}"""'
fakewrap = textwrap.indent(
"".join(mask_string if i % 2 else s for i, s in enumerate(split_string)),
used_indent,
)
split_code = fakewrap.split(mask_string)
assert len(split_code) == (len(split_string) + 1) // 2, (
"align_strings: masking split mismatch",
len(split_code),
len(split_string),
)
indented = "".join(
(s if i % 2 else split_code[i // 2]) for i, s in enumerate(split_string)
)

suggestion from @coderabbitai[bot]

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
snakefmt/formatter.py (4)

7-7: Make Black import/exception handling version-agnostic

Catching black.parsing.InvalidInput is brittle across Black versions. Import Black explicitly and alias the exception with a fallback.

Apply this diff:

-import black.parsing
+import black
+try:  # Black >= 24.x
+    from black.parsing import InvalidInput as BlackInvalidInput
+except Exception:  # Older Black
+    from black import InvalidInput as BlackInvalidInput
@@
-        except black.parsing.InvalidInput as e:
+        except BlackInvalidInput as e:

Also applies to: 193-193


268-271: Trailing whitespace trimming now conditional—add coverage

Since rstrip() no longer runs unconditionally, add tests ensuring no trailing spaces leak when inline_formatting=False and param_list=False, and when wrapping via the temporary f(…) call.


333-344: Inline fallback and empty-params guard: good fix

The StopIteration guard and measuring single-line via count("\n") are pragmatic and solve the lambda/long-value edge cases.


244-246: Do not mutate multi-line string literal contents

Replacing tabs inside user string tokens changes data (including f-strings, embedded scripts). Preserve string segments verbatim.

-        indented = "".join(
-            s.replace("\t", TAB) if i % 2 else split_code[i // 2]
-            for i, s in enumerate(split_string)
-        )
+        indented = "".join(
+            (s if i % 2 else split_code[i // 2]) for i, s in enumerate(split_string)
+        )
🧹 Nitpick comments (1)
snakefmt/formatter.py (1)

229-243: String masking/indent flow is solid; add safety checks and tighter uniqueness

Great switch to split_code_string + masking. Add a sanity check to detect split mismatches and check uniqueness against the final triple-quoted sentinel to avoid unnecessary growth.

-        mask_string = "`~!@#$%^&*|?"
-        while mask_string in string:
-            mask_string += mask_string
-        mask_string = f'"""{mask_string}"""'
+        mask_string = "`~!@#$%^&*|?"
+        while f'"""{mask_string}"""' in string:
+            mask_string += mask_string
+        mask_string = f'"""{mask_string}"""'
@@
-        split_code = fakewrap.split(mask_string)
+        split_code = fakewrap.split(mask_string)
+        assert len(split_code) == (len(split_string) + 1) // 2, (
+            "align_strings: masking split mismatch",
+            len(split_code),
+            len(split_string),
+        )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 844840e and 43d2f1d.

📒 Files selected for processing (1)
  • snakefmt/formatter.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakefmt/formatter.py
🧠 Learnings (1)
📚 Learning: 2025-08-27T06:27:25.344Z
Learnt from: Hocnonsense
PR: snakemake/snakefmt#259
File: snakefmt/formatter.py:229-243
Timestamp: 2025-08-27T06:27:25.344Z
Learning: In snakefmt's align_strings method, the .strip() call on textwrap.indent('"""\n"""', used_indent) is necessary when splitting fakewrap because the delimiter appears mid-line (after code segments) in the masked content, not at line boundaries. Without .strip(), the delimiter won't match the actual pattern in fakewrap and the split will fail.

Applied to files:

  • snakefmt/formatter.py
🧬 Code graph analysis (1)
snakefmt/formatter.py (2)
snakefmt/parser/parser.py (2)
  • Snakefile (29-52)
  • comment_start (55-56)
snakefmt/parser/syntax.py (1)
  • split_code_string (62-98)
🔇 Additional comments (5)
snakefmt/formatter.py (5)

12-12: Imports of Parser/Snakefile/split_code_string/TAB look correct

These align with the new formatter flow and public API exposure.

Also applies to: 21-21, 23-23


59-59: Constructor type annotation: LGTM

Using Snakefile here improves clarity without affecting runtime.


329-329: InlineSingleParam detection is clear

Direct class check is appropriate here.


352-352: Comment relocation when inlining: LGTM

Warns appropriately and preserves readability.


354-358: Multiline path over all params: LGTM

Iteration + delegating to format_param keeps behavior consistent.

Code for searching for and parsing snakefmt configuration files
"""

import tomllib
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also require an update in the dependencies in snakefmt's pyproject.toml

@mbhall88 mbhall88 merged commit 2f8e693 into master Aug 31, 2025
9 of 11 checks passed
@mbhall88 mbhall88 deleted the fix-190 branch August 31, 2025 23:15
@Hocnonsense Hocnonsense self-assigned this Mar 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment