Add missing rule code comments by MeGaGiGaGon · Pull Request #18906 · astral-sh/ruff

MeGaGiGaGon · 2025-06-24T00:48:05Z

Summary

While making some of my other changes, I noticed some of the lints were missing comments with their lint code/had the wrong numbered lint code. These comments are super useful since they allow for very easily and quickly finding the source code of a lint, so I decided to try and normalize them.

Most of them were fairly straightforward, just adding a doc comment/comment in the appropriate place.

I decided to make all of the Pylint rules have the PL prefix. Previously it was split between no prefix and having prefix, but I decided to normalize to with prefix since that's what's in the docs, and the with prefix will show up on no prefix searches, while the reverse is not true.

I also ran into a lot of rules with implementations in "non-standard" places (where "standard" means inside a file matching the glob crates/ruff_linter/rules/*/rules/**/*.rs and/or the same rule file where the rule struct/ViolationMetadata is defined).

I decided to move all the implementations out of crates/ruff_linter/src/checkers/ast/analyze/deferred_scopes.rs and into their own files, since that is what the rest of the rules in deferred_scopes.rs did, and those were just the outliers.

There were several rules which I did not end up moving, which you can see as the extra paths I had to add to my python code besides the "standard" glob. These rules are generally the error-type rules that just wrap an error from the parser, and have very small implementations/are very tightly linked to the module they are in, and generally every rule of that type was implemented in module instead of in the "standard" place.

Resolving that requires answering a question I don't think I'm equipped to handle: Is the point of these comments to give quick access to the rule definition/docs, or the rule implementation? For all the rules with implementations in the "standard" location this isn't a problem, as they are the same, but it is an issue for all of these error type rules. In the end I chose to leave the implementations where they were, but I'm not sure if that was the right choice.

Python script I wrote to find missing comments

This script assumes it is placed in the top level ruff directory (ie next to .git/crates/README.md)

import re
from copy import copy
from pathlib import Path

linter_to_code_prefix = {
    "Airflow": "AIR",
    "Eradicate": "ERA",
    "FastApi": "FAST",
    "Flake82020": "YTT",
    "Flake8Annotations": "ANN",
    "Flake8Async": "ASYNC",
    "Flake8Bandit": "S",
    "Flake8BlindExcept": "BLE",
    "Flake8BooleanTrap": "FBT",
    "Flake8Bugbear": "B",
    "Flake8Builtins": "A",
    "Flake8Commas": "COM",
    "Flake8Comprehensions": "C4",
    "Flake8Copyright": "CPY",
    "Flake8Datetimez": "DTZ",
    "Flake8Debugger": "T10",
    "Flake8Django": "DJ",
    "Flake8ErrMsg": "EM",
    "Flake8Executable": "EXE",
    "Flake8Fixme": "FIX",
    "Flake8FutureAnnotations": "FA",
    "Flake8GetText": "INT",
    "Flake8ImplicitStrConcat": "ISC",
    "Flake8ImportConventions": "ICN",
    "Flake8Logging": "LOG",
    "Flake8LoggingFormat": "G",
    "Flake8NoPep420": "INP",
    "Flake8Pie": "PIE",
    "Flake8Print": "T20",
    "Flake8Pyi": "PYI",
    "Flake8PytestStyle": "PT",
    "Flake8Quotes": "Q",
    "Flake8Raise": "RSE",
    "Flake8Return": "RET",
    "Flake8Self": "SLF",
    "Flake8Simplify": "SIM",
    "Flake8Slots": "SLOT",
    "Flake8TidyImports": "TID",
    "Flake8Todos": "TD",
    "Flake8TypeChecking": "TC",
    "Flake8UnusedArguments": "ARG",
    "Flake8UsePathlib": "PTH",
    "Flynt": "FLY",
    "Isort": "I",
    "McCabe": "C90",
    "Numpy": "NPY",
    "PandasVet": "PD",
    "PEP8Naming": "N",
    "Perflint": "PERF",
    "Pycodestyle": "",
    "Pydoclint": "DOC",
    "Pydocstyle": "D",
    "Pyflakes": "F",
    "PygrepHooks": "PGH",
    "Pylint": "PL",
    "Pyupgrade": "UP",
    "Refurb": "FURB",
    "Ruff": "RUF",
    "Tryceratops": "TRY",
}

ruff = Path(__file__).parent / "crates"

ruff_linter = ruff / "ruff_linter" / "src"

code_to_rule_name = {}

with open(ruff_linter / "codes.rs") as codes_file:
    for linter, code, rule_name in re.findall(
        # The (?<! skips ruff test rules
        # Only Preview|Stable rules are checked
        r"(?<!#\[cfg\(any\(feature = \"test-rules\", test\)\)\]\n)        \((\w+), \"(\w+)\"\) => \(RuleGroup::(?:Preview|Stable), [\w:]+::(\w+)\)",
        codes_file.read(),
    ):
        code_to_rule_name[linter_to_code_prefix[linter] + code] = (rule_name, [])

ruff_linter_rules = ruff_linter / "rules"
for rule_file_path in [
    *ruff_linter_rules.rglob("*/rules/**/*.rs"),
    ruff / "ruff_python_parser" / "src" / "semantic_errors.rs",
    ruff_linter / "pyproject_toml.rs",
    ruff_linter / "checkers" / "noqa.rs",
    ruff_linter / "checkers" / "ast" / "mod.rs",
    ruff_linter / "checkers" / "ast" / "analyze" / "unresolved_references.rs",
    ruff_linter / "checkers" / "ast" / "analyze" / "expression.rs",
    ruff_linter / "checkers" / "ast" / "analyze" / "statement.rs",
]:
    with open(rule_file_path, encoding="utf-8") as f:
        rule_file_content = f.read()
    for code, (rule, _) in copy(code_to_rule_name).items():
        if rule in rule_file_content:
            if f"// {code}" in rule_file_content or f", {code}" in rule_file_content:
                del code_to_rule_name[code]
            else:
                code_to_rule_name[code][1].append(rule_file_path)

for code, rule in code_to_rule_name.items():
    print(code, rule[0])
    for path in rule[1]:
        print(path)

Test Plan

N/A, no tests/functionality affected.

github-actions · 2025-06-24T00:56:48Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

ntBre

Nice! I always grep for the rule codes first, so this will definitely help me. I'm curious if @MichaReiser has any strong feelings about moving the rule implementations, but I think that makes sense to me too.

Resolving that requires answering a question I don't think I'm equipped to handle: Is the point of these comments to give quick access to the rule definition/docs, or the rule implementation?

If I had to pick one, I'd probably say implementation. The implementation is virtually guaranteed to include a reference to the Violation struct, and it's a bit easier to jump to definition on that struct than find all references from the struct definition, in my opinion. So I think you put them in the right place, for my tastes at least.

I also think it made sense to leave those very short rules in place. They're basically one-liners, especially if we switch to report_diagnostic_if_enabled.

crates/ruff_linter/src/checkers/ast/analyze/statement.rs

MichaReiser · 2025-06-24T06:04:43Z

I'd prefer if we can split the PR into

Add the missing rule comments
Anything changing the structure

That makes the discussion easier. I personally don't find "standard locations" to be that important to find an implementation because I use go to definition or find all references to find the implementation. It's also often easier to understand the implementation if everything is in one place.

MeGaGiGaGon · 2025-06-24T17:48:09Z

I'd prefer if we can split the PR into

Add the missing rule comments

Anything changing the structure

That makes the discussion easier. I personally don't find "standard locations" to be that important to find an implementation because I use go to definition or find all references to find the implementation. It's also often easier to understand the implementation if everything is in one place.

Split out, I think I did all the git stuff correct. I'll wait to make a PR on the movement parts since the comments for the ones that I'm considering moving would not be moved in that branch, making a mess. Those changes are now in this branch

I guess in the end I prefer the comments on the implementation, since that's what I usually want to see. Having them gets rid of a jump or two, since while it's not too hard it does take a while to navigate through the links and find what you're looking for. The unfortunate part of having the comments on the implementations is that it's harder to get to the docs if that's your goal, but oh well.

MichaReiser · 2025-06-24T19:13:45Z

Split out, I think I did all the git stuff correct.

Thank you! And sorry for the extra work.

The unfortunate part of having the comments on the implementations is that it's harder to get to the docs if that's your goal, but oh well.

Yeah, there's no perfect solution. What I do is that I search where we emit the diagnostic and then use Go to definition to jump to the violation definition. I find that easier than going from the violation definition to the implementation (IDEs are slower at finding all references)

ntBre

Awesome, thank you!

@ntBre

## Summary  From @ntBre #18906 (comment) : > This could be a good target for a follow-up PR, but we could fold these `if checker.is_rule_enabled { checker.report_diagnostic` checks into calls to `checker.report_diagnostic_if_enabled`. I didn't notice these when adding that method. > > Also, the docs on `Checker::report_diagnostic_if_enabled` and `LintContext::report_diagnostic_if_enabled` are outdated now that the `Rule` conversion is basically free 😅 > > No pressure to take on this refactor, just an idea if you're interested! This PR folds those calls. I also updated the doc comments by copying from `report_diagnostic`. Note: It seems odd to me that the doc comment for `Checker` says `Diagnostic` while `LintContext` says `OldDiagnostic`, not sure if that needs a bigger docs change to fix the inconsistency. <details> <summary>Python script to do the changes</summary> This script assumes it is placed in the top level `ruff` directory (ie next to `.git`/`crates`/`README.md`) ```py import re from copy import copy from pathlib import Path ruff_crates = Path(__file__).parent / "crates" for path in ruff_crates.rglob("**/*.rs"): with path.open(encoding="utf-8", newline="") as f: original_content = f.read() if "is_rule_enabled" not in original_content or "report_diagnostic" not in original_content: continue original_content_position = 0 changed_content = "" for match in re.finditer(r"(?m)(?:^[ \n]*|(?<=(?P<else>else )))if[ \n]+checker[ \n]*\.is_rule_enabled\([ \n]*Rule::\w+[ \n]*\)[ \n]*{[ \n]*checker\.report_diagnostic\(", original_content): # Content between last match and start of this one is unchanged changed_content += original_content[original_content_position:match.start()] # If this was an else if, a { needs to be added at the start if match.group("else"): changed_content += "{" # This will result in bad formatting, but the precommit cargo format will handle it changed_content += "checker.report_diagnostic_if_enabled(" # Depth tracking would fail if a string/comment included a { or }, but unlikely given the context depth = 1 position = match.end() while depth > 0: if original_content[position] == "{": depth += 1 if original_content[position] == "}": depth -= 1 position += 1 # pos - 1 is the closing } changed_content += original_content[match.end():position - 1] # If this was an else if, a } needs to be added at the end if match.group("else"): changed_content += "}" # Skip the closing } original_content_position = position if original_content[original_content_position] == "\n": # If the } is followed by a \n, also skip it for better formatting original_content_position += 1 # Add remaining content between last match and file end changed_content += original_content[original_content_position:] with path.open("w", encoding="utf-8", newline="") as f: f.write(changed_content) ``` </details> ## Test Plan  N/A, no tests/functionality affected.

## Summary  Here's the part that was split out of #18906. I wanted to move these into the rule files since the rest of the rules in `deferred_scope`/`statement` have that same structure of implementations being in the rule definition file. It also resolves the dilemma of where to put the comment, at least for these rules. ## Test Plan  N/A, no test/functionality affected

ntBre reviewed Jun 24, 2025

View reviewed changes

crates/ruff_linter/src/checkers/ast/analyze/statement.rs Show resolved Hide resolved

ntBre added the internal An internal refactor or improvement label Jun 24, 2025

MeGaGiGaGon added 2 commits June 24, 2025 10:15

Non-movement comment adds

ca95ecd

Non-movement tab to space

57f62cd

MeGaGiGaGon force-pushed the add-rule-code-comments branch from 0b9820b to 57f62cd Compare June 24, 2025 17:36

Missing comments from movement branch

0dfff53

MeGaGiGaGon mentioned this pull request Jun 24, 2025

[Internal] Use more report_diagnostic_if_enabled #18924

Merged

ntBre approved these changes Jun 25, 2025

View reviewed changes

ntBre merged commit 90f47e9 into astral-sh:main Jun 25, 2025
35 checks passed

MeGaGiGaGon deleted the add-rule-code-comments branch June 25, 2025 01:42

MeGaGiGaGon mentioned this pull request Jun 25, 2025

Move big rule implementations #18931

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Add missing rule code comments#18906

Add missing rule code comments#18906
ntBre merged 3 commits intoastral-sh:mainfrom
MeGaGiGaGon:add-rule-code-comments

MeGaGiGaGon commented Jun 24, 2025

Uh oh!

github-actions bot commented Jun 24, 2025 •

edited

Loading

Uh oh!

ntBre left a comment

Uh oh!

Uh oh!

MichaReiser commented Jun 24, 2025 •

edited

Loading

Uh oh!

MeGaGiGaGon commented Jun 24, 2025

Uh oh!

MichaReiser commented Jun 24, 2025

Uh oh!

ntBre left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

MeGaGiGaGon commented Jun 24, 2025

Summary

Test Plan

Uh oh!

github-actions bot commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ruff-ecosystem results

Linter (stable)

Linter (preview)

Uh oh!

ntBre left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

MichaReiser commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MeGaGiGaGon commented Jun 24, 2025

Uh oh!

MichaReiser commented Jun 24, 2025

Uh oh!

ntBre left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions bot commented Jun 24, 2025 •

edited

Loading

`ruff-ecosystem` results

MichaReiser commented Jun 24, 2025 •

edited

Loading