Skip to content

Comments

Add missing rule code comments#18906

Merged
ntBre merged 3 commits intoastral-sh:mainfrom
MeGaGiGaGon:add-rule-code-comments
Jun 25, 2025
Merged

Add missing rule code comments#18906
ntBre merged 3 commits intoastral-sh:mainfrom
MeGaGiGaGon:add-rule-code-comments

Conversation

@MeGaGiGaGon
Copy link
Contributor

Summary

While making some of my other changes, I noticed some of the lints were missing comments with their lint code/had the wrong numbered lint code. These comments are super useful since they allow for very easily and quickly finding the source code of a lint, so I decided to try and normalize them.

Most of them were fairly straightforward, just adding a doc comment/comment in the appropriate place.

I decided to make all of the Pylint rules have the PL prefix. Previously it was split between no prefix and having prefix, but I decided to normalize to with prefix since that's what's in the docs, and the with prefix will show up on no prefix searches, while the reverse is not true.

I also ran into a lot of rules with implementations in "non-standard" places (where "standard" means inside a file matching the glob crates/ruff_linter/rules/*/rules/**/*.rs and/or the same rule file where the rule struct/ViolationMetadata is defined).

I decided to move all the implementations out of crates/ruff_linter/src/checkers/ast/analyze/deferred_scopes.rs and into their own files, since that is what the rest of the rules in deferred_scopes.rs did, and those were just the outliers.

There were several rules which I did not end up moving, which you can see as the extra paths I had to add to my python code besides the "standard" glob. These rules are generally the error-type rules that just wrap an error from the parser, and have very small implementations/are very tightly linked to the module they are in, and generally every rule of that type was implemented in module instead of in the "standard" place.

Resolving that requires answering a question I don't think I'm equipped to handle: Is the point of these comments to give quick access to the rule definition/docs, or the rule implementation? For all the rules with implementations in the "standard" location this isn't a problem, as they are the same, but it is an issue for all of these error type rules. In the end I chose to leave the implementations where they were, but I'm not sure if that was the right choice.

Python script I wrote to find missing comments

This script assumes it is placed in the top level ruff directory (ie next to .git/crates/README.md)

import re
from copy import copy
from pathlib import Path

linter_to_code_prefix = {
    "Airflow": "AIR",
    "Eradicate": "ERA",
    "FastApi": "FAST",
    "Flake82020": "YTT",
    "Flake8Annotations": "ANN",
    "Flake8Async": "ASYNC",
    "Flake8Bandit": "S",
    "Flake8BlindExcept": "BLE",
    "Flake8BooleanTrap": "FBT",
    "Flake8Bugbear": "B",
    "Flake8Builtins": "A",
    "Flake8Commas": "COM",
    "Flake8Comprehensions": "C4",
    "Flake8Copyright": "CPY",
    "Flake8Datetimez": "DTZ",
    "Flake8Debugger": "T10",
    "Flake8Django": "DJ",
    "Flake8ErrMsg": "EM",
    "Flake8Executable": "EXE",
    "Flake8Fixme": "FIX",
    "Flake8FutureAnnotations": "FA",
    "Flake8GetText": "INT",
    "Flake8ImplicitStrConcat": "ISC",
    "Flake8ImportConventions": "ICN",
    "Flake8Logging": "LOG",
    "Flake8LoggingFormat": "G",
    "Flake8NoPep420": "INP",
    "Flake8Pie": "PIE",
    "Flake8Print": "T20",
    "Flake8Pyi": "PYI",
    "Flake8PytestStyle": "PT",
    "Flake8Quotes": "Q",
    "Flake8Raise": "RSE",
    "Flake8Return": "RET",
    "Flake8Self": "SLF",
    "Flake8Simplify": "SIM",
    "Flake8Slots": "SLOT",
    "Flake8TidyImports": "TID",
    "Flake8Todos": "TD",
    "Flake8TypeChecking": "TC",
    "Flake8UnusedArguments": "ARG",
    "Flake8UsePathlib": "PTH",
    "Flynt": "FLY",
    "Isort": "I",
    "McCabe": "C90",
    "Numpy": "NPY",
    "PandasVet": "PD",
    "PEP8Naming": "N",
    "Perflint": "PERF",
    "Pycodestyle": "",
    "Pydoclint": "DOC",
    "Pydocstyle": "D",
    "Pyflakes": "F",
    "PygrepHooks": "PGH",
    "Pylint": "PL",
    "Pyupgrade": "UP",
    "Refurb": "FURB",
    "Ruff": "RUF",
    "Tryceratops": "TRY",
}

ruff = Path(__file__).parent / "crates"

ruff_linter = ruff / "ruff_linter" / "src"

code_to_rule_name = {}

with open(ruff_linter / "codes.rs") as codes_file:
    for linter, code, rule_name in re.findall(
        # The (?<! skips ruff test rules
        # Only Preview|Stable rules are checked
        r"(?<!#\[cfg\(any\(feature = \"test-rules\", test\)\)\]\n)        \((\w+), \"(\w+)\"\) => \(RuleGroup::(?:Preview|Stable), [\w:]+::(\w+)\)",
        codes_file.read(),
    ):
        code_to_rule_name[linter_to_code_prefix[linter] + code] = (rule_name, [])

ruff_linter_rules = ruff_linter / "rules"
for rule_file_path in [
    *ruff_linter_rules.rglob("*/rules/**/*.rs"),
    ruff / "ruff_python_parser" / "src" / "semantic_errors.rs",
    ruff_linter / "pyproject_toml.rs",
    ruff_linter / "checkers" / "noqa.rs",
    ruff_linter / "checkers" / "ast" / "mod.rs",
    ruff_linter / "checkers" / "ast" / "analyze" / "unresolved_references.rs",
    ruff_linter / "checkers" / "ast" / "analyze" / "expression.rs",
    ruff_linter / "checkers" / "ast" / "analyze" / "statement.rs",
]:
    with open(rule_file_path, encoding="utf-8") as f:
        rule_file_content = f.read()
    for code, (rule, _) in copy(code_to_rule_name).items():
        if rule in rule_file_content:
            if f"// {code}" in rule_file_content or f", {code}" in rule_file_content:
                del code_to_rule_name[code]
            else:
                code_to_rule_name[code][1].append(rule_file_path)

for code, rule in code_to_rule_name.items():
    print(code, rule[0])
    for path in rule[1]:
        print(path)

Test Plan

N/A, no tests/functionality affected.

@github-actions
Copy link
Contributor

github-actions bot commented Jun 24, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I always grep for the rule codes first, so this will definitely help me. I'm curious if @MichaReiser has any strong feelings about moving the rule implementations, but I think that makes sense to me too.

Resolving that requires answering a question I don't think I'm equipped to handle: Is the point of these comments to give quick access to the rule definition/docs, or the rule implementation?

If I had to pick one, I'd probably say implementation. The implementation is virtually guaranteed to include a reference to the Violation struct, and it's a bit easier to jump to definition on that struct than find all references from the struct definition, in my opinion. So I think you put them in the right place, for my tastes at least.

I also think it made sense to leave those very short rules in place. They're basically one-liners, especially if we switch to report_diagnostic_if_enabled.

@ntBre ntBre added the internal An internal refactor or improvement label Jun 24, 2025
@MichaReiser
Copy link
Member

MichaReiser commented Jun 24, 2025

I'd prefer if we can split the PR into

  1. Add the missing rule comments
  2. Anything changing the structure

That makes the discussion easier. I personally don't find "standard locations" to be that important to find an implementation because I use go to definition or find all references to find the implementation. It's also often easier to understand the implementation if everything is in one place.

@MeGaGiGaGon MeGaGiGaGon force-pushed the add-rule-code-comments branch from 0b9820b to 57f62cd Compare June 24, 2025 17:36
@MeGaGiGaGon
Copy link
Contributor Author

I'd prefer if we can split the PR into

  1. Add the missing rule comments
  2. Anything changing the structure

That makes the discussion easier. I personally don't find "standard locations" to be that important to find an implementation because I use go to definition or find all references to find the implementation. It's also often easier to understand the implementation if everything is in one place.

Split out, I think I did all the git stuff correct. I'll wait to make a PR on the movement parts since the comments for the ones that I'm considering moving would not be moved in that branch, making a mess. Those changes are now in this branch

I guess in the end I prefer the comments on the implementation, since that's what I usually want to see. Having them gets rid of a jump or two, since while it's not too hard it does take a while to navigate through the links and find what you're looking for. The unfortunate part of having the comments on the implementations is that it's harder to get to the docs if that's your goal, but oh well.

@MichaReiser
Copy link
Member

Split out, I think I did all the git stuff correct.

Thank you! And sorry for the extra work.

The unfortunate part of having the comments on the implementations is that it's harder to get to the docs if that's your goal, but oh well.

Yeah, there's no perfect solution. What I do is that I search where we emit the diagnostic and then use Go to definition to jump to the violation definition. I find that easier than going from the violation definition to the implementation (IDEs are slower at finding all references)

Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you!

@ntBre ntBre merged commit 90f47e9 into astral-sh:main Jun 25, 2025
35 checks passed
@MeGaGiGaGon MeGaGiGaGon deleted the add-rule-code-comments branch June 25, 2025 01:42
ntBre pushed a commit that referenced this pull request Jun 25, 2025
<!--
Thank you for contributing to Ruff/ty! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title? (Please prefix
with `[ty]` for ty pull
  requests.)
- Does this pull request include references to any relevant issues?
-->

## Summary

<!-- What's the purpose of the change? What does it do, and why? -->

From @ntBre
#18906 (comment) :
> This could be a good target for a follow-up PR, but we could fold
these `if checker.is_rule_enabled { checker.report_diagnostic` checks
into calls to `checker.report_diagnostic_if_enabled`. I didn't notice
these when adding that method.
> 
> Also, the docs on `Checker::report_diagnostic_if_enabled` and
`LintContext::report_diagnostic_if_enabled` are outdated now that the
`Rule` conversion is basically free 😅
> 
> No pressure to take on this refactor, just an idea if you're
interested!

This PR folds those calls. I also updated the doc comments by copying
from `report_diagnostic`.

Note: It seems odd to me that the doc comment for `Checker` says
`Diagnostic` while `LintContext` says `OldDiagnostic`, not sure if that
needs a bigger docs change to fix the inconsistency.

<details>
<summary>Python script to do the changes</summary>

This script assumes it is placed in the top level `ruff` directory (ie
next to `.git`/`crates`/`README.md`)

```py
import re
from copy import copy
from pathlib import Path

ruff_crates = Path(__file__).parent / "crates"

for path in ruff_crates.rglob("**/*.rs"):
    with path.open(encoding="utf-8", newline="") as f:
        original_content = f.read()
    if "is_rule_enabled" not in original_content or "report_diagnostic" not in original_content:
        continue
    original_content_position = 0
    changed_content = ""
    for match in re.finditer(r"(?m)(?:^[ \n]*|(?<=(?P<else>else )))if[ \n]+checker[ \n]*\.is_rule_enabled\([ \n]*Rule::\w+[ \n]*\)[ \n]*{[ \n]*checker\.report_diagnostic\(", original_content):
        # Content between last match and start of this one is unchanged
        changed_content += original_content[original_content_position:match.start()]
        # If this was an else if, a { needs to be added at the start
        if match.group("else"):
            changed_content += "{"
        # This will result in bad formatting, but the precommit cargo format will handle it
        changed_content += "checker.report_diagnostic_if_enabled("
        # Depth tracking would fail if a string/comment included a { or }, but unlikely given the context
        depth = 1
        position = match.end()
        while depth > 0:
            if original_content[position] == "{":
                depth += 1
            if original_content[position] == "}":
                depth -= 1
            position += 1
        # pos - 1 is the closing }
        changed_content += original_content[match.end():position - 1]
        # If this was an else if, a } needs to be added at the end
        if match.group("else"):
            changed_content += "}"
        # Skip the closing }
        original_content_position = position
        if original_content[original_content_position] == "\n":
            # If the } is followed by a \n, also skip it for better formatting
            original_content_position += 1
    # Add remaining content between last match and file end
    changed_content += original_content[original_content_position:]
    with path.open("w", encoding="utf-8", newline="") as f:
        f.write(changed_content)
```

</details>

## Test Plan

<!-- How was it tested? -->

N/A, no tests/functionality affected.
ntBre pushed a commit that referenced this pull request Jun 25, 2025
<!--
Thank you for contributing to Ruff/ty! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title? (Please prefix
with `[ty]` for ty pull
  requests.)
- Does this pull request include references to any relevant issues?
-->

## Summary

<!-- What's the purpose of the change? What does it do, and why? -->

Here's the part that was split out of #18906. I wanted to move these
into the rule files since the rest of the rules in
`deferred_scope`/`statement` have that same structure of implementations
being in the rule definition file. It also resolves the dilemma of where
to put the comment, at least for these rules.

## Test Plan

<!-- How was it tested? -->

N/A, no test/functionality affected
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal An internal refactor or improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants