Skip to content

Comments

[ty] Offer "Did you mean...?" suggestions for unresolved from imports and unresolved attributes#18705

Merged
AlexWaygood merged 22 commits intomainfrom
alex-brent/did-you-mean
Jun 17, 2025
Merged

[ty] Offer "Did you mean...?" suggestions for unresolved from imports and unresolved attributes#18705
AlexWaygood merged 22 commits intomainfrom
alex-brent/did-you-mean

Conversation

@AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood commented Jun 16, 2025

Summary

This PR (a collaboration with @ntBre) adds "Did you mean...?" suggestions for unresolved from imports and unresolved attributes. For example:

image

The implementation uses the Levenshtein distance algorithm to calculate which possible member of the type is a closest match to the unresolved member that triggered the diagnostic. This specific Levenshtein implementation is an almost exact 1:1 port of CPython's implementation here, and many of the tests in this PR are also ported from CPython's tests. You can see CPython's Levenshtein implementation in action if you access a nonexistent attribute or import in the REPL:

Python 3.13.1 (main, Jan  3 2025, 12:04:03) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import collections
>>> collections.dequee
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    collections.dequee
AttributeError: module 'collections' has no attribute 'dequee'. Did you mean: 'deque'?
>>> from asyncio import gett_event_lop
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    from asyncio import gett_event_lop
ImportError: cannot import name 'gett_event_lop' from 'asyncio' (/Users/alexw/.pyenv/versions/3.13.1/lib/python3.13/asyncio/__init__.py). Did you mean: 'get_event_loop'?
>>> 

Our motivation for copying CPython's implementation almost exactly is that CPython has had this feature for several Python versions now, and during that time many bug reports have been filed regarding incorrect suggestions, which have since been fixed. This implementation is thus very well "battle-tested" by this point; we can say with a reasonable degree of confidence that it gives good suggestions for unresolved members in the Python context.

Test Plan

  • Unit tests in levenshtein.rs
  • Mdtest integration tests

Co-authored-by: Brent Westbrook [email protected]

@AlexWaygood AlexWaygood added ty Multi-file analysis & type inference diagnostics Related to reporting of diagnostics. labels Jun 16, 2025
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module has been renamed from ide_support to all_members because it's no longer just used for autocompletions; it's now also used for diagnostic suggestions.

@AlexWaygood AlexWaygood requested a review from MichaReiser as a code owner June 16, 2025 12:28
@AlexWaygood
Copy link
Member Author

AlexWaygood commented Jun 16, 2025

Mypy_primer is failing because this .unwrap() call is panicking when checking sympy:

let name = instance_attribute.sub_segments()[0].as_member().unwrap();

Which seems like it is probably a pre-existing bug exposed by this PR (but we obviously won't be able to land this PR without fixing it!).

@AlexWaygood AlexWaygood force-pushed the alex-brent/did-you-mean branch from 76a50bf to f7cdb88 Compare June 16, 2025 12:52
@AlexWaygood
Copy link
Member Author

Minimal repro for the primer panic on this branch:

class Point:
    def orthogonal_direction(self):
        self[0].is_zero


def test_point(p2: Point):
    p2.y

@AlexWaygood
Copy link
Member Author

AlexWaygood commented Jun 16, 2025

And yes, it's easy to trigger the panic on main by adding this test:

diff --git a/crates/ty_ide/src/completion.rs b/crates/ty_ide/src/completion.rs
index a8b93080b4..9f0a013b31 100644
--- a/crates/ty_ide/src/completion.rs
+++ b/crates/ty_ide/src/completion.rs
@@ -1835,6 +1835,23 @@ f"{f.<CURSOR>
         test.assert_completions_include("method");
     }
 
+    #[test]
+    fn no_panic_for_attribute_table_that_contains_subscript() {
+        let test = cursor_test(
+            r#"
+class Point:
+    def orthogonal_direction(self):
+        self[0].is_zero
+
+def test_point(p2: Point):
+    p2.<CURSOR>
+"#,
+        );
+
+        let completions = test.completions();
+        assert_eq!(completions, "orthogonal_direction");
+    }

@AlexWaygood
Copy link
Member Author

The primer panic should be fixed by #18707, but I won't attempt to rebase this PR branch on top of that as the commit history got somewhat messy here 😆

@AlexWaygood
Copy link
Member Author

Mypy_primer is now timing out, but it appears that it ran ty successfully on all selected projects. I'm guessing it's taking ages to compute the diff between old_ty's results and new_ty's results, because there are many, many diagnostics where the message will change like so:

- error[unresolved-attribute] Type `Foo` has no attribute `bar`
+ error[unresolved-attribute] Type `Foo` has no attribute `bar`: Did you mean `baz`?

So this PR is ready for review!

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the tests and did a high-level review of the implementation.

This will probably make us look a lot slower on early benchmarks of codebases where we raise a lot of AttributeError false positives, but so it goes; we should eliminate the false positive AttributeErrors.

Very cool feature!

}

// Filter out the unresolved member itself.
// Otherwise (due to our implementation of implicit instance attributes),
Copy link
Contributor

@carljm carljm Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this. It sounds like this is saying that we consider an access of self.attribute (no assignment) sufficient to include attribute in all_members? Do we really do that? Why?

Copy link
Member Author

@AlexWaygood AlexWaygood Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like this is saying that we consider an access of self.attribute (no assignment) sufficient to include attribute in all_members? Do we really do that?

That does appear to be the case, yes. If I make this diff to the PR branch:

diff --git a/crates/ty_python_semantic/src/types/diagnostic/levenshtein.rs b/crates/ty_python_semantic/src/types/diagnostic/levenshtein.rs
index 11fc59674e..ff1fc86a28 100644
--- a/crates/ty_python_semantic/src/types/diagnostic/levenshtein.rs
+++ b/crates/ty_python_semantic/src/types/diagnostic/levenshtein.rs
@@ -66,18 +66,6 @@ where
         return None;
     }
 
-    // Filter out the unresolved member itself.
-    // Otherwise (due to our implementation of implicit instance attributes),
-    // we end up giving bogus suggestions like this:
-    //
-    // ```python
-    // class Foo:
-    //     _attribute = 42
-    //     def bar(self):
-    //         print(self.attribute)  # error: unresolved attribute `attribute`; did you mean `attribute`?
-    // ```
-    let options = options.filter(|name| name != unresolved_member);
-

then we get these changes to the snapshots:

image

And those seem to be self-evidently worse "Did you mean?" suggestions 😆

Why?

I'm not sure. I agree that it seems a bit suspect. @sharkdp, do you have any ideas off the top of your head for why attribute might be included in the list returned by all_members here? If not, I can look into this as a followup.

@AlexWaygood AlexWaygood merged commit 913f136 into main Jun 17, 2025
34 of 35 checks passed
@AlexWaygood AlexWaygood deleted the alex-brent/did-you-mean branch June 17, 2025 10:10
AlexWaygood added a commit that referenced this pull request Jun 17, 2025
…m` imports and unresolved attributes (#18705)"

This reverts commit 913f136.
AlexWaygood added a commit that referenced this pull request Jun 17, 2025
carljm added a commit to MatthewMckee4/ruff that referenced this pull request Jun 17, 2025
…ence

* main: (71 commits)
  Bump 0.12.0 (astral-sh#18724)
  Revert "[ty] Offer "Did you mean...?" suggestions for unresolved `from` imports and unresolved attributes (astral-sh#18705)" (astral-sh#18721)
  [`flake8-return`] Stabilize only add `return None` at the end when fixing `implicit-return` (`RET503`) (astral-sh#18516)
  [`pyupgrade`] Stabilize `non-pep695-generic-function` (`UP047`) (astral-sh#18524)
  [`pyupgrade`] Stabilize `non-pep695-generic-class` (`UP046`) (astral-sh#18519)
  [`pandas-vet`] Deprecate `pandas-df-variable-name` (`PD901`) (astral-sh#18618)
  [`flake8-bandit`] Remove `suspicious-xmle-tree-usage` (`S320`) (astral-sh#18617)
  Stabilize `dataclass-enum` (`RUF049`) (astral-sh#18570)
  Stabilize `unnecessary-dict-index-lookup` (`PLR1733`) (astral-sh#18571)
  Remove rust-toolchain.toml from sdist (astral-sh#17925)
  Stabilize `starmap-zip` (`RUF058`) (astral-sh#18525)
  [`flake8-logging`] Stabilize `exc-info-outside-except-handler` (`LOG014`) (astral-sh#18517)
  [`pyupgrade`] Stabilize `non-pep604-annotation-optional` (`UP045`) and preview behavior for `non-pep604-annotation-union` (`UP007`) (astral-sh#18505)
  Stabilize `pytest-warns-too-broad` (`PT030`) (astral-sh#18568)
  Stabilize `for-loop-writes` (`FURB122`) (astral-sh#18565)
  Stabilize `pytest-warns-with-multiple-statements` (`PT031`) (astral-sh#18569)
  Stabilize `pytest-parameter-with-default-argument` (`PT028`) (astral-sh#18566)
  Stabilize `nan-comparison` (`PLW0177`) (astral-sh#18559)
  Stabilize `check-and-remove-from-set` (`FURB132`) (astral-sh#18560)
  Stabilize `unnecessary-round` (`RUF057`) (astral-sh#18563)
  ...
AlexWaygood added a commit that referenced this pull request Jun 17, 2025
…lved `from` imports and unresolved attributes (#18705)" (#18721)"

This reverts commit 685eac1.
AlexWaygood added a commit that referenced this pull request Jun 18, 2025
…lved `from` imports and unresolved attributes (#18705)" (#18721)"

This reverts commit 685eac1.
AlexWaygood added a commit that referenced this pull request Feb 16, 2026
…23291)

## Summary

For a couple of diagnostics currently, we add a "Did you mean...?"
diagnostic hint if it appears like there's an obvious typo that caused
us to emit an error. The "Did you mean...?" suggestion is generated via
the `strsim` Levenshtein implementation on `crates.io`.

This PR replaces the `strsim` implementation of Levenshtein used to
create these hints with a custom Levenshtein implementation based on the
one that CPython itself uses to create these hints:

```pycon
>>> class Foo:
...     xyxy = 42
...     
>>> Foo.xyxyz
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    Foo.xyxyz
AttributeError: type object 'Foo' has no attribute 'xyxyz'. Did you mean: 'xyxy'?
```

The added tests are also derived from CPython's test suite.

The motivation for copying CPython's implementation almost exactly is
that CPython has had this feature for several Python versions now, and
during that time many bug reports have been filed regarding incorrect
suggestions, which have since been fixed. This implementation is thus
very well "battle-tested" by this point; we can say with a reasonable
degree of confidence that it gives good suggestions for typos in the
Python context.

The ecosystem report on this PR bears out that this is an improvement.
We see bad suggestions going away:

```diff
- [error] invalid-key - Unknown key "pair" for TypedDict `RPCAnalyzedDFMsg` - did you mean "data"?
+ [error] invalid-key - Unknown key "pair" for TypedDict `RPCAnalyzedDFMsg`: Unknown key "pair"
```

and good suggestions being added:

```diff
- [error] invalid-key - Unknown key "old_entity_id" for TypedDict `_EventEntityRegistryUpdatedData_CreateRemove`: Unknown key "old_entity_id"
+ [error] invalid-key - Unknown key "old_entity_id" for TypedDict `_EventEntityRegistryUpdatedData_CreateRemove` - did you mean "entity_id"?
```

This Levenshtein implementation was originally proposed in #18705, and
then again in #18751. Those PRs also made other changes to use the
Levenshtein implementation in certain other areas, however, where
computing the list of suggestions to pass into the Levenshtein algorithm
turned out to be prohibitively expensive. This PR therefore _only_
updates the Levenshtein implementation being used for our existing
subdiagnostics, rather than expanding the callsites of the Levenshtein
implementation.

## Test plan

Unit tests have been added in `levenshtein.rs`. Some mdtests and
snapshots were updated to ensure that they still test what they're meant
to be testing, even with the new Levenshtein implementation.

Co-authored-by: Brent Westbrook <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

diagnostics Related to reporting of diagnostics. ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants