Handle multiple unused submodule imports by squiddy · Pull Request #666 · astral-sh/ruff

squiddy · 2022-11-10T06:03:22Z

I've got the tests and the right behaviour in place, but the code is not pretty right now.

Cleanup code

This resolves the issue where ruff doesn't see any unused imports in code like this:

```
import multiprocessing.pool
import multiprocessing.process
z = multiprocessing.pool.ThreadPool()
```

The underlying issue was twofold:

1. When tracking submodule imports, we used the first part of the module (multiprocessing in this example) as the key in the bindings map. Therefore both submodule imports would end up as one entry, with no way to differentiate between them.
2. When tracking references to imports, we only looked at Name expressions. A compound (attribute) expression such as multiprocessing.process resulted in two attempts to mark names/imports used: one for multiprocessing and one for process. Marking multiprocessing used together with the issue in 1) meant that all imports of multiprocessing were marked as used.
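
To make the key collision concrete, here is a minimal Python sketch. It is a toy model of a bindings map, not ruff's actual Rust data structures; the helper names `bind_by_first_component` and `bind_by_full_name` are hypothetical and exist only for illustration.

```python
# Toy model of a bindings map (hypothetical; ruff's real implementation is in Rust).
imports = ["multiprocessing.pool", "multiprocessing.process"]


def bind_by_first_component(modules):
    """Key each import by the first part of its dotted name (the old behaviour)."""
    bindings = {}
    for module in modules:
        key = module.split(".")[0]
        # In this toy model, a later import with the same key overwrites the earlier one.
        bindings[key] = {"module": module, "used": False}
    return bindings


def bind_by_full_name(modules):
    """Key each import by its full dotted name (the proposed behaviour)."""
    return {module: {"module": module, "used": False} for module in modules}


old = bind_by_first_component(imports)
new = bind_by_full_name(imports)

# Marking the name `multiprocessing` as used covers the single shared entry,
# so nothing is left to report as unused.
old["multiprocessing"]["used"] = True
old_unused = [key for key, binding in old.items() if not binding["used"]]

# With full names as keys, each submodule import can be tracked separately.
new["multiprocessing.pool"]["used"] = True
new_unused = [key for key, binding in new.items() if not binding["used"]]
```

In this model `old` holds a single entry for both imports, while `new` keeps `multiprocessing.process` distinguishable and therefore reportable.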

The solution:

1. For submodule imports, use the full name as key in the bindings map.
2. Detect Attribute access (such as foo.bar.baz, but not foo.bar["baz"]) and use that to check if it matches an import. If the visitor is not in such an Attribute in the tree right now, keep looking at Name expressions.

With an attribute access like multiprocessing.pool.ThreadPool, iterate through all parts of it and try to look them up in bindings. This is necessary because certain modules can have different access patterns, e.g.
```
import os
os.path.exists("foo")
```

instead of

```
import os.path
os.path.exists("foo")
```
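
The lookup described above can be sketched in Python with the standard `ast` module. This is only an illustration of the idea under simplifying assumptions (plain `import` statements, a single scope, no `from` imports), not the code in this PR; `dotted_name` and `unused_imports` are hypothetical helper names.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for a pure Name/Attribute chain, or None otherwise."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # the chain is rooted in a call, a subscript, etc.


def unused_imports(source):
    """Return the sorted names of `import` bindings that are never referenced."""
    tree = ast.parse(source)

    # Step 1: key every import by its full dotted name (or its alias).
    used = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                used.setdefault(alias.asname or alias.name, False)

    # Step 2: for every Name/Attribute chain, try each prefix against the bindings.
    for node in ast.walk(tree):
        if isinstance(node, (ast.Attribute, ast.Name)):
            name = dotted_name(node)
            if name is None:
                continue
            parts = name.split(".")
            for end in range(1, len(parts) + 1):
                prefix = ".".join(parts[:end])
                if prefix in used:
                    used[prefix] = True

    return sorted(name for name, is_used in used.items() if not is_used)


example = (
    "import multiprocessing.pool\n"
    "import multiprocessing.process\n"
    "z = multiprocessing.pool.ThreadPool()\n"
)
result = unused_imports(example)
```

Trying every prefix is what lets both `import os` and `import os.path` count as used by `os.path.exists("foo")`, while `multiprocessing.process` in the example is still reported.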

Closes #60

charliermarsh · 2022-11-11T00:39:19Z

Awesome! Will review this soon!

charliermarsh · 2022-11-11T22:08:30Z

@squiddy - Re-reading the summary -- do you want a code review on this, or did you wanna do any cleanup first?

squiddy · 2022-11-12T05:34:17Z

I should have been more specific. I'd really appreciate a code review on the approach itself; the cleanup part is primarily about the changes to handle_node_load.

charliermarsh · 2022-11-12T17:15:56Z

You got it :)

charliermarsh · 2022-11-13T15:54:48Z

(Overall, what you've written in the summary makes sense. Haven't read the code yet.)

src/check_ast.rs

charliermarsh · 2022-11-20T22:05:33Z

@squiddy - Would you like me to clean up based on the comments and get this merged? Totally up to you.

squiddy · 2022-11-21T05:53:03Z

> @squiddy - Would you like me to clean up based on the comments and get this merged? Totally up to you.

I've given it a try, thanks for the review.

charliermarsh · 2022-11-26T21:28:52Z

Cool, gonna profile this tonight to make sure it's not a significant regression, then we can merge.

charliermarsh · 2022-11-27T04:53:34Z

Introduces a small slowdown so wanna see if I can optimize this a bit, then will merge.

squiddy · 2022-12-09T10:40:21Z

@charliermarsh I know you're very busy - good evidence of the success of ruff - so I'm not being pushy here.

Can I help you move this forward? If it's the slowdown that is blocking this, how would I go about measuring that? Is it just hyperfine + running ruff on the cmdline, or did you do some in-depth profiling?

I'm happy for anything you can share that helps me help you. :)

charliermarsh · 2022-12-09T21:29:11Z

Thanks for this really kind and understanding message (and not pushy at all). I've been a bad maintainer on this one, asking you to make changes and then failing to follow up, so I apologize.

I did try to get this across the finish line once or twice but was struggling to eke out any more performance gains and felt mixed about introducing a performance penalty. But the blocker now is that I need to do a fairly substantial rebase due to #1147 and #1137. If you're up for trying to handle that, it'd be appreciated. But otherwise, I'll continue to try to get to this when I can.

squiddy · 2022-12-10T04:09:21Z

I'm doing the rebase now and will see what else I can do.

src/pyflakes/mod.rs


squiddy · 2022-12-10T07:29:30Z

Currently experimenting with putting this attribute-specific logic after the "simple" name-based lookup, under the assumption that Name expressions are far more common outside of an Attribute (e.g. in function calls or variable lookups) than inside one.

Benchmarks against CPython look promising, but I need to fix one more test. 🤞

squiddy · 2022-12-26T07:18:23Z

Closing for now. I haven't made any good progress, might look into that again at some later point.

…ed-import` (`F401`) (#20200)

# Summary

The PR under review attempts to make progress towards the age-old problem of submodule imports, specifically with regards to their treatment by the rule [`unused-import` (`F401`)](https://docs.astral.sh/ruff/rules/unused-import/).

Some related issues:

- #60
- #4656

Prior art:

- #13965
- #5010
- #5011
- #666

See the PR summary for a detailed description.

charliermarsh marked this pull request as ready for review November 12, 2022 17:15

charliermarsh reviewed Nov 13, 2022


squiddy force-pushed the handle-multiple-unused-submodule-imports branch 3 times, most recently from 21f5f1e to d61fcbf Compare November 26, 2022 08:44

squiddy force-pushed the handle-multiple-unused-submodule-imports branch from 598b3b4 to fb93856 Compare December 10, 2022 04:24


squiddy force-pushed the handle-multiple-unused-submodule-imports branch from fb93856 to 4fdf7bd Compare December 10, 2022 04:46

squiddy closed this Dec 26, 2022

dylwil3 mentioned this pull request Sep 12, 2025

[pyflakes] Handle some common submodule import situations for unused-import (F401) #20200

Merged
