Skip to content

Comments

[isort] Check full module path against project root(s) when categorizing first-party#16565

Merged
dylwil3 merged 13 commits intoastral-sh:mainfrom
dylwil3:namespaces
May 5, 2025
Merged

[isort] Check full module path against project root(s) when categorizing first-party#16565
dylwil3 merged 13 commits intoastral-sh:mainfrom
dylwil3:namespaces

Conversation

@dylwil3
Copy link
Collaborator

@dylwil3 dylwil3 commented Mar 8, 2025

When attempting to determine whether import foo.bar.baz is a known first-party import relative to user-provided source paths, when preview is enabled we now check that SRC/foo/bar/baz is a directory or SRC/foo/bar/baz.py or SRC/foo/bar/baz.pyi exist.

Previously, we just checked the analogous thing for SRC/foo, but this can be misleading in situations with disjoint namespace packages that share a common base name (e.g. we may be working inside the namespace package foo.buzz and importing foo.bar from elsewhere).

Supersedes #12987
Closes #12984

Testing

We've added some unit tests displaying the different behavior in the function match_sources which implements this logic. In addition, here are the results of an ad-hoc, local test and comparison with isort. Notice that the preview behavior now coincides with isort - including the inconsistent classification of from imports.

(I've manually edited out some of the irrelevant logs and such in the outputs)

tree
.
├── pyproject.toml
└── src
    └── mynamespace
        └── subpackage_a
            └── __init__.py../target/debug/ruff check src/mynamespace/subpackage_a/__init__.py --no-cache --select I -v
[2025-04-13][12:33:24][ruff_linter::rules::isort::categorize][DEBUG] Categorized 'mynamespace.subpackage_b' as Known(FirstParty) (SourceMatch("/Users/dmbp/Documents/dev/contrib/ruff/exampleproj/src"))
[2025-04-13][12:33:24][ruff_linter::rules::isort::categorize][DEBUG] Categorized 'mynamespace' as Known(FirstParty) (SourceMatch("/Users/dmbp/Documents/dev/contrib/ruff/exampleproj/src"))
All checks passed!../target/debug/ruff check src/mynamespace/subpackage_a/__init__.py --no-cache --select I -v --preview
[2025-04-13][12:33:30][ruff_linter::rules::isort::categorize][DEBUG] Categorized 'mynamespace.subpackage_b' as Known(ThirdParty) (NoMatch)
[2025-04-13][12:33:30][ruff_linter::rules::isort::categorize][DEBUG] Categorized 'mynamespace' as Known(FirstParty) (SourceMatch("/Users/dmbp/Documents/dev/contrib/ruff/exampleproj/src"))
src/mynamespace/subpackage_a/__init__.py:1:1: I001 [*] Import block is un-sorted or un-formatted
  |
1 | / import mynamespace.subpackage_b
2 | | from mynamespace import subpackage_b
  | |____________________________________^ I001
  |
  = help: Organize imports

Found 1 error.
[*] 1 fixable with the `--fix` option.uvx isort . --check --diff -v --src src

else-type place_module for mynamespace.subpackage_b returned THIRDPARTY
from-type place_module for mynamespace returned FIRSTPARTY
ERROR: /Users/dmbp/Documents/dev/contrib/ruff/exampleproj/src/mynamespace/subpackage_a/__init__.py Imports are incorrectly sorted and/or formatted.
else-type place_module for mynamespace.subpackage_b returned THIRDPARTY
from-type place_module for mynamespace returned FIRSTPARTY
--- /Users/dmbp/Documents/dev/contrib/ruff/exampleproj/src/mynamespace/subpackage_a/__init__.py:before	2025-04-13 12:29:49.050993
+++ /Users/dmbp/Documents/dev/contrib/ruff/exampleproj/src/mynamespace/subpackage_a/__init__.py:after	2025-04-13 12:36:28.299245
@@ -1,2 +1,3 @@
 import mynamespace.subpackage_b
+
 from mynamespace import subpackage_b

Summary of Ecosystem Checks

snowcli

These are all true positives! The package in this repo is snowflake.cli, and the ecosystem reports all have to do with importing third-party packages like snowflake.snowpark and snowflake.connector.

airflow

Also true positives: the dependency docs.utils.conf_constants appears in various places and is correctly identified as third-party, whereas before it was considered first-party because a docs folder.

freedomofpress

This is a false positive because this repo contains a Python file that does not have a .py suffix, but instead uses the shebang specification: https://github.com/freedomofpress/securedrop/blob/develop/journalist_gui/SecureDropUpdater

mlflow

The change is a true positive, but there is a leftover false positive. The change concerns two imports:

import docker
from docker.errors import APIError, BuildError

On main, these are both incorrectly identified as first-party due to a local folder named docker. In this PR with preview enabled, docker.errors is correctly identified as third-party, which causes a lint error to be emitted.

pytest

The problematic import here is from py.path import error which is, unfortunately, incorrectly labeled as third-party in preview. This is because it is not possible to know this import is first party from the file system alone! Indeed, path is actually a submodule of a module called _py, and then we have:

# src/py.py
import _pytest._py.path as path

If that's not bad enough, if you go to src/_pytest/_py you will not find path/error instead you will find path.py which has this line:

# path.py
from . import error

I don't think it's in scope to support something like that here! Maybe one day ty will power our import sorting!

@dylwil3 dylwil3 added bug Something isn't working rule Implementing or modifying a lint rule isort Related to import sorting labels Mar 8, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+22 -13 violations, +0 -0 fixes in 5 projects; 50 projects unchanged)

Snowflake-Labs/snowcli (+15 -13 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ scripts/cleanup.py:14:1: I001 [*] Import block is un-sorted or un-formatted
- src/snowflake/cli/_plugins/connection/util.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/logs/manager.py:1:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/logs/utils.py:1:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/nativeapp/sf_sql_facade.py:14:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/project/manager.py:14:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/spcs/compute_pool/manager.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/spcs/image_repository/manager.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/sql/manager.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/sql/snowsql_commands.py:1:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/_plugins/stage/diff.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/api/cli_global_context.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/api/connections.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/api/entities/common.py:1:1: I001 [*] Import block is un-sorted or un-formatted
+ src/snowflake/cli/api/sql_execution.py:15:1: I001 [*] Import block is un-sorted or un-formatted
+ test_external_plugins/snowpark_hello_single_command/src/snowflakecli/test_plugins/snowpark_hello/manager.py:14:1: I001 [*] Import block is un-sorted or un-formatted
- tests/nativeapp/utils.py:15:1: I001 [*] Import block is un-sorted or un-formatted
- tests/test_data/projects/glob_patterns/main.py:1:1: I001 [*] Import block is un-sorted or un-formatted
- tests/test_data/projects/glob_patterns/src/app.py:1:1: I001 [*] Import block is un-sorted or un-formatted
- tests/test_data/projects/glob_patterns_zip/main.py:1:1: I001 [*] Import block is un-sorted or un-formatted
- tests/test_data/projects/glob_patterns_zip/src/app.py:1:1: I001 [*] Import block is un-sorted or un-formatted
- tests/test_main.py:17:1: I001 [*] Import block is un-sorted or un-formatted
- tests_integration/nativeapp/test_debug_mode.py:15:1: I001 [*] Import block is un-sorted or un-formatted
- tests_integration/nativeapp/test_telemetry_errors.py:14:1: I001 [*] Import block is un-sorted or un-formatted
- tests_integration/notebook/test_notebooks.py:15:1: I001 [*] Import block is un-sorted or un-formatted
- tests_integration/spcs/testing_utils/image_repository_utils.py:1:1: I001 [*] Import block is un-sorted or un-formatted
- tests_integration/test_config.py:15:1: I001 [*] Import block is un-sorted or un-formatted
- tests_integration/testing_utils/sql_utils.py:15:1: I001 [*] Import block is un-sorted or un-formatted

apache/airflow (+4 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ airflow-core/docs/conf.py:21:1: I001 [*] Import block is un-sorted or un-formatted
+ chart/docs/conf.py:21:1: I001 [*] Import block is un-sorted or un-formatted
+ docker-stack-docs/conf.py:21:1: I001 [*] Import block is un-sorted or un-formatted
+ providers-summary-docs/conf.py:21:1: I001 [*] Import block is un-sorted or un-formatted

freedomofpress/securedrop (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ journalist_gui/test_gui.py:1:1: I001 [*] Import block is un-sorted or un-formatted

mlflow/mlflow (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ tests/projects/utils.py:50:5: I001 [*] Import block is un-sorted or un-formatted

pytest-dev/pytest (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ testing/_py/test_local.py:2:1: I001 [*] Import block is un-sorted or un-formatted

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
I001 35 22 13 0 0

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

mesonbuild/meson-python (error)

warning: Detected debug build without --no-cache.
error: Failed to read tests/packages/symlinks/baz.py: No such file or directory (os error 2)
error: Failed to read tests/packages/symlinks/qux.py: No such file or directory (os error 2)

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

mesonbuild/meson-python (error)

ruff format --preview

warning: Detected debug build without --no-cache.
error: Failed to read tests/packages/symlinks/baz.py: No such file or directory (os error 2)
error: Failed to read tests/packages/symlinks/qux.py: No such file or directory (os error 2)

@dylwil3 dylwil3 force-pushed the namespaces branch 2 times, most recently from 5cb8c06 to 2eb5cd5 Compare March 16, 2025 20:28
@dylwil3 dylwil3 changed the title [isort] Only infer subpackages of namespace packages as first-party [isort] Check full module path against project root(s) when categorizing first-party Mar 27, 2025
@dylwil3 dylwil3 marked this pull request as ready for review March 27, 2025 21:46
@dylwil3 dylwil3 requested a review from AlexWaygood March 27, 2025 21:46
@MichaReiser
Copy link
Member

I think that PR slipped our attention last week. At least I didn't notice it :)

@MichaReiser
Copy link
Member

MichaReiser commented Apr 1, 2025

Did you click through the ecosystem checks?

I'm torn but I think this should be gated behind preview. The ecosystem fallouts are significant enough.

@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 1, 2025

I haven't forgotten about this. I'm still waging war against my notification queue after the on-site 😢

@dylwil3
Copy link
Collaborator Author

dylwil3 commented Apr 2, 2025

I think that PR slipped our attention last week. At least I didn't notice it :)
I haven't forgotten about this. I'm still waging war against my notification queue after the on-site 😢

No worries!

Did you click through the ecosystem checks?

I did quickly before, but going through them again I noticed two examples that need more investigation.

  • freedomofpress has a strange directory layout leading to a false positive, but it's possible that this is expected and they just need to specify a different tool.ruff.src
  • mlflow has a directory named docker with a Dockerfile in it... and imports a Python package named docker. This caused a silent error before where both imports docker and docker.errors were labeled as first party, but, since they were the only imports present, this had no effect on import sorting. But now we see that there is not docker/errors and label the second as third party.

I'm torn but I think this should be gated behind preview. The ecosystem fallouts are significant enough.

Good point - I'll try to sort out the least ugly way to do that...


Finally, I think I should try to fix the following limitation of the current approach. When we see

from foo import bar

the former approach would look for a foo directory or a foo.py file and call it a day. The present approach does the same - we don't look for a bar subdirectory because we don't know whether bar is supposed to be an imported object or module. In fact, one source of false positives in the ecosystem check, and the reason for introducing .module_source, was that categorize was being passed the fully qualified name foo.bar, failing to find foo/bar.py or foo/bar/, and then declaring the import third party. So I think I need to add some logic around this, otherwise we could get different categorizations for importing a submodule depending on whether you did

from foo import bar

or

import foo.bar as bar

I'm going to revert back to draft while I sort out all of the above. Sorry for the needless ping!

@dylwil3 dylwil3 marked this pull request as draft April 2, 2025 10:34
@dylwil3 dylwil3 force-pushed the namespaces branch 2 times, most recently from ca727da to 7ea88ab Compare April 13, 2025 15:39
@dylwil3 dylwil3 added the preview Related to preview mode features label Apr 13, 2025
@dylwil3 dylwil3 marked this pull request as ready for review April 13, 2025 17:39
@MichaReiser
Copy link
Member

I haven't looked at the code yet but could you update the PR summary or add a new comment that replies to your ecosystem findings and explains why we think they're correct. This will help us to remember the reasoning when writing the changelog in the next minor release

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to work on this. This looks good to me. I've a few smaller nit comments and I think we need to extend the documentation and the motivation for the change a bit before landing.

@dylwil3
Copy link
Collaborator Author

dylwil3 commented Apr 26, 2025

Note: There have been some other issues raised in #10519 but I think they are orthogonal to this fix (the issues are present both before and after the current change) so it's worth exploring solutions in a followup.

@AlexWaygood AlexWaygood removed their request for review May 5, 2025 13:04
@AlexWaygood
Copy link
Member

I basically lost all context on this because the work I did on it was such a long time ago :-(

if @MichaReiser is happy with this, I'm happy with it! I trust both of you to get it right 😃

@dylwil3 dylwil3 merged commit 965a4dd into astral-sh:main May 5, 2025
34 checks passed
AlexWaygood pushed a commit that referenced this pull request May 5, 2025
…izing first-party (#16565)

When attempting to determine whether `import foo.bar.baz` is a known
first-party import relative to [user-provided source
paths](https://docs.astral.sh/ruff/settings/#src), when `preview` is
enabled we now check that `SRC/foo/bar/baz` is a directory or
`SRC/foo/bar/baz.py` or `SRC/foo/bar/baz.pyi` exist.

Previously, we just checked the analogous thing for `SRC/foo`, but this
can be misleading in situations with disjoint namespace packages that
share a common base name (e.g. we may be working inside the namespace
package `foo.buzz` and importing `foo.bar` from elsewhere).

Supersedes #12987 
Closes #12984
Glyphack pushed a commit to Glyphack/ruff that referenced this pull request May 6, 2025
…izing first-party (astral-sh#16565)

When attempting to determine whether `import foo.bar.baz` is a known
first-party import relative to [user-provided source
paths](https://docs.astral.sh/ruff/settings/#src), when `preview` is
enabled we now check that `SRC/foo/bar/baz` is a directory or
`SRC/foo/bar/baz.py` or `SRC/foo/bar/baz.pyi` exist.

Previously, we just checked the analogous thing for `SRC/foo`, but this
can be misleading in situations with disjoint namespace packages that
share a common base name (e.g. we may be working inside the namespace
package `foo.buzz` and importing `foo.bar` from elsewhere).

Supersedes astral-sh#12987 
Closes astral-sh#12984
dylwil3 added a commit that referenced this pull request Sep 5, 2025
This stabilizes the behavior introduced in #16565 which (roughly) tries
to match an import like `import a.b.c` to an actual directory path
`a/b/c` in order to label it as first-party, rather than simply looking
for a directory `a`.

Mainly this affects the sorting of imports in the presence of namespace
packages, but a few other rules are affected as well.
ntBre pushed a commit that referenced this pull request Sep 8, 2025
This stabilizes the behavior introduced in #16565 which (roughly) tries
to match an import like `import a.b.c` to an actual directory path
`a/b/c` in order to label it as first-party, rather than simply looking
for a directory `a`.

Mainly this affects the sorting of imports in the presence of namespace
packages, but a few other rules are affected as well.
ntBre pushed a commit that referenced this pull request Sep 10, 2025
This stabilizes the behavior introduced in #16565 which (roughly) tries
to match an import like `import a.b.c` to an actual directory path
`a/b/c` in order to label it as first-party, rather than simply looking
for a directory `a`.

Mainly this affects the sorting of imports in the presence of namespace
packages, but a few other rules are affected as well.
ntBre pushed a commit that referenced this pull request Sep 10, 2025
This stabilizes the behavior introduced in #16565 which (roughly) tries
to match an import like `import a.b.c` to an actual directory path
`a/b/c` in order to label it as first-party, rather than simply looking
for a directory `a`.

Mainly this affects the sorting of imports in the presence of namespace
packages, but a few other rules are affected as well.
ntBre pushed a commit that referenced this pull request Sep 10, 2025
This stabilizes the behavior introduced in #16565 which (roughly) tries
to match an import like `import a.b.c` to an actual directory path
`a/b/c` in order to label it as first-party, rather than simply looking
for a directory `a`.

Mainly this affects the sorting of imports in the presence of namespace
packages, but a few other rules are affected as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working isort Related to import sorting preview Related to preview mode features rule Implementing or modifying a lint rule

Projects

None yet

3 participants