[red-knot] Initial tests for instance attributes#15474
Conversation
This comment was marked as off-topic.
This comment was marked as off-topic.
3702bf9 to
462c6ab
Compare
462c6ab to
bcb7937
Compare
crates/red_knot_python_semantic/resources/mdtest/annotations/unsupported_type_qualifiers.md
Show resolved
Hide resolved
85cc979 to
92e5f00
Compare
92e5f00 to
a7cd7e2
Compare
|
|
||
| #### Variable declared in class body and defined in unrelated method | ||
|
|
||
| We also recognize pure instance variables if they are defined in a method that is not `__init__`. |
There was a problem hiding this comment.
Not saying we shouldn't but this will require traversing the entire class body. This could be somewhat expensive (at least it's more expensive than only doing so for __init__).
- Does that include accesses inside of nested classes, functions or lambdas?
- Does this include accesses where the variable isn't self
class C:
pure_instance_variable: str
def test(self, other: Self, value: str):
other.pure_instance_variable = valueThere was a problem hiding this comment.
Mypy and pyright both support this, so I think we have to, expensive though it may be. Users will expect their type checker to support this and will complain if it doesn't. It's also unfortunately reasonably common in Python to do things like
class Foo:
def __init__(self):
self.initialize_x_variable()
def initialize_x_variable(self):
# many lines of complicated logic
self.x = 42
Does that include accesses inside of nested classes, functions or lambdas?
Does this include accesses where the variable isn't self
I think we probably don't need to support these, certainly not initially
There was a problem hiding this comment.
We can also experiment with varying degrees of laziness here. For first-party code that we're actually checking, I think we'll probably need to know all instance attributes of every class. But for third-party/stdlib code that our first-party code is interacting with, it often probably isn't necessary to know what instance attributes the class has and, even if it is, it might not be necessary to materialize the full set of instance attributes the class has. For something like the following, we could plausibly short-circuit if we find the foo attribute defined in __init__ or __new__, and not bother analyzing the other methods the SomeClass defines:
from third_party import SomeClass
x = SomeClass()
print(x.foo + 5) # all we need to know here is whether `SomeClass` has a `foo` attribute
# and what its type is
AlexWaygood
left a comment
There was a problem hiding this comment.
not a full review yet, but gotta go for lunch!
|
|
||
| #### Variable declared in class body and defined in unrelated method | ||
|
|
||
| We also recognize pure instance variables if they are defined in a method that is not `__init__`. |
There was a problem hiding this comment.
Mypy and pyright both support this, so I think we have to, expensive though it may be. Users will expect their type checker to support this and will complain if it doesn't. It's also unfortunately reasonably common in Python to do things like
class Foo:
def __init__(self):
self.initialize_x_variable()
def initialize_x_variable(self):
# many lines of complicated logic
self.x = 42
Does that include accesses inside of nested classes, functions or lambdas?
Does this include accesses where the variable isn't self
I think we probably don't need to support these, certainly not initially
AlexWaygood
left a comment
There was a problem hiding this comment.
Great, this is fantastic! Good call on leaving out Final for now.
Two notable things that you haven't covered are that we need to infer an instance variable for any declaration in the class body, even if there is no binding for the variable in any method at all. We could have a disabled-by-default rule that warns if you try to access the variable when it's not bound in any method, but pyright's experience with such a rule has been very mixed; it's hard to get it right without lots of false positives:
class Foo:
x: intand we need to infer an instance variable for a variable that is defined in a non-__init__ variable but not declared in a class body:
class Foo:
def __init__(self):
self.initialize_x()
def initialize_x(self):
self.x = 42 # TODO: should the attribute have `Unknown`, `int`, `Literal[42]` or `Unknown | Literal[42]`
# from the perspective of code from other scopes?
AlexWaygood
left a comment
There was a problem hiding this comment.
Looks great -- though I'd still personally add the cases I mentioned in #15474 (review) prior to landing
|
Depending on how exhaustive you're looking to be for class Foo:
def __init__(self):
self.x: ClassVar[str] = "x" |
I was planning to, you are too fast Alex 😄. I think those should be covered now. I actually changed a previous test to cover the "not declared in body, only bound in non- |
I'll add it to the list in the PR description. |
|
|
||
| c_instance = C(1) | ||
|
|
||
| # TODO: should be `Literal["value set in __init__"]` (or `str` which would probably be more generally useful) |
There was a problem hiding this comment.
This is an interesting question. Both pyright and mypy have heuristics to widen literal types when they think it's "probably more useful." This seems a bit unprincipled to me, but I'm open to the possibility that we'll also need to do it. I guess my question is, if the only assignment to self.pure_instance_variable1 that occurs anywhere in the class is self.pure_instance_variable1 = "value set in __init__", then why wouldn't you want it typed as Literal["value set in __init__"]? Just to accommodate external code setting it to some other value? That seems like an edge case, and not too much to ask you to add an explicit : str if that's what you wanted.
There was a problem hiding this comment.
if the only assignment to
self.pure_instance_variable1that occurs anywhere in the class isself.pure_instance_variable1 = "value set in __init__", then why wouldn't you want it typed asLiteral["value set in __init__"]? Just to accommodate external code setting it to some other value?
Subclasses also won't be able to assign a type to the attribute unless the type is assignable to the Literal[...] type
That seems like an edge case
Hmm, I don't agree. I think it's quite common for code to only assign an instance variable in the __init__ method of the class, but with the intention that it should be possible to reassign the attribute from other scopes.
and not too much to ask you to add an explicit
: strif that's what you wanted.
Surely that violates the gradual guarantee? And this doesn't seem to tackle at all the problem of typed code interacting with untyped third-party code, which may be unwilling to add type annotations in a timely manner?
I'm leaning towards either Unknown | Literal[...] or Unknown | str for cases like these. Haven't made my mind up about which is preferable there (haven't thought too deeply about it yet!)
There was a problem hiding this comment.
Surely that violates the gradual guarantee
Yes, but so does widening the literal type to str. That's what I mean by "unprincipled" -- it doesn't fundamentally change anything, just arbitrarily chooses some wider type because we think it's "probably what you meant."
I like the idea of a union with Unknown for inferred attribute types! I think that is the principled gradual-guarantee approach here. It will be more forgiving than what people are used to, but I generally like it if we take the gradual guarantee more seriously than existing type checkers: if you want more strictness, annotate.
(I don't see why we would special-case converting it to Unknown | str instead of Unknown | Literal[...] if we're unioning with Unknown -- I think the union already takes care of the problematic aspects of the literal inference, by allowing you to assign anything.)
| # mypy and pyright do not show an error here. | ||
| reveal_type(c_instance.pure_instance_variable5) # revealed: @Todo(instance attributes) | ||
|
|
||
| c_instance.pure_instance_variable1 = "value set on instance" |
There was a problem hiding this comment.
Depending on how we answer the above question, arguably this should be an error
| # TODO: should ideally be `Literal["value set on instance"]` | ||
| # (due to earlier assignment of the attribute from the global scope) | ||
| reveal_type(c_instance.pure_instance_variable1) # revealed: @Todo(instance attributes) |
There was a problem hiding this comment.
I think there is some question about the extent to which we'll want to do this narrowing. It's unsound in general (because we can't really say with any certainty what happens to c_instance between that assignment and this check), and particularly likely to be unsound in global scope. But we'll probably need to do some of it.
There was a problem hiding this comment.
We've had this conversation several times now 😆 I still favour the more pragmatic behaviour of mypy and pyright over the stricter behaviour of pyre here. But yeah, we can debate the details at a later stage
|
|
||
| # for a more realistic example, let's actually call the method | ||
| c_instance.set_instance_variable() |
There was a problem hiding this comment.
Probably the even more realistic example would have this method called from __init__, but if we do that in the test it raises more questions about whether we're trying to detect that call...
## Summary Some clarifications in the instance-attributes tests, mostly regarding type inference behavior following this discussion: #15474 (comment)
Summary
Adds some initial tests for class and instance attributes, mostly to document (and discuss) what we want to support eventually. These tests are not exhaustive yet. The idea is to specify the coarse-grained behavior first.
Things that we'll eventually want to test:
Finalin addition toClassVarClassVar, like making sure that we support things likex: Annotated[ClassVar[int], "metadata"]__new__in addition to the tests for__init__@propertyTest Plan
New Markdown tests