-
-
Notifications
You must be signed in to change notification settings - Fork 11.9k
TYP: type ma.min #26624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: type ma.min #26624
Conversation
|
I'd appreciate a little help here. This is my first time reading numpy's typing system.
I'm quite confident that once I got the resources I need I can keep typing the other functions by myself |
|
Any updates on this? |
Jacob-Stevens-Haas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm working on #26081 and while not a numpy dev, I can hopefully ask and answer some questions.
np.aminhas three overloads to tell the type checker three different cases- with no axis argument,
aminwill return the argmin of the entire array as a python (not numpy) object, so tell the type checker that. - With an axis argument and no
out, it's impossible to tell what_ArrayLikeobjectais, so tell type checkerAny - With an argument
out, we know that the returned variable will beout, so the type checker knows the return type matchesout
These are probably also relevant for masked arrays.
- with no axis argument,
- I don't think you need more documentation, because there's no new types, other than an entry in doc/release/upcoming_changes/26624.improvement.rst.
- Yes, most likely. A file numpy/typing/tests/data/pass/masked.py and likewise .../fail/masked.py that will fill up with type checks as #26404 is implemented. This is a good place to test-drive assumptions you might have about how your type annotations work as you develop the PR (e.g. the comment on
axisandobjreturn types)
numpy/ma/core.pyi
Outdated
| axis: None | _ShapeLike = ..., | ||
| out: None = ..., | ||
| fill_value: None | _ScalarLike_co = ..., | ||
| keepdims: None | bool = ..., | ||
| ) -> _SCT: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if axis is _ShapeLike and not None, the return type will be the same type as obj, not _SCT.
|
@Jacob-Stevens-Haas thank you for the feedback. I'll be working on this next week. |
|
The test failure can be ignored. |
|
Need rebase. |
|
Rebased and upated with @Jacob-Stevens-Haas hints. Also added passing typing tests. I couldn't understand how to make failing tests. When calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is great, a few comments about tests (with a grain of salt - I'm learning too):
I couldn't understand how to make failing tests.
Fail tests are written with arguments that fail type checking, with a comment afterwards for the expected type checking error (mostly either "no overload variant" or "argument has incompatible type" e.g.
numpy/numpy/typing/tests/data/fail/fromnumeric.pyi
Lines 126 to 130 in d52816a
| np.amin(a, axis=1.0) # E: No overload variant | |
| np.amin(a, keepdims=1.0) # E: No overload variant | |
| np.amin(a, out=1.0) # E: No overload variant | |
| np.amin(a, initial=[1.0]) # E: No overload variant | |
| np.amin(a, where=[1.0]) # E: incompatible type |
Edit: But thinking about it more, there's no illegal combination I can think of worth testing. E.g. the overloads allow np.min(1.0) and np.min(arr, axis=0). If they prevented something like np.ma.min(1.0, axis=0), that would be a good fail test. I don't think there are any such holes in the overloads for this PR though, so fail tests may not be necessary
There's also reveal tests that verify the type that gets returned with assert_type. My mental comparison is that pass/fail tests check arguments, reveal tests check return value. In this case, it might be helpful to check the returned type is not Any in overloads 1 and 3.
When calling spin mypy somehow it kept throwing me old errors even after deleting the test file.
Did you try running $ git clean -xdf?
numpy/typing/tests/data/pass/ma.py
Outdated
| m : np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.masked_array([1.5, 2, 3], mask=[True, False, True]) | ||
| m: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.masked_array([1.5, 2, 3], mask=[True, False, True]) | ||
|
|
||
| A = np.array(True, ndmin=2, dtype=bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm no expert on which tests to write for pass/, but this may be more than is required? IIUC, the PR doesn't make a distinction between bool, np.uint8, np.float32, np.uint8, so does the distinction between A/B and C/D really test anything? (happy to change my mind though!)
numpy/typing/tests/data/pass/ma.py
Outdated
| A.setflags(write=False) | ||
| B.setflags(write=False) | ||
| C.setflags(write=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that setflags is called in fromnumeric.py, but does it do anything important for the test?
| np.ma.min(A, axis=0) | ||
| np.ma.min(B, axis=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, axis being None, an int, or _ShapeLike does make a difference. It might make sense to also run this with axis=(0,)
|
@Jacob-Stevens-Haas I updated the PR with your suggestions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this! Can you add a reveal test to check the return values? I.e. numpy/typing/tests/data/reveal/ma.pyi like:
import numpy as np
arr: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.masked_array([[1.0, 2.0]], mask=[True, False])
result = np.ma.min(arr, axis=None)
assert_type(result, np.float64)
tgt: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.empty(shape=())
result = np.ma.min(arr, axis=None, out=tgt)
assert_type(result, np.ma.MaskedArray)This verifies that the result type changes with the out parameter. Since reveal tests even need to be executable, so you can remove the assignment for readability.
Co-Authored-By: Jakob Stevens Haas <[email protected]>
Co-Authored-By: Jakob Stevens Haas <[email protected]>
|
Both my I pushed this test and it failed. I'll try to make |
|
Hmm, this is curious.
In other words, The long and short of this is that you can probably include |
jorenham
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
I left a couple of minor suggestions, but in general, I like the way this looks 🙁.
| def min(obj, axis=..., out=..., fill_value=..., keepdims=...): ... | ||
| @overload | ||
| def min( | ||
| obj: MaskedArray[_ShapeLike, _SCT] | _ArrayLike[_SCT], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shape type is bound to tuple[int, ...], to which _ShapeLike isn't assignable (i.e. it's a broader type).
Additionally, the second type parameter of MaskedArray is bound to np.dtype, but _SCT is bound to np.generic.
Luckily this is easy to fix in this case, because MaskedArray is assignable to _ArrayLike (for a specific scalar type _SCT). So that means we simply write it as:
| obj: MaskedArray[_ShapeLike, _SCT] | _ArrayLike[_SCT], | |
| obj: _ArrayLike[_SCT], |
| axis: None = ..., | ||
| out: None = ..., | ||
| fill_value: None | _ScalarLike_co = ..., | ||
| keepdims: Literal[False] = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was discussed earlier, but keepdims uses a default sentinel, and it's impossible to type it.
However, that's not a problem at all. Because from a typing perspective, not passing keepdims is indistinguishable from keepdims=False. So we can just pretend that Literal[False] is the default.
So in other words; this overload is perfectly fine as-is.
| out: None = ..., | ||
| fill_value: None | _ScalarLike_co = ..., | ||
| keepdims: Literal[False] = ..., | ||
| ) -> _SCT: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the array has an object_ as scalar-type, then this implies that an instance of np.object_ is returned. However, those don't exist. And because we don't know what the actual type is "within" that object_, the best we could do is to return an object, and let the user cast it to the specific type.
But having said that, I don't think that it's really a priority at the moment. So I personally wouldn't mind if you'd label the returned object_ as a "proxy type", and keep it the way it is 🤷🏻.
Plus, a little bird told me that object_ might become a generic type in the near future 😏.
|
|
||
| @overload | ||
| def min( | ||
| obj: MaskedArray[_ShapeLike, _SCT] | _ArrayLike[_SCT], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| obj: MaskedArray[_ShapeLike, _SCT] | _ArrayLike[_SCT], | |
| obj: _ArrayLike[_SCT], |
| out: None = ..., | ||
| fill_value: None | _ScalarLike_co = ..., | ||
| keepdims: None | bool = ..., | ||
| ) -> Any: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this is in line with the current annotations of numpy.min, but we could also narrow this down do _SCT | MaskedArray[tuple[int, ...], np.dtype[_SCT]] while we're at it 🤷🏻.
Like I mentioned before, this technically this isn't correct in the case of object_ (and StringDType as well, BTW). But because this is already an improvement in general, I think it's not too big of a problem if we just ignore this for now.
|
|
||
| from typing_extensions import assert_type | ||
|
|
||
| arr: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.masked_array([[1.0, 2.0]], mask=[True, False]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no need to do the actual function call within a .pyi stub, especially since the return type of np.ma.masked_array was already verified in the "pass" tests.
| arr: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.masked_array([[1.0, 2.0]], mask=[True, False]) | |
| arr: np.ma.MaskedArray[Any, np.dtype[np.float64]] |
| result = np.ma.min(arr, axis=None) | ||
| assert_type(result, np.float64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by inlining the result, you avoid changing the type result later on, which technically isn't valid (mypy allowing this is a bug IMHO).
| result = np.ma.min(arr, axis=None) | |
| assert_type(result, np.float64) | |
| assert_type(np.ma.min(arr, axis=None), np.float64) |
| tgt: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.empty(shape=()) | ||
| result = np.ma.min(arr, axis=None, out=tgt) | ||
| assert_type(result, np.ma.MaskedArray) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type of tgt is equivalent to that of arr, so there's no need for it (so it doesn't matter that at runtime this wouldn't make a lot of sense).
Additionally, np.ma.MaskedArray should have type arguments, as its type parameters have no defaults (PEP 696).
| tgt: np.ma.MaskedArray[Any, np.dtype[np.float64]] = np.ma.empty(shape=()) | |
| result = np.ma.min(arr, axis=None, out=tgt) | |
| assert_type(result, np.ma.MaskedArray) | |
| assert_type(np.ma.min(arr, axis=None, out=arr), np.ma.MaskedArray[Any, np.dtype[np.float64]]) |
| zeros_like as zeros_like, | ||
| angle as angle | ||
| ) | ||
| from numpy._typing import _ArrayLike, _ScalarLike_co, _ShapeLike, overload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably safer to import overload directly from typing instead.
|
|
||
| _ShapeType_co = TypeVar("_ShapeType_co", bound=tuple[int, ...], covariant=True) | ||
| _DType_co = TypeVar("_DType_co", bound=dtype[Any], covariant=True) | ||
| _ArrayType = TypeVar("_ArrayType", bound="MaskedArray[Any, Any]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for stringified annotations within .pyi stubs (and disallowed by e.g. flake8-pyi and several type-checkers)
|
I see that there's also a (tiny) merge conflict. |
|
This has been superseded by #28593, so I'm going to close this. |
Add static typing to
ma.min. Part of #26404.