Skip to content

Conversation

@seberg
Copy link
Member

@seberg seberg commented Sep 12, 2019

The current commit uses ... (Ellipsis) instead of a leave_wrapped singleton, which is short to write, but other than
that does not have much reason going for. So we should probably
change that.

xref gh-13105

@seberg seberg added 01 - Enhancement 54 - Needs decision 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. labels Sep 12, 2019
@seberg seberg force-pushed the ufunc-no-scalar-unpacking branch from 86dc281 to 445c31d Compare September 12, 2019 20:31
@eric-wieser
Copy link
Member

Would be great if we could add this for reductions too - we don't want to end up in a limbo where leave_wrapped / Ellipsis is accepted by some out arguments but not others.

@seberg
Copy link
Member Author

seberg commented Sep 13, 2019

Well, I am still not sure if I don't think that arr.sum() should return a scalar, while array_1d.sum(1) should not. I realize that most users will probably be confused by that. Although, that doesn't mean our singleton cannot mean keep/make array here even when we might want to change the normal behaviour at some point.

The change itself is going to be easy though, I think.

@eric-wieser
Copy link
Member

Although, that doesn't mean our singleton cannot mean keep/make array here even when we might want to change the normal behaviour at some point.

I'm not suggesting we change the default any time soon, just that we give users the option to leave things wrapped for reductions too.

@seberg
Copy link
Member Author

seberg commented Sep 13, 2019

Yeah, I was considering for a moment if ... should only ensure an array if axis!=None. But probably there is no point in that, since the main point of this is to make things easily predictable, and that is simplest if an array is guaranteed.

@seberg
Copy link
Member Author

seberg commented Sep 13, 2019

Hmmmpf, there is an issue. Implementors of __array_wrap__ (mainly/only memmap maybe), may create a scalar. The context does contain the Ellipsis, so subclasses could fix it up. But that is currently not true for reductions (which do not give a context).

The other option would be to guarantee an array (EDIT: read np.ndarray type) return (unless there is an override with __array_ufunc__.

@seberg
Copy link
Member Author

seberg commented Sep 13, 2019

@mhvk since I migrated a bit from the issue. just in case you are interested. Reduction is not an issue of course. The main issue is that __array_wrap__ of subclasses may still unpack the scalar, so the question would be if we should just not wrap at all...

@seberg seberg force-pushed the ufunc-no-scalar-unpacking branch from d490a16 to 2d98a62 Compare September 13, 2019 20:27
@eric-wieser
Copy link
Member

The main issue is that array_wrap of subclasses may still unpack the scalar,

I think if this happens, we should just throw an exception - since __array_wrap__ is only called on array subclasses, it should be sufficient to check if the return value is still an ndarray subclass - if not, then the __array_wrap__ didn't comply with the requested out argument - something like raise NotImplementedError("Subclass does not appear to support `out=leave_wrapped`")

@hameerabbasi
Copy link
Contributor

I'd like to add this to triage review.

@seberg seberg added the triage review Issue/PR to be discussed at the next triage meeting label Feb 26, 2020
@seberg
Copy link
Member Author

seberg commented Feb 26, 2020

We could also use np.ndarray as "singleton" and enforce NumPy array output (no subclasses)?

@mattip mattip added triaged Issue/PR that was discussed in a triage meeting and removed 54 - Needs decision triage review Issue/PR to be discussed at the next triage meeting labels Feb 26, 2020
@eric-wieser
Copy link
Member

Now that's an interesting idea. I almost wonder if we want to allow passing any array subclass in there.

@seberg
Copy link
Member Author

seberg commented Feb 26, 2020

@eric-wieser well, we could initially only support ndarray. The thing I think that is missing is to figure out where we want to use this right now. I think it may be that almost(?) all places where we would be using this internally, we already make sure we are wokring with arrays.

But yes, I like the idea as such to use the type... The issue is that we have no control over what subclass.__array_wrap__ does, and if someone would pass in subclass we want to prevent getting a scalar back from __array_wrap__, which is tricky.

@mhvk
Copy link
Contributor

mhvk commented Feb 27, 2020

Passing in np.ndarray sounds OK. Right now, all it would do is prevent the call that turns array scalars into arrays, so adding other subclasses would introduce a bit of extra code, but I can see that it might be useful inside __array_ufunc__ implementations (replace any empty output with the class you want and no need for a .view after), though I'm not sure it helps clarity much. Maybe best to keep that for possible later follow-up?

@mhvk
Copy link
Contributor

mhvk commented Jun 17, 2020

np.ma.array we can of course adjust, but I see your point, that if we start using this internally in numpy functions, then we may break subclasses which wrap, explicitly cast to scalar, and now some numpy functions do not work with them any more. So, that suggests option (2), of passing on a context, raising a DeprecationWarning if __array_wrap__ returns a scalar, and either let things possibly fail or temporarily recast any numpy scalars returned by __array_wrap__ to an ndarray internally.

In this respect, perhaps the original Ellipses was the better choice after all, since then one can describe the functionaility as out=... meaning that the code will behave as if as if the call is replaced by ufunc.method(*args, **kwargs)[...]. The fact that internally it means that in a case where we exactly know what the answer will be and thus not do the cast to a scalar is then an implementation detail. And with wrapping, one could exactly do what is described. Of course, that could still cause failures, but perhaps less likely (and if the context is passed on, fairly easy to solve).

@eric-wieser
Copy link
Member

eric-wieser commented Jun 17, 2020

And with wrapping, one could exactly do what is described

I think the point you're missing is that __array_wrap__ is not passed np.ndarray or ... - it's passed the already-allocated and computed out array - so there's no way to know whether the caller was asking for array-preservation. Without augmenting the argument set to __array_wrap__, it is impossible for a subclass to behave.

@mhvk
Copy link
Contributor

mhvk commented Jun 17, 2020

Hmm, indeed. So I guess your solution makes more sense. I do wonder whether instead of creating a new keyword argument for what is a niche case instead always pass on a context, also for ufunc reductions, and include out in there. (Sorry if this was discussed; I read through the thread but may have missed it)

@seberg
Copy link
Member Author

seberg commented Jun 18, 2020

@eric-wieser triage marking is nice, but I need a bit more head-start to bring it up (and it would be best if you are around too). I still have to get to wrapping my head around the options, I guess the only way to support array-wrap is to add the argument. Which was why I think the PR basically used np.ndarray to indicate that subclasses are never wrapped, and why I agree that ... may be better if you indicate that subclasses (and maybe for __array_ufunc__ any array-like) is OK, but no scalar.

@seberg seberg removed the triage review Issue/PR to be discussed at the next triage meeting label Aug 25, 2020
@mattip
Copy link
Member

mattip commented Nov 12, 2020

@seberg is this meant to be part of the ufunc reworking for 1.20 or do we let it slide to a future release?

@seberg
Copy link
Member Author

seberg commented Nov 12, 2020

Let it slide, it might create ufunc rework merge conflicts, but is an unrelated feature. Should look at it and see if I can figure out how subclasses should be dealt with though :/.

Base automatically changed from master to main March 4, 2021 02:04
@charris
Copy link
Member

charris commented Apr 12, 2021

@seberg It looks like your recent work leads to a conflict.

@charris
Copy link
Member

charris commented Apr 21, 2021

@seberg Needs rebase.

@charris
Copy link
Member

charris commented Feb 20, 2023

@seberg Still interested in this? Needs rebase.

@mhvk
Copy link
Contributor

mhvk commented Feb 20, 2023

It sounds like there is some serious thought about a 2.0. Maybe in that case we can just start to return array scalars instead of proper scalars? I.e., start to omit the cast to scalar? Apart from being mutable, the difference seems rather small.

@mhvk
Copy link
Contributor

mhvk commented Apr 15, 2023

Does numpy 2.0 include deprecating scalars in favour of array scalars? If so, this becomes somewhat moot, but I didn't see it on the roadmap

@seberg
Copy link
Member Author

seberg commented Apr 16, 2023

I don't want to aim for it. For me there are still quite a lot of unknowns and probably some exploration needed about how feasible it is.
(And yes, I tend to think that removing scalars everywhere may well leave a weird gap also, but then maybe I am just not bold enough about forcing change. I am sure there are things that would be great to change around scalars, I do dislike most things about them.)

@mhvk
Copy link
Contributor

mhvk commented Apr 16, 2023

In that case, it is perhaps worth revising this PR?

@seberg
Copy link
Member Author

seberg commented Apr 17, 2023

Yes, likely. Still haven't had the spark of the original idea how to deal with subclasses. We could also abuse subok=True instead of out=.

An alternative thing that would be a big change, but not as big as removing scalars one, would be this: https://github.com/seberg/numpy/pull/new/try-0d-preservation that I tried to explore a bit today.

The proposal would be to:

  • Ensure that 0-d arrays are preserved in ufuncs, np.add(0d_array, 2) will return a 0d array. (And all other functions, this only has ufuncs right now!)
  • Reductions would return scalars if axis=None.
  • We may or may not need a (internal) helper arr, wrap = asarray_and_wrap(obj) that returns a function to do the "decaying".

(The tail of that is not super trivial, quite a few test failures, scipy also relies on scalars in a few places for example, but it looks harmless). But run that branch with:

NPY_NUMPY_2_BEHAVIOR=1

to try. A fun little caveat: np.matmul(vector, vector) should be a scalar so is a bit hackish still using the old behavior due to that.

@mhvk
Copy link
Contributor

mhvk commented Oct 8, 2023

This would still be nice to have, though I now wonder if numpy 2.0 isn't a good opportunity to just change it -- really, if one gives arrays as input, one should get arrays as output, even if the arrays happen to have shape ()... It will simplify quite a bit of code...

@InessaPawson InessaPawson removed the triaged Issue/PR that was discussed in a triage meeting label Apr 8, 2024
@mattip
Copy link
Member

mattip commented Sep 4, 2024

We discussed this in a triage meeting and decided to close this but leave the issue open, maybe we will revisit for 3.0

@mattip mattip closed this Sep 4, 2024
@seberg
Copy link
Member Author

seberg commented Sep 5, 2024

While I don't mind closing this, it has nothing to do with a 3.0 release. That would b changing the behavior, which I think we can and should also pull off.

seberg added a commit to seberg/numpy that referenced this pull request Mar 24, 2025
As of now a WIP, because it would be nice to see what `_methods.py`.

The API change is to allow `out=...`, for multi-argument outputs it
is currently `out=(..., ...)`, although I guess we should allow
short-handing that (nobody needs to pick it per argument).

There is some discussion in numpygh-14489 about this.  I still think `...`
is the best, because `out=np.ndarray` doesn't generalize for
`__array_ufunc__`, and yes, I like `...` because it also indicates an
array return in indexing.  (`arr[i, ...]` always returns an array.)

---

The ufunc part works (not sure if 100% complete), but the `_methods.py`
changes do not (and change behavior subtlely, if possibly correctly).
The actual change is trivial (after I had refactored this a while ago).

I think we should do this, but the fact that the `_methods.py` changes
are not trivial makes me think there may be some idea missing.

It may be that the solution is for these functions to use conversion
helper once at the start and then convert to scalars based on that at the
end though (rather than dragging around subclasses).
mhvk pushed a commit to seberg/numpy that referenced this pull request Mar 26, 2025
As of now a WIP, because it would be nice to see what `_methods.py`.

The API change is to allow `out=...`, for multi-argument outputs it
is currently `out=(..., ...)`, although I guess we should allow
short-handing that (nobody needs to pick it per argument).

There is some discussion in numpygh-14489 about this.  I still think `...`
is the best, because `out=np.ndarray` doesn't generalize for
`__array_ufunc__`, and yes, I like `...` because it also indicates an
array return in indexing.  (`arr[i, ...]` always returns an array.)

---

The ufunc part works (not sure if 100% complete), but the `_methods.py`
changes do not (and change behavior subtlely, if possibly correctly).
The actual change is trivial (after I had refactored this a while ago).

I think we should do this, but the fact that the `_methods.py` changes
are not trivial makes me think there may be some idea missing.

It may be that the solution is for these functions to use conversion
helper once at the start and then convert to scalars based on that at the
end though (rather than dragging around subclasses).
seberg added a commit that referenced this pull request Apr 2, 2025
* API,ENH: Allow forcing an array result in ufuncs

This allows using `out=...` for ufuncs (including reductions) to enforce an array result.
With it, the result will never be converted to a scalar, but subclasses are preserved
normally.
(A subclass might still misbehave and convert to a scalar anyway, but it would be a bug there.)

There is some discussion in gh-14489 about this.  I still think `...`
is the best, because `out=np.ndarray` doesn't generalize for
`__array_ufunc__`, and yes, I like `...` because it also indicates an
array return in indexing.  (`arr[i, ...]` always returns an array.)

It uses this in a few places, although there are many more, e.g. `_methods.py`
should probably use it, but it is very subtle there.
(Also because some paths are probably subtly wrong for object dtype.)

---------

Signed-off-by: Sebastian Berg <[email protected]>
Co-authored-by: Marten H. van Kerkwijk <[email protected]>
Co-authored-by: Nathan Goldbaum <[email protected]>
MaanasArora pushed a commit to MaanasArora/numpy that referenced this pull request Apr 11, 2025
* API,ENH: Allow forcing an array result in ufuncs

This allows using `out=...` for ufuncs (including reductions) to enforce an array result.
With it, the result will never be converted to a scalar, but subclasses are preserved
normally.
(A subclass might still misbehave and convert to a scalar anyway, but it would be a bug there.)

There is some discussion in numpygh-14489 about this.  I still think `...`
is the best, because `out=np.ndarray` doesn't generalize for
`__array_ufunc__`, and yes, I like `...` because it also indicates an
array return in indexing.  (`arr[i, ...]` always returns an array.)

It uses this in a few places, although there are many more, e.g. `_methods.py`
should probably use it, but it is very subtle there.
(Also because some paths are probably subtly wrong for object dtype.)

---------

Signed-off-by: Sebastian Berg <[email protected]>
Co-authored-by: Marten H. van Kerkwijk <[email protected]>
Co-authored-by: Nathan Goldbaum <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

01 - Enhancement 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants