Skip to content

Proper handling of __array_*__ attributes/methods #924

@jthielen

Description

@jthielen

After #905 is merged, the following NumPy array protocol attributes/methods will be explicitly handed by Pint Quantities:

  • __array_priority__
  • __array_ufunc__
  • __array_function__

All others (any starting with __array_) are implicitly handled by

  1. Warning with a UnitStrippedWarning
  2. If magnitude is an ndarray, return the corresponding attribute on the magnitude
  3. If the magnitude is not an ndarray, return the corresponding attribute on the magnitude converted to an ndarray.

Before #905, step 3 above was extremely problematic, since the conversion was done in place (see #399, #481, #509). While now it's a bit better because it should no longer mutate the Quantity, the conversion to ndarray may not respect the behavior of arbitrary duck arrays (see #845 / #878), and may return misleading values (such as may be the case with __array_struct__ and __array_interface__ as brought up in #905).

Being uncomfortable with this implicit handing, and seeing the issues it has caused, I would suggest that Pint Quantities instead only (explicitly) respond to:

and raise AttributeError for all other __array_*__ attributes. This would bring Pint into alignment with the recommendations set by NumPy for custom array containers (see https://docs.scipy.org/doc/numpy-1.17.0/user/basics.dispatch.html) and, to the best I can tell, the example set by Dask and Sparse (and to a certain extent CuPy).

However, while it seems to be the "more correct" implementation, given that this would be removing a fallback, it would be a major breaking change with potentially unforeseen consequences. And so, I wanted to reach out for feedback before I try putting together a follow-up PR to #905 to address these issues and work towards resolving #845 / #878.

If this is an acceptable change, would it make sense to include a deprecation cycle, by leaving in the __array_*__ fallback with a DeprecationWarning for Pint v0.10, and then changing to AttributeError in v0.11? Or is there a consensus to leave the current behavior in place and try working around the issues that arise?


For reference, for the three duck array packages pointed to in NEP 30 (Dask, CuPy, and Sparse), here is a brief summary of their behavior with these attributes from what I could scour from their source code (no guarantee on correctness though, as I haven't worked with any of their internals):

Dask and Sparse

  • Explicitly defines
    • __array__
    • __array_priority__
    • __array_ufunc__
    • __array_function__
  • Does not respond to others

CuPy

Metadata

Metadata

Assignees

No one assigned

    Labels

    numpyNumpy related bug/enhancement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions