Description
Currently, inverse indices are always(?) a 1-D array. This means an input array with ndim != 1 cannot be reconstructed just by indexing the unique values with the inverse indices. E.g. in the docs we have the following reconstruction example
>>> a = np.array([1, 2, 6, 4, 2, 3, 2])
>>> u, indices = np.unique(a, return_inverse=True)
>>> u[indices]
array([1, 2, 6, 4, 2, 3, 2])
but we can't do the same for 0-D or 2+ dimensional arrays
>>> a = np.array([[1, 2, 6], [2, 3, 2]])
>>> u, indices = np.unique(a, return_inverse=True)
>>> u[indices]
array([1, 2, 6, 2, 3, 2])  # shape is not 2-D
Right now you need to use reshape() to actually reconstruct the original array
>>> u[indices.reshape(a.shape)]
array([[1, 2, 6],
       [2, 3, 2]])
So IMO it would be a nice usability change for these inverse indices to share the input array's shape, so you can just do
>>> u[indices]
array([[1, 2, 6],
       [2, 3, 2]])
Another benefit is that this would fix a current bug in array_api.unique_inverse/array_api.unique_all, as the Array API spec specifies that inverse indices should share the shape of the input array. Admittedly that can easily be solved with an internal reshape. cc @asmeurer
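For illustration, here is a minimal sketch of what such an internal reshape could look like. The wrapper name `unique_with_shaped_inverse` is hypothetical and not part of NumPy or array_api; it only uses `np.unique(..., return_inverse=True)` and `reshape`, and it assumes the flattened (`axis=None`) case discussed here.

```python
import numpy as np

def unique_with_shaped_inverse(a):
    """Hypothetical helper (not a NumPy API): return the unique values and
    inverse indices that have the same shape as the input array."""
    a = np.asarray(a)
    u, inverse = np.unique(a, return_inverse=True)
    # Harmless if the inverse already has the input's shape.
    return u, inverse.reshape(a.shape)

a = np.array([[1, 2, 6], [2, 3, 2]])
u, indices = unique_with_shaped_inverse(a)

# The inverse indices now reconstruct the input directly, no extra reshape.
assert indices.shape == a.shape
assert np.array_equal(u[indices], a)
```

The same wrapper should also cover the 0-D case, since reshaping a single-element inverse to `()` is valid.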
I wonder if there's some reasoning I'm missing for always returning 1-D arrays. Maybe integer indexing came after return_inverse was added. And whilst the shape of the returned inverse indices is not documented, I wonder if there are unintended consequences to changing this behaviour for downstream libraries.