Description
A broad discussion on how handling of the `__array_function__` mixin should work started in #4567. I suggest it continue here.
Some highlights:
@shoyer said:
The `__array_function__` protocol guarantees that it will only get called if `dask.array.Array` is in types. So if that's all you wanted to check, you could drop this entirely.

The problem is that you want to support operations involving some but not all other array types. For example, `xarray.DataArray` wraps dask arrays, not the other way around. So `dask.Array.__array_function__` should return `NotImplemented` in that case, leaving it up to `xarray.DataArray.__array_function__` to define the operation.

We basically need some protocol or registration system to mark arrays as "can be coerced into dask array", e.g., either
`sparse.Array.__dask_chunk_compatibile__ = True`
or
`dask.array.register_chunk_type(sparse.Array)`
@hameerabbasi said:
I guess Dask could safely coerce anything that implements `InMemory` and `NumpyDuckArray` mixins?

I was thinking of three classes of (NumPy-specific) mixins, aimed at implementing what I see as the three major protocols:
- `NumPyUfuncMixin`
- `NumPyIndexingMixin` (for NumPy-like indexing, so `XArray` and `pandas` would skip this).
- `NumPyArrayFunctionMixin` (for implementing array methods via `__array_function__`, such as `sum` and `mean`).

I see the use of your `InMemory` mixin, but I would rename it to `InCore`, as GPU computations don't take place in main memory. :)

For now, these would be provisional, just like `__array_function__`. The hope is that these would be adopted and specific checking code would go away.
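A rough sketch of how such mixins might be used as capability markers follows. The class names come from the comment above, but their bodies are empty placeholders, and `DenseLike`, `LabelledLike` and `can_coerce` are hypothetical names invented here. Requiring all four mixins is only one possible reading of "implements `InMemory` and `NumpyDuckArray`".

```python
class NumPyUfuncMixin:
    """Marker: the type supports NumPy ufuncs (placeholder body)."""


class NumPyIndexingMixin:
    """Marker: the type supports NumPy-like indexing (placeholder body)."""


class NumPyArrayFunctionMixin:
    """Marker: the type implements array methods via __array_function__."""


class InCore:
    """Marker: the data is held locally (host or GPU memory), not lazily or remotely."""


class DenseLike(NumPyUfuncMixin, NumPyIndexingMixin, NumPyArrayFunctionMixin, InCore):
    """Hypothetical duck array that opts in to every protocol."""


class LabelledLike(NumPyUfuncMixin, NumPyArrayFunctionMixin, InCore):
    """Hypothetical labelled array that skips NumPy-like indexing."""


# Assumption: a wrapper such as dask would insist on all four capabilities.
REQUIRED = (NumPyUfuncMixin, NumPyIndexingMixin, NumPyArrayFunctionMixin, InCore)


def can_coerce(obj):
    """Return True if obj advertises every required capability."""
    return all(isinstance(obj, mixin) for mixin in REQUIRED)


dense_ok = can_coerce(DenseLike())
labelled_ok = can_coerce(LabelledLike())
```

The point of the design is that the check is a plain `isinstance` test against shared base classes, so a wrapper needs no per-library special cases.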
Another @hameerabbasi comment:
Then how about a "numpy/pydata community repo" containing these mixins (not in the main NumPy codebase)? I would be happy to maintain one. The reason I like the idea of mixins is that they allow "duck-array"-like objects to easily implement NumPy-like functionality without too much effort.
We can also have protocols:
    class DuckArray:
        __array_indexing__ = True
        __array_in_core__ = True
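For comparison with the mixin approach, a consumer of attribute-based protocols could check the flags with `getattr` instead of `isinstance`. This is a sketch only: `has_protocol` and `LabelledArray` are hypothetical names, and the two flag names are taken from the comment above rather than from any released library.

```python
class DuckArray:
    # Flags proposed in the comment above; no library is known to read them.
    __array_indexing__ = True
    __array_in_core__ = True


class LabelledArray:
    """Hypothetical type that is in-core but does not offer NumPy-like indexing."""
    __array_in_core__ = True


def has_protocol(obj, name):
    """Return True only if the object's type sets the flag `name` to True."""
    # Look on the type so that instance attributes cannot spoof the flag.
    return getattr(type(obj), name, False) is True


duck_indexing = has_protocol(DuckArray(), "__array_indexing__")
labelled_indexing = has_protocol(LabelledArray(), "__array_indexing__")
labelled_in_core = has_protocol(LabelledArray(), "__array_in_core__")
```

Unlike mixins, this needs no shared base class to import, at the cost of relying on agreed attribute names.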