Description
If an xray.Dataset consists of dask arrays, it should work like a first-class dask collection for integration with dask.imperative. That is, when you call a delayed function (constructed with dask.imperative.do) on an xray Dataset consisting of dask arrays, dask should merge the task graphs when it executes.
The logic in dask.imperative that does this currently checks if the object inherits from the dask collection base class:
https://github.com/blaze/dask/blob/7e9bff047894c2bc9539370898dc135508739d38/dask/imperative.py#L67-L72
Strictly speaking, xray objects aren't always dask collections (dask isn't even a required dependency), so some type of duck typing solution seems appropriate. What should this look like?
The simplest way to adapt existing code would be to check for dask, _keys, _optimize and _finalize attributes, but I would prefer more specific names for the private methods to make it clearer what they refer to. I'm happy with a .dask attribute (the presence of which alone might be enough to signal a dask collection), but for the others, maybe __dask_keys__, __dask_optimize__ and __dask_finalize__ would be appropriate names?
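To make the proposal concrete, here is a rough sketch of what a duck-typed check could look like. Everything in it is an assumption for illustration: `is_dask_collection_like` and `FakeDataset` are hypothetical names, the method signatures are guesses, and none of this is dask's or xray's actual code.

```python
# Hypothetical sketch of the duck-typing check proposed above.
# None of these names or signatures come from dask itself.

REQUIRED_ATTRS = ("dask", "__dask_keys__", "__dask_optimize__", "__dask_finalize__")


def is_dask_collection_like(obj):
    """Return True if obj exposes every attribute in the proposed protocol."""
    return all(hasattr(obj, name) for name in REQUIRED_ATTRS)


class FakeDataset:
    """Stand-in for an xray.Dataset backed by dask arrays (illustrative only)."""

    def __init__(self, graph):
        # A plain dict mapping keys to tasks, standing in for a dask graph.
        self.dask = graph

    def __dask_keys__(self):
        # Keys whose values make up the final result.
        return sorted(self.dask)

    @staticmethod
    def __dask_optimize__(graph, keys):
        # No-op optimization: return the graph unchanged.
        return graph

    @staticmethod
    def __dask_finalize__(results):
        # No-op finalization: return the computed results as-is.
        return results
```

With a check like this, dask.imperative would not need xray to inherit from the dask base class, and xray would not need to import dask just to define these methods.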