Implement DataArray.to_dask_dataframe() #7409

@gcaria

Is your feature request related to a problem?

It would be nice to convert a chunked DataArray directly to a dask Series or DataFrame.

Describe the solution you'd like

I think something along these lines should work (although a less convoluted way might exist):

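from typing import Union

import dask.array as dka
import dask.dataframe as dkd
import xarray as xr


def to_dask(da: xr.DataArray) -> Union[dkd.Series, dkd.DataFrame]:
    # Only 1D and 2D arrays map onto a Series or DataFrame
    if da.data.ndim > 2:
        raise ValueError(f"Can only convert 1D and 2D DataArrays, found {da.data.ndim} dimensions")

    indexes = [da.get_index(dim) for dim in da.dims]
    # Chunk the first dimension's index like the data so the partitions line up
    darr_index = dka.from_array(indexes[0], chunks=da.data.chunks[0])
    # 1D: a single column named after the array; 2D: one column per label of the second dimension
    columns = [da.name] if da.data.ndim == 1 else indexes[1]
    ddf = dkd.from_dask_array(da.data, columns=columns)
    ddf[indexes[0].name] = darr_index
    # Use the first dimension as the index; squeeze() reduces a single column to a Series
    return ddf.set_index(indexes[0].name).squeeze()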
