da.store for hdf file or to_hdf fails with distributed scheduler

h5py File objects cannot be serialized, so tasks can only share them under the threaded scheduler. da.to_hdf5 creates an h5py File and then calls store, so it fails under the distributed scheduler in the same way as calling da.store directly on a dataset in an HDF file.
Writing to a single HDF file from distributed workers only makes sense if the workers run on the same machine or otherwise have access to the same network file-system. Producing many HDF files from one dataset makes less sense for arrays than it does for dataframes.

The dataframe to_hdf has already been fixed in this respect, so presumably some of that logic could be reused here.
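One possible direction, as a rough sketch and not the logic the dataframe code uses: give da.store a small target object that holds only the file name and dataset path and opens the file inside each write, so that no open h5py handle ever needs to be serialized. The names LazyHDF5Target and store_to_hdf5 are hypothetical, and the sketch assumes every worker sees the same file-system and that the writes are serialized by a lock.

```python
class LazyHDF5Target:
    """Hypothetical stand-in for an h5py dataset that is safe to serialize.

    Only the file name and dataset path are stored; the file is opened
    inside each write, in whichever process performs that write.
    """

    def __init__(self, filename, datapath):
        self.filename = filename
        self.datapath = datapath

    def __setitem__(self, index, value):
        import h5py  # imported lazily, on the worker doing the write

        # Append mode: the dataset must already exist with its final shape.
        with h5py.File(self.filename, "a") as f:
            f[self.datapath][index] = value


def store_to_hdf5(array, filename, datapath):
    """Sketch: pre-create the dataset, then store through the lazy target."""
    import dask.array as da
    import h5py

    with h5py.File(filename, "w") as f:
        f.create_dataset(datapath, shape=array.shape, dtype=array.dtype)

    # lock=True asks dask to serialize the writes. Assumption: with the
    # distributed scheduler an explicit cluster-wide lock (for example
    # distributed.Lock) may be needed, depending on the dask version.
    da.store(array, LazyHDF5Target(filename, datapath), lock=True)


# The target's whole state is two strings, so it can be shipped to workers.
target = LazyHDF5Target("out.h5", "/x")
```

Reopening the file for every chunk is slow, and plain HDF5 does not support concurrent writers, so this only illustrates how to avoid serializing the File object; it is not a tuned implementation.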

da.store for hdf file or to_hdf fails with distributed scheduler #2488
