Conversation

@dcherian
Contributor

@dcherian dcherian commented Nov 21, 2024

@dcherian dcherian marked this pull request as draft November 21, 2024 18:47
@dcherian dcherian added the topic-chunked-arrays Managing different chunked backends, e.g. dask label Nov 21, 2024
@dcherian dcherian force-pushed the cache-chunking-backend-dataset branch from 8e80fb9 to e2aad24 Compare November 21, 2024 18:59
@dcherian dcherian marked this pull request as ready for review November 21, 2024 19:07
@dcherian dcherian requested a review from max-sixty November 21, 2024 19:34
Collaborator

@max-sixty max-sixty left a comment

I didn't re-think through the logic but the cache idea makes sense, thanks!

@dcherian dcherian added the plan to merge Final call for comments label Nov 21, 2024
@dcherian
Contributor Author

> I didn't re-think through the logic but the cache idea makes sense, thanks!

The core of it didn't change. I just avoided materializing a huge iterable in memory and instead stop at the first disagreement.
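For readers following along, here is a minimal sketch of the general pattern described above. It is not the code from this PR: the function name `first_misaligned_break`, its parameters, and the alignment rule (a boundary "disagrees" when it is not a multiple of a preferred chunk size) are all assumptions made for illustration. It shows a lazy scan that returns at the first disagreement, wrapped in `functools.lru_cache` so repeated calls with the same arguments are not recomputed.

```python
from functools import lru_cache
from itertools import accumulate, islice
from typing import Optional, Tuple


@lru_cache(maxsize=128)
def first_misaligned_break(
    chunk_sizes: Tuple[int, ...], preferred: int
) -> Optional[int]:
    """Return the first interior chunk boundary that is not a multiple
    of `preferred`, or None if every boundary is aligned.

    Hypothetical helper for illustration only; not part of xarray.
    """
    # Interior boundaries are the running sums of all chunks but the last.
    # islice() and accumulate() are both lazy, so boundaries are produced
    # one at a time instead of being collected into a list first.
    interior = islice(chunk_sizes, max(len(chunk_sizes) - 1, 0))
    for stop in accumulate(interior):
        if stop % preferred != 0:
            # Stop at the first disagreement; later boundaries are never computed.
            return stop
    return None


aligned = first_misaligned_break((4, 4, 4), 4)
misaligned = first_misaligned_break((4, 3, 5), 4)
```

Because `lru_cache` needs hashable arguments, the chunk sizes are passed as a tuple rather than a list. The early return means the cost of a mismatch is proportional to the position of the first bad boundary, not to the total number of chunks.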

@dcherian dcherian merged commit 1c88f1e into pydata:main Nov 22, 2024
33 checks passed
@dcherian dcherian deleted the cache-chunking-backend-dataset branch March 18, 2025 13:24

Labels

plan to merge Final call for comments topic-chunked-arrays Managing different chunked backends, e.g. dask

Development

Successfully merging this pull request may close these issues.

opening a zarr dataset taking so much time with dask

2 participants