Faster chunk checking for backend datasets #9808

dcherian · 2024-11-21T18:33:28Z

Closes #8902, "opening a zarr dataset taking so much time with dask" (shaves 30s off the runtime; dask is responsible for the rest)
User visible changes (including notable bug fixes) are documented in whats-new.rst

max-sixty

I didn't re-think through the logic but the cache idea makes sense, thanks!

dcherian · 2024-11-21T22:21:37Z

> I didn't re-think through the logic but the cache idea makes sense, thanks!

The core of it didn't change; I just avoided materializing a huge iterable in memory and instead stop at the first disagreement.
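For readers skimming the thread, here is a minimal sketch of the general idea of stopping at the first disagreement rather than building the full collection first. It is not the code from this PR: the function name `first_misaligned_boundary` and its parameters are hypothetical, and it assumes a single dimension whose stored chunks all have one fixed size.

```python
from itertools import accumulate, islice


def first_misaligned_boundary(requested_chunks, stored_chunk_size):
    """Return the first interior boundary of `requested_chunks` that is not a
    multiple of `stored_chunk_size`, or None if every boundary is aligned.

    Hypothetical helper for illustration only; not xarray's implementation.
    """
    # Skip the last chunk: its end is the end of the array, not an interior boundary.
    interior = islice(requested_chunks, len(requested_chunks) - 1)
    # accumulate() is lazy, and next() stops consuming it at the first mismatch,
    # so no list of all boundaries is ever built.
    return next(
        (b for b in accumulate(interior) if b % stored_chunk_size != 0),
        None,
    )


aligned = first_misaligned_boundary((4, 4, 4), 4)
misaligned = first_misaligned_boundary((4, 3, 5), 4)
```

With many chunks and an early mismatch, the generator is abandoned after a handful of steps, which is where the saving comes from compared with materializing every boundary before comparing.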

Faster chunk checking for backend datasets

96b621a

dcherian mentioned this pull request Nov 21, 2024

opening a zarr dataset taking so much time with dask #8902

Open

limit size

2482fc4

dcherian marked this pull request as draft November 21, 2024 18:47

dcherian added the topic-chunked-arrays Managing different chunked backends, e.g. dask label Nov 21, 2024

dcherian added 2 commits November 21, 2024 11:56

fix test

5e07bbb

optimize

e2aad24

dcherian force-pushed the cache-chunking-backend-dataset branch from 8e80fb9 to e2aad24 Compare November 21, 2024 18:59

dcherian marked this pull request as ready for review November 21, 2024 19:07

dcherian requested a review from max-sixty November 21, 2024 19:34

max-sixty approved these changes Nov 21, 2024


dcherian added the plan to merge Final call for comments label Nov 21, 2024

dcherian merged commit 1c88f1e into pydata:main Nov 22, 2024
33 checks passed

dcherian deleted the cache-chunking-backend-dataset branch March 18, 2025 13:24
