Skip to content

Conversation

@TomAugspurger
Copy link
Contributor

@TomAugspurger TomAugspurger commented Oct 8, 2024

Closes #2309

I'm not happy with this implementation. I broke / fixed this in my consolidated metadata PR as well.

Relying on the exception type from open_array to decide whether to fall back to opening a group seems pretty fragile. I might spend a few minutes trying out an alternative way of doing things that'll be a bit more robust (potentially at the cost of some unnecessary I/O; but I think that's OK for open since open_group and open_array offer more direct ways of doing things).

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

except KeyError:
except (KeyError, ValueError):
# KeyError for a missing key
# ValueError for failing to parse node metadata as an array when it's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should create a specific exception for this case, since "valid document but the wrong node type" is different from "invalid document"

@TomAugspurger TomAugspurger marked this pull request as ready for review October 8, 2024 15:23
@TomAugspurger
Copy link
Contributor Author

I might spend a few minutes trying out an alternative way of doing things that'll be a bit more robust

https://github.com/TomAugspurger/zarr-python/blob/tom/fix/batched-io/src/zarr/core/metadata/_io.py has the start of that. I like the idea of using match statements and an explicit list of files to read. There's still a decent chunk of work to do (figure out which cases we know we files to read and error if any are missing, then discover the actual zarr_format and node_type based on what we read and parse into the correct objects, and finally updating open and hopefully open_array and open_group to use that or something similar). I don't think I'll pursue this now.

@TomAugspurger TomAugspurger added the downstream Downstream libraries using zarr label Oct 9, 2024
Copy link
Member

@jhamman jhamman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One test suggestion but otherwise this looks good to me.

@jhamman jhamman added this to the 3.0.0.beta milestone Oct 9, 2024
@jhamman jhamman added the V3 label Oct 9, 2024
@jhamman jhamman mentioned this pull request Oct 9, 2024
6 tasks
@jhamman jhamman merged commit 395604d into zarr-developers:v3 Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

downstream Downstream libraries using zarr

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Zarr v3.0.0a7 fail to open group dataset

3 participants