[python] check encoding version in all open code paths#4055

Merged
bkmartinjr merged 4 commits into main from bkm/soma-encoding-version-check
May 14, 2025

@bkmartinjr
Member

@bkmartinjr bkmartinjr commented May 13, 2025

Fixes SOMA-54

The tiledbsoma.open code path checks the encoding version and raises a clear error if the object's version is unsupported. None of the class-specific open paths performed this check, so a user could open a SOMA object with an unsupported encoding version and then hit surprising failures elsewhere.

Changes:

  • added a consistent check to all open code paths - now any open will do the same checks and generate the same errors (as needed)
  • updated tests to verify
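
The change described above can be pictured as one shared check that every open path calls. The following is a minimal sketch of that idea, not the actual TileDB-SOMA implementation: `ToyDataFrame`, `generic_open`, `check_encoding_version`, the metadata key value, and the set of supported versions are all hypothetical stand-ins.

```python
from typing import Mapping


class SOMAError(Exception):
    """Stand-in for the library's error class."""


# Assumed values, for illustration only.
ENCODING_VERSION_KEY = "soma_encoding_version"
SUPPORTED_ENCODING_VERSIONS = frozenset({"1", "1.1.0"})


def check_encoding_version(metadata: Mapping[str, str]) -> None:
    """Raise SOMAError unless the stored encoding version is supported."""
    version = metadata.get(ENCODING_VERSION_KEY)
    if version not in SUPPORTED_ENCODING_VERSIONS:
        raise SOMAError(f"Unsupported SOMA object encoding version {version}.")


class ToyDataFrame:
    """Hypothetical class with its own class-specific open path."""

    def __init__(self, metadata: Mapping[str, str]) -> None:
        self.metadata = metadata

    @classmethod
    def open(cls, metadata: Mapping[str, str]) -> "ToyDataFrame":
        check_encoding_version(metadata)  # same check as generic_open
        return cls(metadata)


def generic_open(metadata: Mapping[str, str]) -> ToyDataFrame:
    """Hypothetical generic open path."""
    check_encoding_version(metadata)
    return ToyDataFrame(metadata)
```

Because both entry points call the same helper, an object with an unsupported version fails the same way regardless of which one the caller uses.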

@codecov

codecov Bot commented May 13, 2025

Codecov Report

Attention: Patch coverage is 71.42857% with 6 lines in your changes missing coverage. Please review.

Project coverage is 89.28%. Comparing base (da5bdc9) to head (edc9cc8).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4055      +/-   ##
==========================================
+ Coverage   89.19%   89.28%   +0.08%     
==========================================
  Files          57       59       +2     
  Lines        6903     7063     +160     
==========================================
+ Hits         6157     6306     +149     
- Misses        746      757      +11     
Flag Coverage Δ
python 89.28% <71.42%> (+0.08%) ⬆️

Components Coverage Δ
python_api 89.28% <71.42%> (+0.08%) ⬆️
libtiledbsoma ∅ <ø> (∅)

@bkmartinjr bkmartinjr changed the title [python] check encoding version and other metadata in all open code paths [python] check encoding version in all open code paths May 13, 2025
@bkmartinjr bkmartinjr marked this pull request as ready for review May 13, 2025 23:09
    )
    if _read_soma_type(handle) != cls.soma_type:
        raise SOMAError(
            "Unexpected SOMA metadata encoding - unable to determine object type."
Collaborator

I'm not sure this is the right error. Unless I'm misunderstanding this code, it looks like we were able to determine the type - it just wasn't the type we were expecting.

Member Author

You are correct - it is either a sign that we have a logic bug, OR the data was corrupted in some weird way.

I'll update the error message.
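
The distinction raised in this thread, between a type that cannot be read and a type that was read but does not match, can be sketched as two separate error paths. This is a hypothetical illustration: `check_soma_type` and both message strings are assumptions, not the wording that was merged.

```python
from typing import Optional


class SOMAError(Exception):
    """Stand-in for the library's error class."""


def check_soma_type(stored_type: Optional[str], expected_type: str) -> None:
    """Separate 'type is missing' from 'type is present but unexpected'."""
    if stored_type is None:
        # The type really could not be determined.
        raise SOMAError("Unable to determine SOMA object type.")
    if stored_type != expected_type:
        # The type was determined; it is just not the one this class handles.
        raise SOMAError(
            f"Unexpected SOMA object type: stored {stored_type!r}, "
            f"expected {expected_type!r}."
        )
```

Naming both the stored and the expected type in the message helps tell a logic bug apart from corrupted metadata.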

Collaborator

@jp-dark jp-dark left a comment

I have some change requests around the error messages. I'm open to other wording than what I provided here, but I would like to make the errors more helpful if possible.


if obj_type is None:
raise SOMAError(
f"stored TileDB object does not have {SOMA_OBJECT_TYPE_METADATA_KEY!r}"
Collaborator

Suggested change
- f"stored TileDB object does not have {SOMA_OBJECT_TYPE_METADATA_KEY!r}"
+ f"Cannot access stored TileDB object with TileDB-SOMA. The object is missing "
+ f"the required '{SOMA_OBJECT_TYPE_METADATA_KEY!r}' metadata key."

Comment on lines +363 to +364
f"stored TileDB object {SOMA_OBJECT_TYPE_METADATA_KEY!r}"
f" is {type(obj_type)}"
Collaborator

Suggested change
- f"stored TileDB object {SOMA_OBJECT_TYPE_METADATA_KEY!r}"
- f" is {type(obj_type)}"
+ f"Cannot access stored TileDB object with TileDB-SOMA. The metadata key "
+ f"'{SOMA_OBJECT_TYPE_METADATA_KEY!r}' has unexpected type {type(obj_type)}."

)
if encoding_version is None:
raise SOMAError(
f"stored TileDB object does not have {SOMA_ENCODING_VERSION_METADATA_KEY}"
Collaborator

Suggested change
- f"stored TileDB object does not have {SOMA_ENCODING_VERSION_METADATA_KEY}"
+ f"Cannot access stored TileDB object with TileDB-SOMA. The object is missing "
+ f" the required '{SOMA_ENCODING_VERSION_METADATA_KEY}' metadata key."

Comment on lines +376 to +377
f"Unsupported SOMA object encoding version {encoding_version}. The TileDB-SOMA "
f"client library needs to be updated to a more recent version."
Collaborator

Suggested change
- f"Unsupported SOMA object encoding version {encoding_version}. The TileDB-SOMA "
- f"client library needs to be updated to a more recent version."
+ f"Unsupported SOMA object encoding version {encoding_version}. TileDB-SOMA "
+ f"needs to be updated to a more recent version."

This one was mine. In retrospect I think specifying "client library" is more likely to be confusing than helpful.

@bkmartinjr
Member Author

@jp-dark - I incorporated your message changes (almost literally, with a couple of minor punctuation changes). Also tried to clarify the "unexpected" type error. I debated whether this should be an assertion or error, as it isn't something the user can handle, but most of these "bad metadata" checks are similar, so ended up leaving it as is (with clarified message)
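
Taken together, the checks discussed in this review cover four failure cases: a missing type key, a type value of the wrong Python type, a missing encoding version key, and an unsupported version. A minimal sketch follows, with message wording adapted from the suggestions above. The key values, the supported version set, the `str` type test, and the `validate_metadata` helper are assumptions, and the merged messages may differ.

```python
from typing import Any, Mapping


class SOMAError(Exception):
    """Stand-in for the library's error class."""


# Assumed values, for illustration only.
SOMA_OBJECT_TYPE_METADATA_KEY = "soma_object_type"
SOMA_ENCODING_VERSION_METADATA_KEY = "soma_encoding_version"
SUPPORTED_ENCODING_VERSIONS = frozenset({"1", "1.1.0"})

_PREFIX = "Cannot access stored TileDB object with TileDB-SOMA. "


def validate_metadata(metadata: Mapping[str, Any]) -> str:
    """Return the stored object type, or raise SOMAError naming the problem."""
    obj_type = metadata.get(SOMA_OBJECT_TYPE_METADATA_KEY)
    encoding_version = metadata.get(SOMA_ENCODING_VERSION_METADATA_KEY)

    if obj_type is None:
        raise SOMAError(
            _PREFIX + "The object is missing the required "
            f"{SOMA_OBJECT_TYPE_METADATA_KEY!r} metadata key."
        )
    if not isinstance(obj_type, str):
        raise SOMAError(
            _PREFIX + f"The metadata key {SOMA_OBJECT_TYPE_METADATA_KEY!r} "
            f"has unexpected type {type(obj_type)}."
        )
    if encoding_version is None:
        raise SOMAError(
            _PREFIX + "The object is missing the required "
            f"{SOMA_ENCODING_VERSION_METADATA_KEY!r} metadata key."
        )
    if encoding_version not in SUPPORTED_ENCODING_VERSIONS:
        raise SOMAError(
            f"Unsupported SOMA object encoding version {encoding_version}. "
            "TileDB-SOMA needs to be updated to a more recent version."
        )
    return obj_type
```

Raising an error for each case, as opposed to asserting, matches the choice described above to treat all "bad metadata" conditions the same way.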

@bkmartinjr bkmartinjr requested a review from jp-dark May 14, 2025 20:28
@bkmartinjr bkmartinjr merged commit 6374bae into main May 14, 2025
18 checks passed
@bkmartinjr bkmartinjr deleted the bkm/soma-encoding-version-check branch May 14, 2025 21:10

2 participants