-
Notifications
You must be signed in to change notification settings - Fork 294
[Datastore] Fix ResourceCache table caching key bug #9085
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Datastore] Fix ResourceCache table caching key bug #9085
Conversation
gtopper
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Would be good to add a regression unit test.
tests/datastore/test_datastores.py
Outdated
| # Act - First call creates the table | ||
| first_result = cache.get_table(test_uri) | ||
|
|
||
| # Assert - Table created and cached under original URI (not parsed path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are pretty unnecessarily verbose. We know the test will set things up, do something, and make an assertion...
e53b00b to
767f27b
Compare
gtopper
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise ✅
tests/datastore/test_datastores.py
Outdated
| Regression test for ML-11518: parse_path() was overwriting the uri variable, | ||
| causing tables to be stored under the parsed path instead of the original URI. | ||
| This led to cache misses on subsequent calls with the same URI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't need an in-depth recap here.
Fix bug where parse_path() overwrote the uri variable, causing tables to be stored under the parsed path instead of the original URI. This caused cache misses on subsequent get_table() calls with the same URI, leading to duplicate Table objects being created. The fix preserves the original URI for the cache key while using the parsed path for the Table constructor, ensuring proper cache hits.
Test verifies that ResourceCache.get_table() caches tables under the original URI, not the parsed path. This prevents cache misses and duplicate Table objects on subsequent calls with the same URI.
767f27b to
808169d
Compare
Refactor test based on reviewer feedback: - Move imports to file top instead of inline in function - Remove verbose AAA comments (Arrange/Act/Assert) - Remove internal cache assertion (_tabels check) in favor of testing observable behavior (Table not recreated on second call) Reference: ML-11518
808169d to
4793685
Compare
## Summary - Fix bug in `ResourceCache.get_table()` where `parse_path(uri)` overwrote the `uri` variable - Tables were being stored under the parsed path instead of the original URI, causing cache misses - This led to duplicate Table objects being created on subsequent calls with the same URI ## Root Cause In `store_resources.py`, the code was: ```python endpoint, uri = parse_path(uri) # uri gets overwritten! self._tabels[uri] = Table(uri, ...) # stored under wrong key ``` The fix preserves the original URI for the cache key: ```python endpoint, path = parse_path(uri) # uri preserved self._tabels[uri] = Table(path, ...) # stored under correct key ``` ## Related Issue [ML-11518](https://jira.iguazeng.com/browse/ML-11518) ## Test Plan - Existing unit tests pass - Verified fix addresses table caching behavior during model monitoring drain cycles [ML-11518]: https://iguazio.atlassian.net/browse/ML-11518?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ --------- Co-authored-by: Alex Toker <[email protected]>
## Summary - Fix bug in `ResourceCache.get_table()` where `parse_path(uri)` overwrote the `uri` variable - Tables were being stored under the parsed path instead of the original URI, causing cache misses - This led to duplicate Table objects being created on subsequent calls with the same URI Backport of #9085 Co-authored-by: Alex Toker <[email protected]>
Summary
ResourceCache.get_table()whereparse_path(uri)overwrote theurivariableRoot Cause
In
store_resources.py, the code was:The fix preserves the original URI for the cache key:
Related Issue
ML-11518
Test Plan