Skip to content

Conversation

@alxtkr77
Copy link
Member

@alxtkr77 alxtkr77 commented Dec 18, 2025

Summary

  • Fix bug in ResourceCache.get_table() where parse_path(uri) overwrote the uri variable
  • Tables were being stored under the parsed path instead of the original URI, causing cache misses
  • This led to duplicate Table objects being created on subsequent calls with the same URI

Root Cause

In store_resources.py, the code was:

endpoint, uri = parse_path(uri)  # uri gets overwritten!
self._tabels[uri] = Table(uri, ...)  # stored under wrong key

The fix preserves the original URI for the cache key:

endpoint, path = parse_path(uri)  # uri preserved
self._tabels[uri] = Table(path, ...)  # stored under correct key

Related Issue

ML-11518

Test Plan

  • Existing unit tests pass
  • Verified fix addresses table caching behavior during model monitoring drain cycles

@alxtkr77 alxtkr77 requested a review from a team as a code owner December 18, 2025 12:16
Copy link
Collaborator

@gtopper gtopper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Would be good to add a regression unit test.

Comment on lines 403 to 406
# Act - First call creates the table
first_result = cache.get_table(test_uri)

# Assert - Table created and cached under original URI (not parsed path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are pretty unnecessarily verbose. We know the test will set things up, do something, and make an assertion...

@alxtkr77 alxtkr77 force-pushed the ML-11518-fix-table-cache-key branch from e53b00b to 767f27b Compare December 21, 2025 10:55
Copy link
Collaborator

@gtopper gtopper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise ✅

Comment on lines 389 to 391
Regression test for ML-11518: parse_path() was overwriting the uri variable,
causing tables to be stored under the parsed path instead of the original URI.
This led to cache misses on subsequent calls with the same URI.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need an in-depth recap here.

Alex Toker added 2 commits December 21, 2025 12:38
Fix bug where parse_path() overwrote the uri variable, causing tables
to be stored under the parsed path instead of the original URI. This
caused cache misses on subsequent get_table() calls with the same URI,
leading to duplicate Table objects being created.

The fix preserves the original URI for the cache key while using the
parsed path for the Table constructor, ensuring proper cache hits.
Test verifies that ResourceCache.get_table() caches tables under the
original URI, not the parsed path. This prevents cache misses and
duplicate Table objects on subsequent calls with the same URI.
@alxtkr77 alxtkr77 force-pushed the ML-11518-fix-table-cache-key branch from 767f27b to 808169d Compare December 21, 2025 12:39
Refactor test based on reviewer feedback:
- Move imports to file top instead of inline in function
- Remove verbose AAA comments (Arrange/Act/Assert)
- Remove internal cache assertion (_tabels check) in favor of testing
  observable behavior (Table not recreated on second call)

Reference: ML-11518
@alxtkr77 alxtkr77 force-pushed the ML-11518-fix-table-cache-key branch from 808169d to 4793685 Compare December 21, 2025 13:14
@gtopper gtopper merged commit cb8733f into mlrun:development Dec 21, 2025
13 checks passed
alxtkr77 added a commit to alxtkr77/mlrun that referenced this pull request Dec 21, 2025
## Summary
- Fix bug in `ResourceCache.get_table()` where `parse_path(uri)`
overwrote the `uri` variable
- Tables were being stored under the parsed path instead of the original
URI, causing cache misses
- This led to duplicate Table objects being created on subsequent calls
with the same URI

## Root Cause
In `store_resources.py`, the code was:
```python
endpoint, uri = parse_path(uri)  # uri gets overwritten!
self._tabels[uri] = Table(uri, ...)  # stored under wrong key
```

The fix preserves the original URI for the cache key:
```python
endpoint, path = parse_path(uri)  # uri preserved
self._tabels[uri] = Table(path, ...)  # stored under correct key
```

## Related Issue
[ML-11518](https://jira.iguazeng.com/browse/ML-11518)

## Test Plan
- Existing unit tests pass
- Verified fix addresses table caching behavior during model monitoring
drain cycles

[ML-11518]:
https://iguazio.atlassian.net/browse/ML-11518?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Alex Toker <[email protected]>
alxtkr77 added a commit that referenced this pull request Dec 21, 2025
## Summary
- Fix bug in `ResourceCache.get_table()` where `parse_path(uri)`
overwrote the `uri` variable
- Tables were being stored under the parsed path instead of the original
URI, causing cache misses
- This led to duplicate Table objects being created on subsequent calls
with the same URI

Backport of #9085

Co-authored-by: Alex Toker <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants