Add on_release to st.cache_resource. #13439
Replace the TTLCache in ResourceCache with a TTLCleanupCache. Add on_release to the st.cache_resource API, and plumb it through to the TTLCleanupCache. Update st.cache_resource.clear_all to clear the cache directly instead of just GCing it, to ensure release functions are called. Implements the feature request in #8674.
Pull request overview
This PR adds an on_release callback parameter to st.cache_resource, allowing users to specify cleanup functions that are automatically called when cached resources are removed from the cache. This implements feature request #8674.
Key Changes
- Added `on_release` parameter to the `st.cache_resource` API with comprehensive documentation
- Replaced `TTLCache` with `TTLCleanupCache` in `ResourceCache` to support release callbacks
- Updated `ResourceCaches.clear_all()` to explicitly call `clear()` on each cache before removing them, ensuring release functions are invoked
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| lib/streamlit/runtime/caching/cache_resource_api.py | Added on_release parameter throughout the caching layers, updated imports to use TTLCleanupCache, modified clear_all() to trigger callbacks, and added comprehensive documentation |
| lib/tests/streamlit/runtime/caching/cache_resource_api_test.py | Added test_on_release_fires() to verify callbacks are invoked on cache eviction and explicit clearing |
    for cache in self._function_caches.values():
        cache.clear()
nitpick: Maybe add a brief comment here that calling clear explicitly is required to trigger the on-release hooks.
Done.
This sent me down a rabbit hole, as I realized I hadn't checked the behavior when multiple items are being cleared and one throws an exception. Now, this is handled correctly, and has some test coverage.
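For readers following along, here is a minimal sketch of one way to handle that case: try every release callback, remember failures, and raise only after the loop finishes. The helper name `release_all` and the choice to re-raise the first error are assumptions made for illustration; this thread does not say how `TTLCleanupCache` actually reports failures.

```python
from typing import Any, Callable, Iterable


def release_all(values: Iterable[Any], on_release: Callable[[Any], None]) -> None:
    """Call on_release for every value, even if some calls raise.

    Hypothetical helper: collects exceptions and re-raises the first one
    only after every value has had a chance to clean up.
    """
    errors: list[Exception] = []
    for value in values:
        try:
            on_release(value)
        except Exception as exc:  # keep going so later items are still released
            errors.append(exc)
    if errors:
        raise errors[0]


released: list[str] = []


def close_resource(name: str) -> None:
    """Example release hook that fails for one specific resource."""
    if name == "b":
        raise RuntimeError("failed to release b")
    released.append(name)


try:
    release_all(["a", "b", "c"], close_resource)
except RuntimeError as exc:
    print(f"release error surfaced after the loop: {exc}")

print(released)  # "a" and "c" were still released despite the failure on "b"
```

The point of the design is that one misbehaving hook should not leak the resources that come after it in the iteration order.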
    validate: ValidateFunc | None,
    hash_funcs: HashFuncsDict | None = None,
    show_time: bool = False,
    on_release: OnRelease = _no_op_release,
nitpick: might be slightly more consistent if we support on_release: OnRelease | None = None here and just pass down the None value from the decorator and use on_release or _no_op_release in the initialization.
Sure, Done.
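The suggested pattern can be sketched as follows. `ResourceCacheSketch` is a hypothetical stand-in, not Streamlit's class, and `Optional[OnRelease]` is used here as the spelling of `OnRelease | None` that also works on older Python versions.

```python
from typing import Any, Callable, Optional

OnRelease = Callable[[Any], None]


def _no_op_release(_: Any) -> None:
    """Default release hook: do nothing."""


class ResourceCacheSketch:
    """Hypothetical stand-in for the cache object; not Streamlit's class."""

    def __init__(self, on_release: Optional[OnRelease] = None) -> None:
        # Normalize None once so the rest of the class can call the hook
        # unconditionally.
        self._on_release: OnRelease = on_release or _no_op_release

    def release(self, value: Any) -> None:
        self._on_release(value)


closed: list[str] = []
ResourceCacheSketch().release("ignored")          # falls back to the no-op
ResourceCacheSketch(closed.append).release("db")  # uses the caller's hook
print(closed)
```

Accepting `None` at the public boundary keeps the decorator signature consistent with the other optional parameters, while the internal attribute is always callable.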
    This is not used as a part of the cache key - meaning changes to this
    function between script runs will not trigger a new resource being
    generated.
suggestion: This note might be a bit misleading/inconsistent since - afaik - none of the cache configuration parameters are part of the cache key or trigger a cache regeneration. Maybe we can simplify this to just add a top-level note that changing a cache configuration parameter will not invalidate the existing cached entries.
max_entries and ttl are both part of the (function) cache key, and changes to these will invalidate cache entries.
While validate is in that if block, the invoked helper just checks to see if the Nonefulness has changed.
I'm happy to remove this, if you think it's just adding confusion.
Oh, I didn't know that, but makes sense. I was looking at the cache key generated here which doesn't seem to be impacted:
and the value key:
I think it's worth documenting, but maybe better to mention in the ttl and max_entries docstrings that changing them invalidates the existing cache entries.
Updated! Also noted that Pandas (and therefore this helper) treats unitless string TTLs as nanoseconds, which has caused confusion in the past.
Small style cleanup.
sfc-gh-jkinkead left a comment
Thanks for the quick review!
@lukasmasuch - I'll likely merge by EOD. Feel free to add more comments if you have them, and I'll address in a follow-up if there are any!
Fix typo. Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Add note about default Pandas units on ttl, as this can be tricky.
## Describe your changes
Fixes an issue with Snowflake connections not getting re-initialized
after having been closed.
<details>
<summary>Claude issue analysis</summary>
# Issue Report: Snowflake Connection "Connection is closed" Error
## Summary
After recent PRs adding `on_release` to `st.cache_resource` and
session-scoped connection support, users may encounter a
`snowflake.connector.errors.DatabaseError: 250002 (08003): Connection is
closed` error when using `st.connection("snowflake")` with cached data
queries.
## Related PRs
- **PR #13439**: Add `on_release` to `st.cache_resource`
- **PR #13482**: Add session scoping to caches
- **PR #13538**: Add `SnowflakeCallersRightsConnection`
- **PR #13506**: Add session-scoped connection support
## Error Details
```
Traceback (most recent call last):
  File ".../streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
    result = func()
  ...
  File ".../snowflake/snowpark/_internal/server_connection.py", line 205, in _cursor
    self._thread_store.cursor = self._conn.cursor()
  File ".../snowflake/connector/connection.py", line 1270, in cursor
    Error.errorhandler_wrapper(...)
snowflake.connector.errors.DatabaseError: 250002 (08003): Connection is closed
```
## Root Cause Analysis
### Background
The PRs mentioned added important functionality:
1. **PR #13439** added the `on_release` callback to `st.cache_resource`,
which is called when cache entries are evicted
2. **PR #13506** modified `connection_factory.py` to use this
`on_release` callback to call `connection.close()` when a connection is
evicted from the cache
This is the relevant code in `connection_factory.py`:
```python
def on_release_wrapped(connection: ConnectionClass) -> None:
    connection.close()


__create_connection = cache_resource(
    max_entries=max_entries,
    show_spinner="Running `st.connection(...)`.",
    ttl=ttl,
    scope=scope,
    on_release=on_release_wrapped,  # Calls close() when evicted
)(__create_connection)
```
### The Bug
In `BaseSnowflakeConnection.close()`, after calling
`self._raw_instance.close()`, the `_raw_instance` attribute was **NOT**
reset to `None`:
```python
def close(self) -> None:
    """Closes the underlying Snowflake connection."""
    if self._raw_instance is not None:
        self._raw_instance.close()
        # BUG: _raw_instance was NOT set to None!
```
This caused the following issue:
1. When `close()` was called (e.g., via `on_release` when a cache entry
is evicted), the underlying connection was closed
2. However, `_raw_instance` still referenced the **closed** connection
object
3. The `_instance` property checks `if self._raw_instance is None` to
decide whether to create a new connection:
```python
@property
def _instance(self) -> RawConnectionT:
    if self._raw_instance is None:
        self._raw_instance = self._connect(**self._kwargs)
    return self._raw_instance
```
4. Since `_raw_instance` wasn't `None`, subsequent access to `_instance`
returned the **CLOSED** connection
5. Any operations on the closed connection failed with "Connection is
closed"
### When This Bug Manifests
The `on_release` callback (which calls `close()`) is triggered when:
- Cache entries expire due to TTL
- Cache is full and oldest entries are evicted (`max_entries`)
- `st.cache_resource.clear()` is called
- For session-scoped caches: when a session disconnects
For global-scoped connections like `st.connection("snowflake")`, this
typically only happens if:
- `st.cache_resource.clear()` is called explicitly
- TTL is set and expires
- `max_entries` is set and exceeded
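
To make two of these triggers concrete, here is a minimal sketch of a cache that calls a release hook on `max_entries` eviction and on `clear()`. `ReleasingCache` is an illustrative stand-in only: it is not Streamlit's `TTLCleanupCache`, and it leaves out TTL expiry, session scoping, and thread safety.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class ReleasingCache:
    """Tiny LRU-style cache that calls on_release for every entry it drops."""

    def __init__(self, max_entries: int, on_release: Callable[[Any], None]) -> None:
        self._max_entries = max_entries
        self._on_release = on_release
        self._entries: OrderedDict = OrderedDict()

    def set(self, key: Hashable, value: Any) -> None:
        if key in self._entries:
            # Replacing a value also drops the old one, so release it.
            self._on_release(self._entries.pop(key))
        self._entries[key] = value
        while len(self._entries) > self._max_entries:
            _, evicted = self._entries.popitem(last=False)  # drop the oldest entry
            self._on_release(evicted)

    def clear(self) -> None:
        # Pop entries one at a time so each one is released exactly once.
        while self._entries:
            _, value = self._entries.popitem(last=False)
            self._on_release(value)


events: list[str] = []
cache = ReleasingCache(max_entries=2, on_release=events.append)
cache.set("a", "conn-a")
cache.set("b", "conn-b")
cache.set("c", "conn-c")  # exceeds max_entries, so "conn-a" is released
cache.clear()             # releases the remaining entries
print(events)  # ['conn-a', 'conn-b', 'conn-c']
```

With `on_release` wired to `connection.close()`, every one of these paths ends in a closed connection object, which is why the state of `_raw_instance` after `close()` matters.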
### Additional Consideration: Snowpark Sessions
When users call `conn.session()`, they get a Snowpark Session that
internally references `self._instance`. If the underlying connection is
closed:
```python
def session(self) -> Session:
    if running_in_sis():
        return get_active_session()
    return Session.builder.configs({"connection": self._instance}).create()
```
Any Snowpark Sessions created from the connection will also fail because
they hold a reference to the now-closed underlying connection object.
## Fix
The fix is simple: reset `_raw_instance` to `None` after closing the
connection:
```python
def close(self) -> None:
    """Closes the underlying Snowflake connection."""
    if self._raw_instance is not None:
        self._raw_instance.close()
        self._raw_instance = None  # Added this line
```
This ensures that after `close()` is called, the next access to
`_instance` will create a new connection instead of returning the closed
one.
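
The behavior after the fix can be reproduced without Snowflake. The sketch below uses hypothetical stand-ins (`FakeRawConnection`, `LazyConnection`) that mirror the lazy `_instance` pattern; they are not the Streamlit classes.

```python
from typing import Optional


class FakeRawConnection:
    """Stand-in for a driver connection; hypothetical, for illustration."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def query(self) -> str:
        if self.closed:
            raise RuntimeError("Connection is closed")
        return "ok"


class LazyConnection:
    """Mirrors the lazy `_instance` pattern described in this report."""

    def __init__(self) -> None:
        self._raw_instance: Optional[FakeRawConnection] = None

    @property
    def _instance(self) -> FakeRawConnection:
        if self._raw_instance is None:
            self._raw_instance = FakeRawConnection()  # reconnect on demand
        return self._raw_instance

    def close(self) -> None:
        if self._raw_instance is not None:
            self._raw_instance.close()
            # Without this reset, _instance keeps returning the closed object.
            self._raw_instance = None


conn = LazyConnection()
first = conn._instance
conn.close()
second = conn._instance

print(first.closed, second is first)
print(second.query())
```

Anything that captured `first` before `close()` still holds a closed connection, which is the same caveat that applies to Snowpark Sessions built from the old connection object.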
## Files Changed
1. **`lib/streamlit/connections/snowflake_connection.py`**
   - Fixed `close()` method to reset `_raw_instance = None` after closing
2. **`lib/tests/streamlit/connections/snowflake_connection_test.py`**
   - Added `TestSnowflakeConnectionClose` test class with:
     - `test_close_resets_raw_instance`: Verifies that `close()` closes the
       connection AND resets `_raw_instance`
     - `test_close_is_noop_when_not_connected`: Verifies that `close()`
       doesn't fail when `_raw_instance` is already `None`
## Testing
```bash
PYTHONPATH=lib pytest lib/tests/streamlit/connections/snowflake_connection_test.py::TestSnowflakeConnectionClose -v
```
Output:
```
lib/tests/streamlit/connections/snowflake_connection_test.py::TestSnowflakeConnectionClose::test_close_resets_raw_instance PASSED
lib/tests/streamlit/connections/snowflake_connection_test.py::TestSnowflakeConnectionClose::test_close_is_noop_when_not_connected PASSED
```
## Recommendations for Users
Until this fix is released, users experiencing this issue can:
1. **Avoid storing Snowpark Sessions long-term**: Instead of caching
Snowpark Sessions, create them fresh when needed
2. **Check if using `st.cache_resource.clear()`**: If calling this
anywhere in the app, it will close all cached connections
3. **Consider connection TTL settings**: If TTL is set on the
connection, it may expire and close
## Impact
- **Affected**: Users of `st.connection("snowflake")` and
`st.connection("snowflake-callers-rights")` who experience cache
eviction scenarios
- **Severity**: Medium - The bug causes operations to fail with a
confusing error message, but a workaround (restarting the app or
avoiding cache clears) is available
- **Scope**: Only affects `SnowflakeConnection` and its subclasses;
other connection types (`SQLConnection`, `SnowparkConnection`) inherit
the no-op `close()` from `BaseConnection` and are not affected
</details>
## GitHub Issue Link (if applicable)
## Testing Plan
- Added unit test.
---
**Contribution License Agreement**
By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.