Add SnowflakeCallersRightsConnection (#13538)
Conversation
Create a new `BaseSnowflakeConnection` holding all existing methods, with only `_connect` left to subclasses. Make `SnowflakeConnection` a subclass that keeps the existing `_connect` implementation. Add a `close` method to `BaseSnowflakeConnection`. Create `SnowflakeCallersRightsConnection` with session scoping and Snowpark-specific connection logic. Update the `BaseSnowflakeConnection` docs to cover both connection types. Register the new `SnowflakeCallersRightsConnection` as "snowflake-callers-rights".
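A rough sketch of the resulting hierarchy (class and method names come from this PR; the bodies are simplified stand-ins, with `FakeRawConnection` standing in for what `snowflake.connector.connect` returns, and `close()` shown with the `_raw_instance = None` reset discussed later in this thread):

```python
# Simplified sketch of the hierarchy this PR introduces. Names come from the
# PR; the bodies are stand-ins, not Streamlit's actual implementation.
from abc import ABC, abstractmethod


class FakeRawConnection:
    """Stand-in for the object returned by snowflake.connector.connect()."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class BaseSnowflakeConnection(ABC):
    """Shared logic; subclasses differ only in how they authenticate."""

    def __init__(self, **kwargs) -> None:
        self._kwargs = kwargs
        self._raw_instance = None

    @abstractmethod
    def _connect(self, **kwargs):
        """Each connection flavor implements its own _connect."""

    @property
    def _instance(self):
        # Lazily create (or re-create) the underlying connection.
        if self._raw_instance is None:
            self._raw_instance = self._connect(**self._kwargs)
        return self._raw_instance

    def close(self) -> None:
        """Close the underlying connection, if any."""
        if self._raw_instance is not None:
            self._raw_instance.close()
            self._raw_instance = None  # allows lazy reconnection later


class SnowflakeConnection(BaseSnowflakeConnection):
    """Standard flavor; the real class calls snowflake.connector.connect."""

    def _connect(self, **kwargs):
        return FakeRawConnection()
```

`SnowflakeCallersRightsConnection` would slot in the same way, overriding `_connect` with the Snowpark token-based flow.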
Pull request overview
This PR introduces a new SnowflakeCallersRightsConnection class for Snowflake connections in Snowpark Container Services environments with caller's rights enabled. The implementation creates a base class hierarchy by extracting common functionality into BaseSnowflakeConnection, making SnowflakeConnection a subclass, and adding SnowflakeCallersRightsConnection as a session-scoped variant that uses per-user identity tokens.
Key changes:
- Refactored existing
SnowflakeConnectionfunctionality into a newBaseSnowflakeConnectionbase class - Added
SnowflakeCallersRightsConnectionfor session-scoped caller's rights authentication in Snowpark Container Services - Added
close()method to properly clean up Snowflake connections - Registered the new connection type as "snowflake-callers-rights" in the connection factory
- Added comprehensive unit tests for the new connection's parameter generation logic
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `lib/streamlit/connections/snowflake_connection.py` | Major refactoring: created `BaseSnowflakeConnection` base class with all common methods, moved original `_connect` to `SnowflakeConnection` subclass, added new `SnowflakeCallersRightsConnection` with session scoping and Snowpark-specific authentication, added `close()` method, defined constants for Snowpark tokens |
| `lib/streamlit/connections/__init__.py` | Added `SnowflakeCallersRightsConnection` to imports and exports |
| `lib/streamlit/runtime/connection_factory.py` | Registered "snowflake-callers-rights" connection type in `_FIRST_PARTY_CONNECTIONS` dict and added two overload signatures for type checking |
| `lib/tests/streamlit/runtime/connection_factory_test.py` | Added test case for "snowflake-callers-rights" connection type registration |
| `lib/tests/streamlit/connections/snowflake_connection_test.py` | Added comprehensive test suite (`TestSnowflakeCallersRightsConnection`) covering environment variable validation, token file validation, header validation, and successful parameter generation |
@cursor review
```python
params = self._get_connection_params()

# Connect with the params we generated, overriding with user-specified params.
return snowflake.connector.connect(**{**params, **kwargs})
```
Security vulnerability: User-provided `**kwargs` can override critical security parameters including the token. This defeats the purpose of caller's rights connections by allowing arbitrary token substitution.

In a caller's rights connection, the token should never be overridable as it represents the user's authenticated identity. Currently, code like

```python
conn = st.connection("snowflake-callers-rights", token="malicious_token")
```

would bypass the caller's rights authentication entirely.

Fix: Either reverse the merge order to `**{**kwargs, **params}` (so params always win), or explicitly prevent overriding of security-critical parameters like `token` and `authenticator`:
```python
conn_params = {**params, **kwargs}
# Ensure critical security params cannot be overridden
conn_params['token'] = params['token']
conn_params['authenticator'] = params['authenticator']
return snowflake.connector.connect(**conn_params)
```

Suggested change:

```diff
  params = self._get_connection_params()
- # Connect with the params we generated, overriding with user-specified params.
- return snowflake.connector.connect(**{**params, **kwargs})
+ # Connect with the params we generated, overriding with user-specified params
+ # but ensuring critical security parameters cannot be overridden
+ conn_params = {**params, **kwargs}
+ # Ensure critical security params cannot be overridden
+ if 'token' in params:
+     conn_params['token'] = params['token']
+ if 'authenticator' in params:
+     conn_params['authenticator'] = params['authenticator']
+ return snowflake.connector.connect(**conn_params)
```
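The merge-order point can be checked with plain dicts: in `{**a, **b}`, keys from `b` win, so the order of the unpacking decides which side can override the other.

```python
# Plain-dict illustration of the merge-order argument: in {**a, **b},
# later keys win, so user kwargs silently override generated params.
params = {"token": "identity-token", "authenticator": "oauth"}
kwargs = {"token": "user-supplied"}

merged = {**params, **kwargs}          # current behavior: kwargs win
assert merged["token"] == "user-supplied"

reversed_merge = {**kwargs, **params}  # reversed order: params always win
assert reversed_merge["token"] == "identity-token"
```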
This is by design. Allowing a token override gives users more flexibility, especially for testing in non-Snowflake environments.
Additionally, this isn't a security vulnerability. If a Streamlit app author can provide a valid login token, they get to log in with that token; else, they can't log in. Any "vulnerability" would be with the underlying Python connector - which the app author also has full access to.
Summary

This PR adds a new `SnowflakeCallersRightsConnection`. This implementation aligns with the approved product spec.

Code Quality

Strengths:

Minor observations:

Test Coverage

Good coverage provided:

Suggestions for additional coverage:
```python
def test_scope_returns_session(self):
    """Tests that the scope is session."""
    assert SnowflakeCallersRightsConnection.scope() == "session"

def test_close_closes_raw_instance(self):
    """Tests that close() closes the underlying connection."""
    # Test that close() calls _raw_instance.close()
```
Test file follows best practices:
Backwards Compatibility

Fully backwards compatible:

Security & Risk

Security considerations are well-handled:
Risk assessment:
Recommendations
```python
def test_get_connection_params_missing_file(self):
    """Tests that a missing token file produces an error."""
    with (
        patch.object(os, "getenv", return_value="exists"),
        patch.object(os.path, "exists", return_value=False),
    ):
        with pytest.raises(StreamlitAPIException, match=r"Token file.*not found"):
            SnowflakeCallersRightsConnection._get_connection_params()
```

None of these recommendations are blockers.

Verdict

APPROVED: This PR is well-implemented, follows Streamlit's coding patterns, maintains backwards compatibility, and includes good test coverage.

This is an automated AI review. Please verify the feedback and use your judgment.
```python
class SnowflakeConnection(BaseSnowflakeConnection):
    """A connection to Snowflake using the Snowflake Connector for Python.

    See ``BaseSnowflakeConnection`` for complete docs.
```
We might need to have the full documentation on this class so that it works properly on our docs page. But that could be a follow-up. cc @sfc-gh-dmatthews for more details on how best to do that.
SGTM. I assumed @sfc-gh-dmatthews would help me figure this out.
> This will only work when running on Snowpark Container Services or another compatible platform.
question: is it possible to somehow use this from outside Snowpark Container Services during local development? If not, is there any way to develop locally and deploy to Snowpark Container Services using this connection type?
Not that I'm aware of, unfortunately. The whole notion of "caller's rights" relates to things currently running on the Snowflake platform - and AFAIK Snowflake doesn't support faking caller's rights from a local connection (e.g. running a connection using a user's personal credentials, but treating permissions as if it were in restricted caller's rights mode and enforcing caller grants instead of vanilla grants).
For local verification, users will simply have to use their personal tokens with caller's rights turned off via an environment variable or similar, then deploy a test app to fully verify. FWIW I think this is how it works for Snowflake user-defined functions (AKA stored procedures) and other Snowpark container apps.
I could see us implementing something like a `fallback_to_standard` parameter on the connection params which would (as advertised) fall back to a standard Snowflake connection when environment variables or the token file were missing - but having that be the default would likely mask problems for non-advanced users.
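As a rough illustration of that idea only: none of the names below exist in Streamlit, and the environment markers (`SNOWFLAKE_HOST`, the token path) are assumptions about the Snowpark environment.

```python
# Hypothetical sketch of a fallback_to_standard flag; these names do not
# exist in Streamlit, and the environment markers are assumptions.
import os

TOKEN_PATH = "/snowflake/session/token"  # assumed Snowpark token location


def callers_rights_available() -> bool:
    """Heuristic: are the Snowpark-injected env var and token file present?"""
    return os.getenv("SNOWFLAKE_HOST") is not None and os.path.exists(TOKEN_PATH)


def choose_connection_name(fallback_to_standard: bool = False) -> str:
    if callers_rights_available():
        return "snowflake-callers-rights"
    if fallback_to_standard:
        return "snowflake"  # fall back to a standard connection
    raise RuntimeError(
        "Caller's rights environment not detected; deploy to Snowpark "
        "Container Services or opt into fallback_to_standard."
    )
```

An app would pass the result to `st.connection(...)`; keeping `fallback_to_standard` defaulted to `False` keeps the failure loud, as argued above.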
> For local verification, users will simply have to use their personal tokens with caller's rights turned off via an environment variable or similar, then deploy a test app to fully verify.
Does this allow the user to do local verification without code changes to the app?
> I could see us implementing something like a `fallback_to_standard` parameter to the connection params which would (as advertised) fall back to a standard Snowflake connection when environment variables or the token file were missing - but having that be the default would likely mask problems for non-advanced users.
Another option might be to detect this scenario and show an exception with a descriptive message guiding the user on what to do / how to verify this locally. Overall, any friction during the development process is likely to impact the adoption of the feature.
But this can be a follow-up, maybe good to check back with some (potential) users of this feature once it's released, if they would want a fallback for this, or if it's fine as is for their development.
> Does this allow the user to do local verification without code changes to the app?
Yes, if they use a local-only config or environment variable to toggle connection type. It shouldn't be transparent, since transparency-by-default is dangerous and will be perceived as buggy.
I did confirm that there is no way to run with caller's rights outside of the Snowflake ecosystem - so while a user can write an app and debug it locally, they won't actually be able to test caller's rights without deploying to a Snowflake account (the app may work locally but not after it's deployed). Note that this is a limitation of Snowflake, not Streamlit, so it's not a constraint we can really work around.
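For instance, the toggle could be a single environment variable read at startup; the variable name below is invented, not a Streamlit or Snowflake convention.

```python
# Invented env-var toggle so the same app code runs locally and deployed.
# Deployed containers leave SNOWFLAKE_CONNECTION_NAME unset and get caller's
# rights; locally, export SNOWFLAKE_CONNECTION_NAME=snowflake to use
# personal credentials instead.
import os

connection_name = os.getenv("SNOWFLAKE_CONNECTION_NAME", "snowflake-callers-rights")
# In the app: conn = st.connection(connection_name)
```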
lukasmasuch left a comment
LGTM 👍 the docs tweaks can be done as a follow-up
## Describe your changes
Fixes an issue with Snowflake connections not getting re-initialized
after having been closed.
<details>
<summary>Claude issue analysis</summary>
# Issue Report: Snowflake Connection "Connection is closed" Error
## Summary
After recent PRs adding `on_release` to `st.cache_resource` and
session-scoped connection support, users may encounter a
`snowflake.connector.errors.DatabaseError: 250002 (08003): Connection is
closed` error when using `st.connection("snowflake")` with cached data
queries.
## Related PRs
- **PR #13439**: Add `on_release` to `st.cache_resource`
- **PR #13482**: Add session scoping to caches
- **PR #13538**: Add `SnowflakeCallersRightsConnection`
- **PR #13506**: Add session-scoped connection support
## Error Details
```
Traceback (most recent call last):
File ".../streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
result = func()
...
File ".../snowflake/snowpark/_internal/server_connection.py", line 205, in _cursor
self._thread_store.cursor = self._conn.cursor()
File ".../snowflake/connector/connection.py", line 1270, in cursor
Error.errorhandler_wrapper(...)
snowflake.connector.errors.DatabaseError: 250002 (08003): Connection is closed
```
## Root Cause Analysis
### Background
The PRs mentioned added important functionality:
1. **PR #13439** added the `on_release` callback to `st.cache_resource`,
which is called when cache entries are evicted
2. **PR #13506** modified `connection_factory.py` to use this
`on_release` callback to call `connection.close()` when a connection is
evicted from the cache
This is the relevant code in `connection_factory.py`:
```python
def on_release_wrapped(connection: ConnectionClass) -> None:
connection.close()
__create_connection = cache_resource(
max_entries=max_entries,
show_spinner="Running `st.connection(...)`.",
ttl=ttl,
scope=scope,
on_release=on_release_wrapped, # Calls close() when evicted
)(__create_connection)
```
### The Bug
In `BaseSnowflakeConnection.close()`, after calling
`self._raw_instance.close()`, the `_raw_instance` attribute was **NOT**
reset to `None`:
```python
def close(self) -> None:
"""Closes the underlying Snowflake connection."""
if self._raw_instance is not None:
self._raw_instance.close()
# BUG: _raw_instance was NOT set to None!
```
This caused the following issue:
1. When `close()` was called (e.g., via `on_release` when a cache entry
is evicted), the underlying connection was closed
2. However, `_raw_instance` still referenced the **closed** connection
object
3. The `_instance` property checks `if self._raw_instance is None` to
decide whether to create a new connection:
```python
@property
def _instance(self) -> RawConnectionT:
if self._raw_instance is None:
self._raw_instance = self._connect(**self._kwargs)
return self._raw_instance
```
4. Since `_raw_instance` wasn't `None`, subsequent access to `_instance`
returned the **CLOSED** connection
5. Any operations on the closed connection failed with "Connection is
closed"
### When This Bug Manifests
The `on_release` callback (which calls `close()`) is triggered when:
- Cache entries expire due to TTL
- Cache is full and oldest entries are evicted (`max_entries`)
- `st.cache_resource.clear()` is called
- For session-scoped caches: when a session disconnects
For global-scoped connections like `st.connection("snowflake")`, this
typically only happens if:
- `st.cache_resource.clear()` is called explicitly
- TTL is set and expires
- `max_entries` is set and exceeded
### Additional Consideration: Snowpark Sessions
When users call `conn.session()`, they get a Snowpark Session that
internally references `self._instance`. If the underlying connection is
closed:
```python
def session(self) -> Session:
if running_in_sis():
return get_active_session()
return Session.builder.configs({"connection": self._instance}).create()
```
Any Snowpark Sessions created from the connection will also fail because
they hold a reference to the now-closed underlying connection object.
## Fix
The fix is simple: reset `_raw_instance` to `None` after closing the
connection:
```python
def close(self) -> None:
"""Closes the underlying Snowflake connection."""
if self._raw_instance is not None:
self._raw_instance.close()
self._raw_instance = None # Added this line
```
This ensures that after `close()` is called, the next access to
`_instance` will create a new connection instead of returning the closed
one.
## Files Changed
1. **`lib/streamlit/connections/snowflake_connection.py`**
- Fixed `close()` method to reset `_raw_instance = None` after closing
2. **`lib/tests/streamlit/connections/snowflake_connection_test.py`**
- Added `TestSnowflakeConnectionClose` test class with:
- `test_close_resets_raw_instance`: Verifies that `close()` closes the
connection AND resets `_raw_instance`
- `test_close_is_noop_when_not_connected`: Verifies that `close()`
doesn't fail when `_raw_instance` is already `None`
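A hedged sketch of what those two tests could look like against a minimal stand-in class (`StubConnection` is invented here; the real tests exercise `BaseSnowflakeConnection` itself and may differ):

```python
# Sketch of the two tests described above, written against a minimal
# stand-in class rather than the real BaseSnowflakeConnection.
from unittest.mock import MagicMock


class StubConnection:
    """Mimics BaseSnowflakeConnection.close() after the fix."""

    def __init__(self) -> None:
        self._raw_instance = MagicMock()

    def close(self) -> None:
        if self._raw_instance is not None:
            self._raw_instance.close()
            self._raw_instance = None


def test_close_resets_raw_instance() -> None:
    conn = StubConnection()
    raw = conn._raw_instance
    conn.close()
    raw.close.assert_called_once()
    assert conn._raw_instance is None


def test_close_is_noop_when_not_connected() -> None:
    conn = StubConnection()
    conn._raw_instance = None
    conn.close()  # must not raise
```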
## Testing
```bash
PYTHONPATH=lib pytest lib/tests/streamlit/connections/snowflake_connection_test.py::TestSnowflakeConnectionClose -v
```
Output:
```
lib/tests/streamlit/connections/snowflake_connection_test.py::TestSnowflakeConnectionClose::test_close_resets_raw_instance PASSED
lib/tests/streamlit/connections/snowflake_connection_test.py::TestSnowflakeConnectionClose::test_close_is_noop_when_not_connected PASSED
```
## Recommendations for Users
Until this fix is released, users experiencing this issue can:
1. **Avoid storing Snowpark Sessions long-term**: Instead of caching
Snowpark Sessions, create them fresh when needed
2. **Check if using `st.cache_resource.clear()`**: If calling this
anywhere in the app, it will close all cached connections
3. **Consider connection TTL settings**: If TTL is set on the
connection, it may expire and close
## Impact
- **Affected**: Users of `st.connection("snowflake")` and
`st.connection("snowflake-callers-rights")` who experience cache
eviction scenarios
- **Severity**: Medium - The bug causes operations to fail with a
confusing error message, but the workaround (restarting the app or
avoiding cache clears) is available
- **Scope**: Only affects `SnowflakeConnection` and its subclasses;
other connection types (`SQLConnection`, `SnowparkConnection`) inherit
the no-op `close()` from `BaseConnection` and are not affected
</details>
## GitHub Issue Link (if applicable)
## Testing Plan
- Added unit test.
---
**Contribution License Agreement**
By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
## Describe your changes

- Create a new BaseSnowflakeConnection with all existing methods other than `_connect` overridden.
- Make SnowflakeConnection a subclass with the existing `_connect` implementation.
- Add a `close` method to BaseSnowflakeConnection.
- Create SnowflakeCallersRightsConnection with session-scoping and Snowpark-specific connection logic.
- Update BaseSnowflakeConnection docs to include both connection types.
- Register the new SnowflakeCallersRightsConnection as "snowflake-callers-rights".

## Testing Plan

See unit tests.

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.