Cache SqlConnection.query results per-instance.#14094
Cache SqlConnection.query results per-instance.#14094sfc-gh-jkinkead merged 3 commits intodevelopfrom
Conversation
Include an instance ID parameter in the cache key for SqlConnection.query's internal cache. This prevents cache leakage between two different database connections running the same query. This may result in some new cache misses. Note however That since `connection_factory` / `st.connection` cache resources globally, this will only result in extra cache misses in cases where users are constructing SqlConnections manually and not through `st.connection`.
✅ PR preview is ready!
|
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
There was a problem hiding this comment.
Pull request overview
Fixes SQLConnection.query() cache leakage across multiple SQLConnection instances that share the same connection name but differ in connection parameters, by ensuring the internal cached helper is keyed per connection instance.
Changes:
- Add an instance-specific parameter to the internal cached
_queryfunction and pass an instance identifier fromSQLConnection.query(). - Update the SQL connection unit test to verify caches are scoped per
SQLConnectioninstance (not just by connection name).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lib/streamlit/connections/sql_connection.py | Adds a per-instance identifier into the cached _query call to prevent cross-instance cache collisions. |
| lib/tests/streamlit/connections/sql_connection_test.py | Adjusts the cache-scoping test to cover multiple instances sharing the same connection name. |
Consolidated Code ReviewSummaryThis PR fixes a cache-leakage bug in Files changed (2):
Code QualityBoth reviewers agree the implementation is clean, minimal, and follows existing patterns in the codebase. The Area of disagreement —
My assessment: I agree with opus-4.6-thinking that this is non-blocking. Here is the reasoning:
Consistency note (agreed by both reviewers): Test CoverageThe updated test
Incremental assertions after each Nice-to-have improvement (agreed by both): Add a final assertion confirming No E2E test is needed — this is an internal caching behavior with no frontend component. Backwards CompatibilityNo API break is introduced. As the PR description notes, manually constructed Security & RiskThis PR closes a security-relevant issue — cache leakage between different database connections could expose data from one database to queries intended for another. The fix correctly isolates cache entries per connection instance. No new security concerns are introduced. AccessibilityNo frontend/UI changes — not applicable. Recommendations
Reviewer Agreement Summary
VerdictAPPROVED — Clean, minimal fix for a real cache-leakage bug with appropriate test coverage. The Consolidated review by 📋 Review by `gpt-5.3-codex-high`SummaryThis PR updates Code QualityThe implementation is small, readable, and follows existing patterns in Issue found:
Test CoverageThe updated unit test ( Coverage gap:
Backwards CompatibilityNo API break is introduced. Behavior changes for users manually constructing Security & RiskMain risk is unresolved cache leakage due to recycled object IDs ( AccessibilityNo frontend/UI changes in this PR, so accessibility impact is not applicable. Recommendations
VerdictCHANGES REQUESTED: The fix direction is good, but using This is an automated AI review by 📋 Review by `opus-4.6-thinking`SummaryThis PR fixes a cache-leakage bug in Changes are limited to two files:
Code QualityThe implementation is clean and minimal. The approach of using a dummy parameter to influence the
One minor concern: Consistency note: Test CoverageThe updated test
The incremental assertions after each The test correctly verifies the core fix. One additional scenario that could strengthen coverage is verifying that the same instance still gets cache hits even after a different instance causes a miss (e.g., calling No E2E test is included. Given this is an internal caching behavior with no frontend component, unit test coverage alone is adequate. Backwards CompatibilityAs the PR description correctly notes, this change may introduce new cache misses for users who manually construct For users of This is an acceptable trade-off: cache misses (safe but slower) are far preferable to cache leakage (returning data from a wrong database connection). Security & RiskThis PR closes a security-relevant issue — cache leakage between different database connections could expose data from one database to queries intended for another. The fix correctly isolates cache entries per connection instance. No new security concerns are introduced. The AccessibilityNo frontend changes — not applicable. Recommendations
VerdictAPPROVED: Clean, minimal fix for a real cache-leakage bug with appropriate test coverage. The approach is sound, backwards compatible for the standard usage path, and consistent with the existing codebase patterns. This is an automated AI review by |
Use a UUID to make memory address edge-case not a problem.
|
GitHub correctly noticed that SnowflakeConnection overloads |
| def _query(sql: str, params: Any = None) -> DataFrame: | ||
| def _query( | ||
| # Dummy parameter to retain per-instance caching. | ||
| instance_id: int, # noqa: ARG001 |
There was a problem hiding this comment.
Type mismatch: instance_id is typed as int, but self._connection_instance_id is a uuid.UUID object (created via uuid.uuid4() in base_connection.py line 79). This will cause type checker failures and is misleading.
# Fix: Change type annotation to match actual type
instance_id: uuid.UUID, # noqa: ARG001Alternatively, convert the UUID to int when passing:
return _query(id(self._connection_instance_id), sql, params)| instance_id: int, # noqa: ARG001 | |
| instance_id: uuid.UUID, # noqa: ARG001 |
Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
|
It's not used in SiS, and it doesn't do anything that |
Good to know 👍 I will add something to make this deprecation a bit more official: #14125 |
## Describe your changes Deprecates `SnowparkConnection` in favor of `SnowflakeConnection`: - Adds `show_deprecation_warning` call in `__init__` with `show_once=True` - Adds `.. warning::` directive to the class docstring pointing users to `SnowflakeConnection` - Adds unit test to verify the deprecation warning is shown Context: #14094 (comment) ## Testing Plan - [x] Unit test (`SnowparkConnectionDeprecationTest`)
Describe your changes
Include an instance ID parameter in the cache key for SqlConnection.query's internal cache. This prevents cache leakage between two different database connections running the same query.
This may result in some new cache misses. Note however that since
connection_factory/st.connectioncache resources globally, this will only result in extra cache misses in cases where users are constructing SqlConnections manually and not throughst.connection.GitHub Issue Link (if applicable)
Testing Plan
See PR.
Contribution License Agreement
By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.