Add OIDC refresh user/token functionality#12696
Add OIDC refresh user/token functionality#12696velochy wants to merge 1 commit intostreamlit:developfrom
Conversation
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
lib/streamlit/user_info.py
Outdated
| except KeyError: | ||
| raise AttributeError(f'Token "{key}" is not exposed or does not exist.') | ||
|
|
||
| def __setattr__(self, name: str, value: str | None) -> NoReturn: |
There was a problem hiding this comment.
The __setattr__ method parameter value should have type annotation Any instead of str | None. The __setattr__ method can receive values of any type during attribute assignment, so the type annotation should reflect this broader scope rather than restricting it to strings and None.
| def __setattr__(self, name: str, value: str | None) -> NoReturn: | |
| def __setattr__(self, name: str, value: Any) -> NoReturn: |
Spotted by Diamond (based on custom rule: Python Guide)
Is this helpful? React 👍 or 👎 to let us know.
There was a problem hiding this comment.
Pull Request Overview
This PR adds OIDC token refresh functionality to Streamlit, allowing users to refresh their authentication tokens and user information without requiring a full re-login. The implementation includes a new st.user.refresh() command and corresponding backend infrastructure.
Key changes:
- Adds
AuthRefreshHandlerto handle token refresh requests at/auth/refreshendpoint - Implements token storage in separate cookies with configurable token exposure
- Enhances logout functionality to support provider-level logout via OIDC end_session_endpoint
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
lib/streamlit/web/server/oauth_authlib_routes.py |
Core implementation of refresh handler and enhanced logout with provider logout support |
lib/streamlit/user_info.py |
Adds TokensProxy class and refresh() method to user interface |
lib/streamlit/auth_util.py |
Adds configuration parsing for token exposure settings |
lib/streamlit/web/server/server.py |
Registers new refresh endpoint route |
lib/streamlit/web/server/server_util.py |
Adds constant for tokens cookie name |
lib/streamlit/web/server/browser_websocket_handler.py |
Integrates token filtering and exposure in WebSocket handler |
lib/tests/streamlit/web/server/oauth_authlib_routes_test.py |
Comprehensive test coverage for refresh and enhanced logout functionality |
lib/tests/streamlit/user_info_test.py |
Tests for TokensProxy and user refresh method |
lib/tests/streamlit/auth_util_test.py |
Tests for token exposure configuration parsing |
|
|
||
| from __future__ import annotations | ||
|
|
||
| import json |
There was a problem hiding this comment.
Duplicate import of 'json' module. The module is imported again on line 36.
|
|
||
| def test_auth_callback_failure_missing_state(self): | ||
| """Test auth callback failure missing state.""" | ||
| """Test auth callback redirects to base when state is missing (logout redirect).""" |
There was a problem hiding this comment.
The test behavior has changed from returning a 400 error to redirecting with 302. This change should be documented in the test comment to explain why missing state now results in a redirect instead of an error.
| """Test auth callback redirects to base when state is missing (logout redirect).""" | |
| """ | |
| Test auth callback redirects to base when state is missing (logout redirect). | |
| Previously, missing state resulted in a 400 error. Now, the handler treats missing state | |
| as a logout redirect and returns a 302 redirect to the base URL ("/") instead of an error. | |
| This change ensures a smoother user experience when the state parameter is absent. | |
| """ |
| self.set_serialized_cookie(TOKENS_COOKIE_NAME, tokens) | ||
|
|
||
| def set_serialized_cookie(self, cookie_name: str, value: dict[str, Any]) -> None: | ||
| """Set a serialized cookie with a value that is less than 4096 bytes.""" |
There was a problem hiding this comment.
The docstring suggests the method enforces a 4096-byte limit, but the implementation only logs an error without preventing the cookie from being set. The docstring should clarify that this method logs a warning for oversized cookies but still sets them.
| """Set a serialized cookie with a value that is less than 4096 bytes.""" | |
| """ | |
| Set a serialized cookie. Logs an error if the serialized value exceeds 4096 bytes, | |
| but still sets the cookie. Browsers may reject cookies larger than 4096 bytes. | |
| """ |
| self, client: TornadoOAuth2App, id_token: str | ||
| ) -> dict[str, Any]: | ||
| jwks_uri = client.server_metadata.get("jwks_uri") | ||
| jwks = requests.get(jwks_uri, timeout=10).json() |
There was a problem hiding this comment.
Missing error handling for the JWKS URI request. If the request fails or returns invalid JSON, the method will raise an unhandled exception. Consider adding try-catch to provide more informative error messages.
| jwks = requests.get(jwks_uri, timeout=10).json() | |
| try: | |
| response = requests.get(jwks_uri, timeout=10) | |
| response.raise_for_status() | |
| jwks = response.json() | |
| except requests.RequestException as e: | |
| _LOGGER.error("Failed to fetch JWKS URI: %s", e) | |
| raise StreamlitAuthError("Failed to fetch JWKS URI") from e | |
| except ValueError as e: | |
| _LOGGER.error("Invalid JWKS JSON from URI: %s", e) | |
| raise StreamlitAuthError("Invalid JWKS JSON from URI") from e |
| context.user_info.clear() | ||
| session_id = context.session_id | ||
|
|
||
| if runtime.exists(): | ||
| instance = runtime.get_instance() | ||
| instance.clear_user_info_for_session(session_id) | ||
|
|
||
| base_path = config.get_option("server.baseUrlPath") | ||
|
|
||
| fwd_msg = ForwardMsg() | ||
| fwd_msg.auth_redirect.url = make_url_path(base_path, AUTH_REFRESH_ENDPOINT) | ||
| context.enqueue(fwd_msg) |
There was a problem hiding this comment.
The user_info is cleared before checking if the user is logged in or if a refresh token exists. This could result in clearing user information even when refresh fails, leaving the user in an inconsistent state.
| context.user_info.clear() | |
| session_id = context.session_id | |
| if runtime.exists(): | |
| instance = runtime.get_instance() | |
| instance.clear_user_info_for_session(session_id) | |
| base_path = config.get_option("server.baseUrlPath") | |
| fwd_msg = ForwardMsg() | |
| fwd_msg.auth_redirect.url = make_url_path(base_path, AUTH_REFRESH_ENDPOINT) | |
| context.enqueue(fwd_msg) | |
| # Only clear user info and proceed if user is logged in and has a refresh token | |
| user_info = context.user_info | |
| is_logged_in = getattr(user_info, "is_logged_in", False) | |
| has_refresh_token = bool(getattr(user_info, "refresh_token", None)) | |
| if is_logged_in and has_refresh_token: | |
| user_info.clear() | |
| session_id = context.session_id | |
| if runtime.exists(): | |
| instance = runtime.get_instance() | |
| instance.clear_user_info_for_session(session_id) | |
| base_path = config.get_option("server.baseUrlPath") | |
| fwd_msg = ForwardMsg() | |
| fwd_msg.auth_redirect.url = make_url_path(base_path, AUTH_REFRESH_ENDPOINT) | |
| context.enqueue(fwd_msg) |
## Describe your changes Store the OIDC auth response in a cookie and expose them to users for API calls Currently, from user perspective, `st.user` gets an extra field called "tokens" that is a dict of all the tokens that were made available - most often the triplet of acces_token, id_token, refresh_token This is useful for multiple use cases (see the issues below). To achieve (a) reliably, the tokens are stored in a separate cookie from user info. This is because cookies are limited to be <4096 bytes and I ran into that limit in my own testing already if I combined the two dictionaries. Additionally, both cookies can now be chunked so as a by-product, this now supports login systems with larger userdata. I have tested it to work in my own use case and it seems to be performing as intended. But it is just one use case and I may well be missing some others. so any proposals to improve this are very welcome. Originally, also contained st.user_refresh command, but that got it's own PR: #12696 ## GitHub Issue Link (if applicable) Closes #10378 Exposing access_token: #10378 Too large usedata for current login system: #12518 Need to provide id token hint at logout: #12144 (comment) -> PR #12693 Need for user refresh: #12043 -> PR #12696 ## Testing Plan - Unit tests present - Manually tested to work for my own use case --- **Contribution License Agreement** By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license. --------- Co-authored-by: Ken McGrady <[email protected]>
c1d5a7c to
dc3fb47
Compare
dc3fb47 to
a938678
Compare
|
We also would be interested for a solution:
We show two fixed examples here: ` TOKENS_COOKIE_NAME in server_util.py |
| fwd_msg = ForwardMsg() | ||
| fwd_msg.auth_redirect.url = make_url_path(base_path, AUTH_REFRESH_ENDPOINT) | ||
| context.enqueue(fwd_msg) |
There was a problem hiding this comment.
Why we do here an redirect to the AUTH_REFRESH_ENDPOINT and the back to base_url instead of calling OIDC token endpoint directly (AuthRefreshHandler->get) and refresh the user token?
Token refresh can and should be done in the background (without redirects in browser). Therefore refresh tokens are made.
There was a problem hiding this comment.
This is an excellent question @tobka777.
I implemented this right after implementing logout where a redirect was needed (as it is for login). So I built refresh to the same logical pattern without giving it a second thought. Especially considering that in my own use case (where I use it to refresh user info) I would need to call a page refresh anyway to update the user info displayed.
But you are right. For refresh, the redirect seems completely un-necessary. And I agree - if it can be done on the background without a redirect, it should be doable that way.
I'll try to come up with a solution. Thank you for calling me on this :)
There was a problem hiding this comment.
Why we do here an redirect to the AUTH_REFRESH_ENDPOINT and the back to base_url instead of calling OIDC token endpoint directly (AuthRefreshHandler->get) and refresh the user token?
Token refresh can and should be done in the background (without redirects in browser). Therefore refresh tokens are made.
So turns out there is actually a reason to use a redirect, at least according to my AI assistant. Namely - ScriptRunner environment does not allow accessing and changing the cookies. Redirect is one way out of it. The other path seems to be passing a message via ProtoBuf, but that seems to involve creating a new message type and seems to go down into a rabbithole of touching dozens of files. Which seems like unnecessary complexity.
It is, of course, possible my little AI helper is wrong here. So maybe one of the core devs (like @kmcgrady ?) could chime in here. The TL:DR version of the question is: "What is the shortest path to changing auth cookies from user_info.py that does not involve a redirect, and would it be worth it for a token refresh functionality?"
I do not understand what you are talking about. The main branch (called "develop") has TOKENS_COOKIE_NAME exactly where it is needed (in both server_util.py as well as starlette_server_config.py)
I do not understand what you mean here either. Code works perfectly well as things currently stand. My feeling is you had AI compare code against the wrong branch or something else odd. |
Indeed we had an old version locally running where it was not implemented. |
| return None | ||
|
|
||
| def decode_id_token( | ||
| self, client: TornadoOAuth2App, id_token: str |
There was a problem hiding this comment.
We would propose the following change:
def decode_id_token(self, client: TornadoOAuth2App, id_token: str) -> dict[str, Any]:
jwks_uri = client.server_metadata.get("jwks_uri")
jwk_client = PyJWKClient(uri=jwks_uri)
signing_key = jwk_client.get_signing_key_from_jwt(token=id_token)
return jwt.decode(
id_token,
key=signing_key.key,
algorithms=[signing_key.algorithm_name],
audience=client.client_id,
)
Reason:
the current function returns an decoding error on some jwks endpoint due to missing arguments such as algorithms and audience
There was a problem hiding this comment.
from jwt import PyJWKClient
This is the necessary import which need to be done in the beginning
|
Adding a real-world use case: We're running a Streamlit app (v1.53+) with GitLab (self-hosted) as our OIDC provider and using The problem we're hitting is exactly what this issue describes: GitLab access tokens expire after 2 hours, but Streamlit's identity cookie persists for 30 days. After the token expires, We tried adding tokens = {k: token[k] for k in ["id_token", "access_token"] if k in token}So the We see that PR #12696 adds Looking forward to seeing this land -- it would be a big improvement for any app using short-lived access tokens with OIDC providers like GitLab, GitHub, or Okta. |
|
Also very interested to finally see land here... |
|
This pull request has had no activity for 14 days, so it has been marked as stale. If you still want to continue this work, please leave a comment or push a commit within 7 days. A maintainer can also apply the |
|
Still interested in doing this, but need maintainer input. |
|
We would also greatly appreciate that for exactly the same reasons @janbernloehr mentioned. In our case, customers have to log in to MSFT Entra again after a certain amount of time, which could lead to data loss. Therefore, it would be great if the |
|
We implemented a PR #14437 based on this PR and now its works as intended for us: Calling st.user.refresh() smoothly updates st.user and the tokens without reloading the application. |
Describe your changes
Add a command (
st.user.refresh()) to use the refresh token to refresh both the tokens and the user infoThis was separated out of #12044 at the request of maintainers.
Needs a product decision on where to place the command - either top level st.refresh_user() or st.user.refresh() (or some third option).
GitHub Issue Link (if applicable)
#12043
Testing Plan
Contribution License Agreement
By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.