
fix(sso): direct PKCE token exchange + Redis wiring for multi-instance SSO #22923

Merged

ishaan-jaff merged 62 commits into main from litellm_pkce_sso_fix
Mar 12, 2026
Conversation

@ishaan-jaff
Contributor

ishaan-jaff commented Mar 5, 2026

Relevant issues

Fixes: PKCE (Proof Key for Code Exchange) token exchange failing with Okta and other PKCE-required providers because fastapi-sso drops the code_verifier parameter.

Pre-Submission checklist

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix

Changes

Two fixes for PKCE-based SSO (Okta and other providers):

1. Direct token exchange when PKCE is enabled

fastapi-sso silently drops code_verifier from the token exchange request, causing PKCE-required providers to reject the auth with "invalid_grant". When GENERIC_CLIENT_USE_PKCE=true and a code_verifier is found in cache, the callback now calls the token endpoint directly via httpx — bypassing fastapi-sso.verify_and_process — with the full PKCE parameters.
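The shape of that direct exchange can be sketched as follows. This is an illustrative sketch only: the function name `build_pkce_token_request` and the exact parameter handling are stand-ins, not the LiteLLM implementation.

```python
# Sketch of assembling a direct PKCE token-exchange request, bypassing
# fastapi-sso's verify_and_process so code_verifier is not dropped.
# All names here are illustrative.
from typing import Optional


def build_pkce_token_request(
    code: str,
    code_verifier: str,
    redirect_uri: str,
    client_id: str,
    client_secret: Optional[str] = None,
) -> dict:
    """Assemble the form body for the OAuth token endpoint.

    Per RFC 7636, the code_verifier must accompany the authorization
    code whenever the original auth request carried a code_challenge.
    """
    data = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "code_verifier": code_verifier,  # the field fastapi-sso drops
    }
    if client_secret:  # public clients (no secret) omit this field
        data["client_secret"] = client_secret
    return data


# The actual call would then be something like:
#   async with httpx.AsyncClient() as client:
#       resp = await client.post(token_endpoint, data=build_pkce_token_request(...))
```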

2. Redis-backed verifier storage for multi-pod deployments

Previously, code_verifier was stored in user_api_key_cache (in-memory). In multi-pod deployments, the login and callback can land on different pods, causing a cache miss. Now redis_usage_cache is used when available (falling back to in-memory for single-instance deploys).
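The storage pattern can be sketched like this, with a plain dict standing in for the Redis client. The class name `VerifierStore` and key format are illustrative assumptions, not LiteLLM identifiers; the dict-wrapped value and legacy plain-string fallback mirror the behaviour described in this PR.

```python
# Sketch: prefer a shared Redis-backed cache so the /sso/callback pod can
# read a verifier written by the login pod; fall back to process-local
# memory for single-instance deploys. Names are illustrative.
import secrets
from typing import Optional


class VerifierStore:
    def __init__(self, redis_cache: Optional[dict] = None):
        # redis_cache stands in for a real Redis client; None means
        # single-instance deploy, so fall back to in-process memory.
        self._backend = redis_cache if redis_cache is not None else {}

    def save(self, state: str, ttl_seconds: int = 600) -> str:
        verifier = secrets.token_urlsafe(64)
        # Stored as a dict so the value JSON-serializes cleanly in Redis.
        # ttl_seconds would be forwarded to the real cache's set call;
        # this plain-dict stand-in has no expiry.
        self._backend[f"pkce:{state}"] = {"code_verifier": verifier}
        return verifier

    def load(self, state: str) -> Optional[str]:
        entry = self._backend.get(f"pkce:{state}")
        if isinstance(entry, dict):
            return entry.get("code_verifier")
        if isinstance(entry, str):  # legacy plain-string format
            return entry
        return None
```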

Additional improvements:

  • PKCE_STRICT_CACHE_MISS=true opt-in to fail with HTTP 401 on cross-pod cache misses instead of silently proceeding
  • Startup warning when GENERIC_CLIENT_USE_PKCE=true but Redis is not configured
  • Bearer credentials stripped from error messages to prevent credential leakage in 403 responses
  • 22 new tests covering all PKCE paths including cache miss scenarios, public clients, id_token fallback, null JSON responses, and credential isolation
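The credential-stripping idea from the bullets above can be sketched as a simple regex scrub. The function name and pattern are illustrative assumptions, not the PR's actual implementation.

```python
# Sketch: scrub bearer tokens out of an error string before it is
# surfaced in a 403 response body. Illustrative only.
import re

# Matches "Bearer <token>" where the token uses typical base64url/JWT chars.
_BEARER_RE = re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+")


def scrub_bearer(message: str) -> str:
    """Replace any bearer token in an error message with a redaction marker."""
    return _BEARER_RE.sub("Bearer [REDACTED]", message)
```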

…ulti-instance SSO

When PKCE is enabled, bypass fastapi-sso and perform direct token exchange so
code_verifier is correctly included. Store PKCE verifiers as dict in cache
for proper JSON serialization in Redis. Wire user_api_key_cache to Redis when
available so PKCE verifiers are shared across ECS tasks/pods.

Also adds clearer error messages when PKCE is required but not configured.
@vercel

vercel bot commented Mar 5, 2026

The latest updates on your projects:

Project: litellm | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Mar 7, 2026 1:43am

…ler methods

- Move import httpx/jwt to module level (top of file, not inside function)
- Extract inline PKCE token exchange + userinfo logic into two static methods:
  _pkce_token_exchange() and _get_pkce_userinfo()
- get_generic_sso_response PKCE path is now a single method call
- Fix double-logging in except block for non-PKCE errors
- Use %-style log formatting (no f-strings in log calls)
@greptile-apps
Contributor

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR fixes PKCE token exchange for Okta and other PKCE-required providers by implementing a direct httpx-based token exchange that includes the code_verifier parameter (which fastapi-sso silently drops), and adds Redis-backed storage for PKCE verifiers in multi-pod deployments.

Key implementation:

  • New _pkce_token_exchange() and _get_pkce_userinfo() methods perform direct OAuth token exchange with proper code_verifier inclusion, HTTP 200 error-body detection, and id_token JWT fallback
  • PKCE verifiers stored in redis_usage_cache (with in-memory fallback) for cross-instance access, with explicit three-way merge logic to prevent bearer credential confusion
  • Bearer credentials stripped from error messages to prevent exposure in 403 responses
  • Explicit type guards for token_response dict validation (guards against JSON null responses)
  • 22 new tests covering PKCE paths including Basic Auth, credentials-in-body, HTTP-200-with-error-body, id_token fallback, cache miss scenarios, and credential isolation

Code quality observations:

  • Exception handling is well-structured with proper guards for non-dict token responses (line 2855)
  • Authorization code guard correctly placed inside the PKCE conditional block to preserve non-PKCE error callback behavior
  • Bearer credential precedence correctly enforced via three-way merge logic in _pkce_token_exchange (lines 2924-2933)
  • Try/except scoping properly isolates the POST call from context manager teardown exceptions
  • Verifier deletion intentionally deferred to after downstream processing for defensive record-keeping

Confidence Score: 4/5

  • No critical issues identified. Code implements PKCE token exchange correctly with proper guard clauses and error handling.
  • The implementation is sound: authorization flow correctly gates PKCE-specific logic within the if code_verifier: block, bearer credentials are properly protected through three-way merge logic that ensures token endpoint values take precedence, and defensive type guards prevent common edge cases like JSON null responses. Exception handling and try/except scoping are correct. The 22-test suite provides good coverage of PKCE paths.
  • No files require special attention

Last reviewed commit: 22e103f

@ishaan-jaff
Contributor Author

@greptile review

- Fix access_token missing in PKCE path: read from combined_response directly
  instead of generic_sso.access_token (which is only set by verify_and_process)
- Fix PKCE error hint firing when PKCE is already enabled: only show
  'set GENERIC_CLIENT_USE_PKCE=true' advice when code_verifier was absent
- Fix unguarded KeyError on access_token: check for error field in HTTP 200
  responses before accessing token_response['access_token']
- Fix silent empty userinfo: raise ProxyException when both userinfo endpoint
  and id_token fallback produce no user data
- Fix backward-incompatible Redis wiring: only attach Redis to user_api_key_cache
  when GENERIC_CLIENT_USE_PKCE=true, preserving existing in-memory behaviour
@ishaan-jaff
Contributor Author

@greptile review

- Fix PKCE error hint: check env var directly (not code_verifier presence) to
  distinguish 'PKCE not configured' from 'PKCE enabled but cache miss'
- Fix misleading Redis TTL comment in proxy_server.py
@ishaan-jaff
Contributor Author

@greptile review

- Fix CRITICAL log firing on every non-PKCE callback: only log when PKCE is enabled
- Remove unused pkce_env_value intermediate variable
- Prefer reusing redis_usage_cache over creating separate RedisCache instance
  (avoids losing advanced connection options like SSL, timeouts, db)
@ishaan-jaff
Contributor Author

@greptile review

- Strip OAuth token credentials from response_convertor input to prevent
  access_token/id_token appearing in restricted-group error messages
- Reuse single httpx.AsyncClient for both token exchange and userinfo requests
  to avoid a second TCP/TLS handshake per SSO callback
- Revert Redis wiring to user_api_key_cache: PKCE code already uses
  redis_usage_cache directly; wiring would route all API-key lookups through
  Redis unnecessarily. Add startup warning instead when PKCE+Redis mismatch.
- Move _OAUTH_TOKEN_FIELDS to module level
@ishaan-jaff
Contributor Author

@greptile review

@ishaan-jaff
Contributor Author

@greptile review

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines +2820 to +2827
if response is None:
# Should never happen in practice — construction failure is unexpected.
raise ProxyException(
message="Token endpoint request did not return a response",
type=ProxyErrorTypes.auth_error,
param="token_exchange",
code=status.HTTP_401_UNAUTHORIZED,
)
Contributor

response is initialized to None at line 2804 before the async with block. If httpx.AsyncClient() construction fails (line 2805), the exception propagates immediately without entering the block body, leaving response = None. However, that exception would be caught by the outer try/except at line 788 in get_generic_sso_response, preventing execution from ever reaching the guard at line 2820.

The guard at lines 2820–2827 cannot be reached in practice because:

  1. If AsyncClient() construction succeeds, the async with body is entered
  2. Inside the body, either response is assigned (line 2807) or a ProxyException is raised (line 2818)
  3. If AsyncClient() construction fails, the exception propagates before the guard is reached

The comment misleadingly suggests this guard handles the construction-failure case, but that scenario would cause the exception to propagate to the outer handler, not continue to line 2820. The guard is dead code and the accompanying comment creates a false expectation for future maintainers.

Suggested fix: Remove the unreachable guard, or restructure exception handling to actually reach this point (e.g., catch construction failures at a higher level before delegating to _pkce_token_exchange).

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines +2571 to +2609
if cached_data:
# Extract code_verifier from dict (stored as dict for JSON serialization)
if isinstance(cached_data, dict) and "code_verifier" in cached_data:
code_verifier = cached_data["code_verifier"]
if not code_verifier:
# Dict format is correct but value is empty or null. This is
# a distinct case from an unrecognized format — the entry exists
# but was stored with an empty/null verifier (data integrity issue).
_empty_value_in_dict = True
verbose_proxy_logger.warning(
"PKCE verifier dict for state '%s' has an empty/null code_verifier "
"value — may indicate a storage bug. Treating as a cache miss.",
state,
)
else:
verbose_proxy_logger.debug("PKCE code_verifier retrieved from cache")
elif isinstance(cached_data, str):
# Handle legacy format (plain string) for backward compatibility
code_verifier = cached_data
verbose_proxy_logger.warning(
"Retrieved code_verifier in legacy plain-string format. "
"Future storage will use dict format."
)
else:
# Defer the detailed ERROR log to the strict-mode branch below
# (which includes state and a diagnostic message). Log at DEBUG
# here to avoid duplicate ERROR entries in the same request.
verbose_proxy_logger.debug(
"Unexpected PKCE verifier cache format (type=%s); skipping.",
type(cached_data).__name__,
)

if code_verifier:
# Add code_verifier to token exchange parameters (Redis returns decoded string)
token_params["code_verifier"] = (
code_verifier
if isinstance(code_verifier, str)
else str(code_verifier)
)
verbose_proxy_logger.debug(
"PKCE code_verifier retrieved and will be included in token exchange"
)
# Return the cache key so the caller can delete it *after* a
# successful token exchange (avoids losing the verifier on retry
# if the exchange fails partway through).
token_params["_pkce_cache_key"] = cache_key
Contributor

Stale empty-valued cache entry is never cleaned up in non-strict mode.

When a cached dict contains an empty/null code_verifier value, the code logs a warning indicating a potential storage bug, but the stale cache entry is never deleted. The issue: when code_verifier is falsy, the cache key is not returned to the caller, so _delete_pkce_verifier is never called.

The warning message already documents this as "may indicate a storage bug," but provides no cleanup action. Unlike the strict-mode path which raises an error, the non-strict path allows the request to continue while leaving the corrupted cache entry in memory until its 600-second TTL expires.

Consider implementing best-effort cleanup of the stale entry in the non-strict path after logging the warning, so corrupted or partially-stored verifiers are actively removed rather than left to natural TTL expiry.

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines +3918 to +3944
mock_cache = MagicMock()
mock_cache.async_get_cache = AsyncMock(return_value=12345)

mock_request = MagicMock(spec=Request)
mock_request.query_params = {"state": "bad_format_non_strict"}

# Non-strict mode: should log a warning and continue, not raise.
# Use patch.dict with PKCE_STRICT_CACHE_MISS="false" to avoid permanently
# mutating the test process environment with os.environ.pop().
with caplog.at_level(logging.WARNING), patch(
"litellm.proxy.proxy_server.redis_usage_cache", None
), patch(
"litellm.proxy.proxy_server.user_api_key_cache", mock_cache
), patch.dict(
os.environ,
{"GENERIC_CLIENT_USE_PKCE": "true", "PKCE_STRICT_CACHE_MISS": "false"},
clear=False,
):
result = await SSOAuthenticationHandler.prepare_token_exchange_parameters(
request=mock_request, generic_include_client_id=False
)

# No raise in non-strict mode; verifier simply absent from params
assert "code_verifier" not in result
assert "_pkce_cache_key" not in result
# Cache was queried (the unexpected format was retrieved and logged at WARNING)
mock_cache.async_get_cache.assert_called_once()
Contributor

Missing async mock setup causes silent cleanup failure in test

This test creates a MagicMock for the cache without setting up async_delete_cache as an AsyncMock. When the test runs with unexpected format data (integer 12345), the production code calls _delete_pkce_verifier() which attempts to await the delete method. Since the mock doesn't provide an awaitable, a TypeError is raised and silently caught by the exception handler — meaning the cleanup path never actually executes.

A regression that breaks the cleanup code would go completely undetected by this test.

Recommendation: Configure the mock to properly handle the async cleanup call:

  1. Add mock_cache.async_delete_cache = AsyncMock() after line 3919
  2. Add an assertion after the test to verify cleanup was called with the correct cache key
  3. This matches the pattern already correctly used in test_prepare_token_exchange_parameters around line 3142

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines 3030 to 3041
# all API-key lookups through Redis.
use_pkce = os.getenv("GENERIC_CLIENT_USE_PKCE", "false").lower() == "true"
if use_pkce and redis_usage_cache is None:
verbose_proxy_logger.warning(
"GENERIC_CLIENT_USE_PKCE=true but Redis is not configured for LiteLLM caching. "
"PKCE verifiers will not be shared across instances — callbacks may land on a "
"different pod than the login request and fail silently. "
"Configure Redis via the 'cache' section in your proxy config, "
"or enable sticky sessions for single-instance deployments. "
"Set PKCE_STRICT_CACHE_MISS=true to fail fast with a 401 on cache misses "
"instead of continuing without a code_verifier."
)
Contributor

PKCE warning fires on every config hot-reload, not just startup

load_config() is called on startup and on every config-file hot-reload. This means the warning at lines 3033-3040 will emit repeatedly in long-running deployments that use dynamic config reloads, cluttering logs with repeated "startup" advisories.

Consider guarding with a module-level sentinel so the warning fires only once per process:

_pkce_redis_warning_emitted = False

def load_config(...):
    ...
    global _pkce_redis_warning_emitted
    use_pkce = os.getenv("GENERIC_CLIENT_USE_PKCE", "false").lower() == "true"
    if use_pkce and redis_usage_cache is None and not _pkce_redis_warning_emitted:
        _pkce_redis_warning_emitted = True
        verbose_proxy_logger.warning(...)

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines +785 to +786
# Initialized here so it's visible in the except block for error-hint logic
code_verifier: Optional[str] = None
Contributor

The comment at line 785 states code_verifier is initialized "so it's visible in the except block for error-hint logic," but the except block (lines 879-933) never references this variable. The error-hint logic uses pkce_configured = os.getenv(...) and error_message = str(e) instead. The initialization is harmless but the comment creates a false expectation for future maintainers.

Suggested change
# Initialized here so it's visible in the except block for error-hint logic
code_verifier: Optional[str] = None
# Initialize before try block to ensure it's available for use in the PKCE token
# exchange path (line 795 onwards).
code_verifier: Optional[str] = None

Comment on lines +2906 to +2917
# Merge: userinfo takes precedence for identity claims (sub, email, name, …) per
# the OpenID Connect spec (userinfo is the authoritative source for identity).
# But bearer credentials (access_token, id_token, refresh_token) must always come
# from the token endpoint, not from userinfo (non-standard providers occasionally
# include these fields in userinfo, which would otherwise shadow the real bearer token).
#
# Three-way merge semantics for each bearer-credential field:
# 1. token_response has a non-null value → use it (token endpoint is authoritative)
# 2. token_response explicitly sent null → remove the key so callers get a clean
# absence signal; the null from the token endpoint overrides userinfo too
# 3. field absent from token_response → leave whatever userinfo provided as-is
# (e.g. userinfo-provided id_token from a non-standard provider)
Contributor

The comment at line 2908 states "bearer credentials (access_token, id_token, refresh_token) must always come from the token endpoint, not from userinfo," but Case 3 in the merge loop (lines 2916-2926) explicitly contradicts this by accepting any of these fields from userinfo when absent from token_response.

For example, many providers omit refresh_token from the token response (it's optional). If a non-standard provider includes refresh_token in userinfo, Case 3 would silently accept it into the merged response. A refresh_token field in userinfo is not a standard OIDC field and could allow userinfo providers to inject long-lived credentials.

Update the comment to either:

  1. Explicitly document Case 3 as an intentional fallback for non-standard providers that embed credentials in userinfo (and note the security implications), or
  2. Modify the loop to guard refresh_token by accepting it only from token_response, never from userinfo, regardless of key absence
Suggested change
# Merge: userinfo takes precedence for identity claims (sub, email, name, …) per
# the OpenID Connect spec (userinfo is the authoritative source for identity).
# But bearer credentials (access_token, id_token, refresh_token) must always come
# from the token endpoint, not from userinfo (non-standard providers occasionally
# include these fields in userinfo, which would otherwise shadow the real bearer token).
#
# Three-way merge semantics for each bearer-credential field:
# 1. token_response has a non-null value → use it (token endpoint is authoritative)
# 2. token_response explicitly sent null → remove the key so callers get a clean
# absence signal; the null from the token endpoint overrides userinfo too
# 3. field absent from token_response → leave whatever userinfo provided as-is
# (e.g. userinfo-provided id_token from a non-standard provider)
# Merge: userinfo takes precedence for identity claims (sub, email, name, …) per
# the OpenID Connect spec (userinfo is the authoritative source for identity).
# For bearer credentials, apply a three-way precedence:
# 1. token_response has a non-null value → use it (authoritative)
# 2. token_response explicitly sends null → remove the key
# 3. field absent from token_response → use token_response value only;
# do not accept bearer credentials from userinfo (non-standard providers
# must not be trusted to provide access_token, id_token, or refresh_token)
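The three-way merge semantics as originally described can be sketched as below (case 3 leaves userinfo-provided values in place; the suggested tightening above would drop bearer fields absent from the token response instead). `_OAUTH_TOKEN_FIELDS` and the function name are illustrative, not the exact LiteLLM identifiers.

```python
# Sketch of the three-way merge for bearer-credential fields.
_OAUTH_TOKEN_FIELDS = ("access_token", "id_token", "refresh_token")


def merge_userinfo_and_tokens(token_response: dict, userinfo: dict) -> dict:
    # Identity claims (sub, email, name, ...) come from userinfo.
    merged = dict(userinfo)
    for field in _OAUTH_TOKEN_FIELDS:
        if field in token_response:
            if token_response[field] is not None:
                # Case 1: non-null token-endpoint value is authoritative.
                merged[field] = token_response[field]
            else:
                # Case 2: explicit null from the token endpoint removes
                # the key, overriding any userinfo-provided value.
                merged.pop(field, None)
        # Case 3: field absent from token_response -> whatever userinfo
        # provided (if anything) is left as-is.
    return merged
```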

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines +2617 to +2648
if strict_cache_miss:
# Distinguish empty-value dicts, corrupt-format entries, and genuine
# cache misses so operators can investigate the correct root cause.
if _empty_value_in_dict:
# Dict format was correct but code_verifier was empty/null.
raise ProxyException(
message=(
f"PKCE verifier for state '{state}' was found in cache but "
f"has an empty or null code_verifier value — possible storage bug."
),
type=ProxyErrorTypes.auth_error,
param="PKCE_CACHE_MISS",
code=status.HTTP_401_UNAUTHORIZED,
)
elif cached_data is not None:
# Cache had data but in an unrecognised format (e.g. corrupt Redis value).
verbose_proxy_logger.error(
"PKCE verifier for state '%s' has an unrecognized format (type=%s); "
"treating as a cache miss. Investigate the cached value — it may be "
"a corrupt or stale entry.",
state,
type(cached_data).__name__,
)
raise ProxyException(
message=(
f"PKCE verifier for state '{state}' has an unrecognized format "
f"(type={type(cached_data).__name__}). The cached entry may be corrupt."
),
type=ProxyErrorTypes.auth_error,
param="PKCE_CACHE_MISS",
code=status.HTTP_401_UNAUTHORIZED,
)
Contributor

Corrupt verifier not cleaned up in strict mode

In non-strict mode (line 2682), when cached_data is not None but unusable, _delete_pkce_verifier is called to evict the corrupt entry so it doesn't linger until TTL expiry. The strict-mode branches at lines 2620–2648 raise ProxyException for the same conditions (_empty_value_in_dict and unrecognized format) but skip that cleanup call.

The corrupt entry stays in Redis for up to 600 s. In strict mode operators are already aware something is wrong and may re-examine the cache for diagnosis — a dangling entry makes that harder. The fix is symmetric with the non-strict cleanup:

if _empty_value_in_dict:
    await SSOAuthenticationHandler._delete_pkce_verifier(cache_key)  # best-effort
    raise ProxyException(
        message=...,
        ...
    )
elif cached_data is not None:
    await SSOAuthenticationHandler._delete_pkce_verifier(cache_key)  # best-effort
    verbose_proxy_logger.error(...)
    raise ProxyException(...)

Since _delete_pkce_verifier already swallows its own exceptions, this does not change the raise semantics.

Comment on lines +2800 to +2830
# The try/except is INSIDE the async with so that TLS teardown exceptions
# from __aexit__ propagate as-is and are NOT mis-labelled as "Token endpoint
# request failed". httpx buffers the full response body before __aexit__,
# so status_code / text / json() remain valid after the context exits.
async with httpx.AsyncClient() as http_client:
try:
response = await http_client.post(token_endpoint, **post_kwargs)
except Exception as exc:
# Catch network-level errors (SSL, DNS, TCP, timeout, etc.) and
# wrap them as a clean ProxyException rather than leaking raw
# httpx or OS exceptions to callers.
verbose_proxy_logger.error("PKCE token endpoint unreachable: %s", exc)
raise ProxyException(
message=f"Token endpoint request failed: {exc}",
type=ProxyErrorTypes.auth_error,
param="token_exchange",
code=status.HTTP_401_UNAUTHORIZED,
) from exc

# Response processing outside the async with — httpx buffers the full
# response body so status_code / text / json() remain valid after __aexit__.
if response.status_code != 200:
verbose_proxy_logger.error(
"PKCE token exchange failed. status=%s body=%s",
response.status_code,
response.text[:500],
)
raise ProxyException(
message=f"Token exchange failed: {response.status_code} - {response.text[:500]}",
type=ProxyErrorTypes.auth_error,
param="token_exchange",
Contributor

response consumed outside async with block — comment contradicts actual scope

response is assigned inside the async with httpx.AsyncClient() as http_client: block (line ~2807) but consumed (.status_code, .text, .json()) after the block closes. The inline comment correctly notes that httpx buffers the full response body before __aexit__, so reading these attributes after the client is closed is safe today.

The risk is forward-maintenance: a future refactor to streaming (stream=True) would silently break these post-block accesses. Consider moving the non-200 check, JSON parse, and access_token validation inside the async with block to make the lifetime dependency explicit, or at minimum add a short explanatory comment near the consumption site reinforcing the buffering guarantee:

# response.status_code / .text / .json() are safe here: httpx buffers the full
# body before __aexit__, so these attributes remain valid after the client closes.
if response.status_code != 200:

Comment on lines +3919 to +3940
mock_cache = MagicMock()
mock_cache.async_get_cache = AsyncMock(return_value=12345)
mock_cache.async_delete_cache = AsyncMock()

mock_request = MagicMock(spec=Request)
mock_request.query_params = {"state": "bad_format_non_strict"}

# Non-strict mode: should log a warning and continue, not raise.
# Use patch.dict with PKCE_STRICT_CACHE_MISS="false" to avoid permanently
# mutating the test process environment with os.environ.pop().
with caplog.at_level(logging.WARNING), patch(
"litellm.proxy.proxy_server.redis_usage_cache", None
), patch(
"litellm.proxy.proxy_server.user_api_key_cache", mock_cache
), patch.dict(
os.environ,
{"GENERIC_CLIENT_USE_PKCE": "true", "PKCE_STRICT_CACHE_MISS": "false"},
clear=False,
):
result = await SSOAuthenticationHandler.prepare_token_exchange_parameters(
request=mock_request, generic_include_client_id=False
)
Contributor

async_delete_cache not set up as AsyncMock — cleanup call silently swallowed

mock_cache is a plain MagicMock(). When the non-strict path calls await SSOAuthenticationHandler._delete_pkce_verifier(cache_key), it internally awaits active_cache.async_delete_cache(...). Because mock_cache.async_delete_cache is a regular MagicMock (not an AsyncMock), the await raises a TypeError that is silently swallowed by _delete_pkce_verifier's own except Exception.

The cleanup branch is therefore never actually exercised. A regression that breaks the cleanup dispatch in this path would pass the test silently.

Fix: configure async_delete_cache as an AsyncMock and assert it is called:

mock_cache.async_delete_cache = AsyncMock()
# … run the test …
mock_cache.async_delete_cache.assert_called_once()

This matches the pattern already used correctly in test_prepare_token_exchange_parameters (around line 3142).

@ishaan-jaff
Contributor Author

@greptile review

Comment on lines +3420 to +3430
async def fake_post(*args, **kwargs):
# Verify Basic Auth is set
assert "auth" in kwargs
assert isinstance(kwargs["auth"], httpx.BasicAuth)
# Verify code_verifier is in the POST body (essential PKCE field)
assert kwargs.get("data", {}).get("code_verifier") == "verifier_abc"
# Verify credentials are NOT double-sent in the POST body when using Basic Auth
post_data = kwargs.get("data", {})
assert "client_secret" not in post_data, "client_secret must not appear in POST body when using Basic Auth"
assert "client_id" not in post_data, "client_id must not appear in POST body when using Basic Auth (include_client_id=False)"
return mock_response
Contributor

redirect_uri not asserted in POST body

fake_post thoroughly validates credential placement (auth, code_verifier, absent client_id/client_secret) but does not assert that redirect_uri was forwarded when redirect_url is non-None. Since the production code includes redirect_uri in token_data only via the if redirect_url: guard at line 2780, accidentally removing that guard would silently omit redirect_uri from the POST body — and many OAuth providers reject requests with a redirect_uri_mismatch error in that case.

Add the assertion inside fake_post:

Suggested change
async def fake_post(*args, **kwargs):
# Verify Basic Auth is set
assert "auth" in kwargs
assert isinstance(kwargs["auth"], httpx.BasicAuth)
# Verify code_verifier is in the POST body (essential PKCE field)
assert kwargs.get("data", {}).get("code_verifier") == "verifier_abc"
# Verify credentials are NOT double-sent in the POST body when using Basic Auth
post_data = kwargs.get("data", {})
assert "client_secret" not in post_data, "client_secret must not appear in POST body when using Basic Auth"
assert "client_id" not in post_data, "client_id must not appear in POST body when using Basic Auth (include_client_id=False)"
return mock_response
async def fake_post(*args, **kwargs):
# Verify Basic Auth is set
assert "auth" in kwargs
assert isinstance(kwargs["auth"], httpx.BasicAuth)
# Verify code_verifier is in the POST body (essential PKCE field)
assert kwargs.get("data", {}).get("code_verifier") == "verifier_abc"
# Verify redirect_uri is forwarded to the token endpoint
assert kwargs.get("data", {}).get("redirect_uri") == "https://proxy.example.com/callback", "redirect_uri must be forwarded to the token endpoint"
# Verify credentials are NOT double-sent in the POST body when using Basic Auth
post_data = kwargs.get("data", {})
assert "client_secret" not in post_data, "client_secret must not appear in POST body when using Basic Auth"
assert "client_id" not in post_data, "client_id must not appear in POST body when using Basic Auth (include_client_id=False)"
return mock_response

The same gap exists in test_pkce_token_exchange_credentials_in_body at line 3481 and in test_pkce_token_exchange_public_client_no_secret.

Comment on lines +3674 to +3675
assert "unavailable" in exc_info.value.message.lower() or "no userinfo" in exc_info.value.message.lower() or "userinfo" in exc_info.value.message.lower()
assert str(exc_info.value.code) == "401"
Contributor

Over-permissive message assertion

The message check "unavailable" in ... or "no userinfo" in ... or "userinfo" in exc_info.value.message.lower() will pass for literally any ProxyException whose message contains the word "userinfo", including unrelated errors. The third clause is so broad that it adds no signal.

The actual error raised by _get_pkce_userinfo when userinfo_endpoint is set but the endpoint returns empty/null and no id_token is present is:

"SSO user info unavailable: userinfo endpoint failed and no id_token was present in the token response."

Tighten the assertion to match the actual message that should be raised:

Suggested change
assert "unavailable" in exc_info.value.message.lower() or "no userinfo" in exc_info.value.message.lower() or "userinfo" in exc_info.value.message.lower()
assert str(exc_info.value.code) == "401"
assert "userinfo endpoint failed" in exc_info.value.message.lower() or "unavailable" in exc_info.value.message.lower()
assert str(exc_info.value.code) == "401"

Comment on lines +2982 to +2985
verbose_proxy_logger.warning(
"Userinfo endpoint returned %s, falling back to id_token",
resp.status_code,
)
Contributor

Non-200 userinfo response body omitted from diagnostic log

The warning logs only the HTTP status code when the userinfo endpoint returns a non-200 response. Without the response body it is impossible to distinguish "401 - token expired", "403 - insufficient scope", "429 - rate limited", or "503 - maintenance page" from a log entry alone.

Suggested change
verbose_proxy_logger.warning(
"Userinfo endpoint returned %s, falling back to id_token",
resp.status_code,
)
verbose_proxy_logger.warning(
"Userinfo endpoint returned %s, falling back to id_token. Body: %s",
resp.status_code,
resp.text[:300],
)

user_custom_auth = None
user_custom_key_generate = None
# Sentinel: prevents PKCE-no-Redis advisory from re-logging on config hot-reload
_pkce_no_redis_warning_emitted: bool = False
Contributor

Module-level sentinel persists state across tests in the same process

_pkce_no_redis_warning_emitted is a module-level boolean. Once set to True during any test that exercises load_config() with PKCE enabled and no Redis, it stays True for the remainder of the pytest session. Any later test that asserts the advisory warning fires (e.g. a future regression test for the startup warning) will silently fail to observe the log entry because the sentinel prevents it from being emitted.

Consider resetting the sentinel at the top of test setup, or scope it to the load_config call via a local parameter so tests are fully isolated:

_pkce_no_redis_warning_emitted: bool = False

The simplest fix for test authors is to reset the sentinel in test teardown:

import litellm.proxy.proxy_server as ps
ps._pkce_no_redis_warning_emitted = False

If a test for this warning is added, it should include that teardown step.

@ishaan-jaff
Contributor Author

@greptile review

@ishaan-jaff ishaan-jaff merged commit 7697b1c into main Mar 12, 2026
75 of 102 checks passed