Skip to content

Conversation

@eric-wang-1990
Copy link
Contributor

@eric-wang-1990 eric-wang-1990 commented Nov 15, 2025

Summary

  • Changed token exchange from non-blocking to blocking synchronous operation
  • Fixed concurrent token exchange handling for multiple different tokens
  • Ensures requests use the exchanged Databricks token instead of the original token that would fail authentication
  • Simplified implementation with proper concurrent request handling

Problem

The previous non-blocking implementation had two issues:

  1. First request would fail: The handler would send the first request with the original non-Databricks token (e.g., Azure AD, AWS IAM) while starting token exchange in the background. Since this is mandatory token exchange, the non-Databricks token would fail authentication with Databricks.

  2. Different tokens not handled correctly: When multiple requests came in with different tokens, the second token would not be exchanged properly if it arrived during the first token's exchange. The second request would wait for the first token's exchange, then return without starting its own exchange.

Solution

  • Token exchange now blocks the request until completion using direct await
  • Serialize token exchanges: When a request with a different token arrives during an ongoing exchange, it waits for that exchange to complete, then starts its own exchange
  • Removed Task.Run background thread offloading - exchange happens on the calling thread
  • Removed _pendingTokenTask tracking complexity
  • Removed Dispose method (no longer needed without background tasks)
  • Removed _lastSeenToken check from NeedsTokenExchange to avoid race conditions
  • Falls back to original token if exchange fails, with improved error message
  • Single-token cache prevents unbounded memory growth (only most recent token is cached)

Implementation Details

The key insight is to serialize all token exchanges:

// 1. Wait for any ongoing exchange to complete (could be for a different token)
if (_pendingExchange != null)
{
    await _pendingExchange;
}

// 2. Check if our token was already processed
if (_lastSeenToken == bearerToken)
{
    return;  // Already exchanged
}

// 3. Start exchange for our token
_pendingExchange = DoExchangeAsync(bearerToken, cancellationToken);
await _pendingExchange;

This ensures:

  • Only one exchange runs at a time
  • Each different token gets exchanged
  • Concurrent requests with the same token share the exchange
  • No race conditions

Test plan

  • Test with external token to verify exchange blocks until completion before first request
  • Verify subsequent requests reuse the cached exchanged token
  • Test with different external tokens to verify each is exchanged properly
  • Test concurrent requests with same token verify only one exchange happens
  • Test token exchange failure scenario to confirm fallback behavior
  • All 11 unit tests pass

🤖 Generated with Claude Code

eric-wang-1990 and others added 2 commits November 15, 2025 00:59
…cking

Changed the token exchange from a non-blocking background operation to a blocking synchronous operation. This ensures that requests use the exchanged token rather than the original non-Databricks token, which would fail authentication.

Key changes:
- Removed Task.Run wrapper and background thread offloading
- Token exchange now blocks the request until completion
- Simplified code by removing _pendingTokenTask tracking and Dispose method
- Falls back to original token if exchange fails with clearer error message

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…dling

Fixed issues with the blocking token exchange implementation:
- Handle concurrent requests with different tokens properly
- Serialize token exchanges to prevent race conditions
- Track which token is currently being exchanged with _tokenBeingExchanged
- When a different token arrives during an exchange, wait for completion then start new exchange
- Maintain single-token cache to prevent unbounded memory growth

Key changes:
- Added _tokenBeingExchanged field to track ongoing exchanges
- Updated PerformTokenExchangeIfNeeded to serialize exchanges for different tokens
- Removed _lastSeenToken check from NeedsTokenExchange to avoid race conditions
- Updated test to verify all concurrent requests use exchanged token
- Added System.Linq using for LINQ operations in tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
The field was set but never read for decision-making. The logic works correctly
without it since we serialize exchanges using _pendingExchange and check
_lastSeenToken to determine if a token was already processed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Copy link
Contributor

@alexguo-db alexguo-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. The extend lifetime token exchange can be non-blocking, because the original token is a valid OAuth token. The mandatory token exchange can be non-blocking for AAD tokens issued from Azure Databricks since the original token is valid, but it must be blocking for workload identity federation because the original token is only used for token exchange (and can't access Databricks resources directly).

@jadewang-db
Copy link
Contributor

Discussed offline. The extend lifetime token exchange can be non-blocking, because the original token is a valid OAuth token. The mandatory token exchange can be non-blocking for AAD tokens issued from Azure Databricks since the original token is valid, but it must be blocking for workload identity federation because the original token is only used for token exchange (and can't access Databricks resources directly).

we should make the extend lifetime token exchange non blocking to ensure we are not increasing the query time. also do we have token cache for SSO? we should avoid calling the endpoint for every query/requests.

@eric-wang-1990
Copy link
Contributor Author

Discussed offline. The extend lifetime token exchange can be non-blocking, because the original token is a valid OAuth token. The mandatory token exchange can be non-blocking for AAD tokens issued from Azure Databricks since the original token is valid, but it must be blocking for workload identity federation because the original token is only used for token exchange (and can't access Databricks resources directly).

we should make the extend lifetime token exchange non blocking to ensure we are not increasing the query time. also do we have token cache for SSO? we should avoid calling the endpoint for every query/requests.

We are caching the token. It is not calling the endpoint for every request, for a single httpConnection handler it will only do it for once.

@CurtHagenlocher CurtHagenlocher merged commit 9a1df76 into apache:main Nov 17, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants