Skip to content

Conversation

@toddmeng-db
Copy link
Contributor

@toddmeng-db toddmeng-db commented Apr 30, 2025

Integrates Client Credentials M2M Auth flow with rest of driver.

Layering: DatabricksConnection/SparkConnection -> TProtocal -> TTransport -> OAuthDelegatingHandler -> HttpClient

  • Each DatabricksConnection or SparkConnection instance creates an OAuthClientCredentialsProvider, which manages access token retrieval and expiration.

  • This provider is used by an OAuthDelegatingHandler, a custom HttpMessageHandler that overrides SendAsync to inject a valid OAuth token into each HTTP request header.

  • The OAuthDelegatingHandler is then attached to an HttpClient, which is used by the Thrift TTransport and - TProtocol stack for communication.
    This setup ensures that the Connection's every outgoing Thrift-over-HTTP request includes a valid OAuth token

This should be thread-safe; each connection will have its own OAuthClientCredentialsProvider managing a single non-expired access token. The OAuthClientCredentialsProvider has Sephamore locking logic to prevent race conditions from populating the cache with stale tokens if multiple threads share a connection.

Includes new parameters: requires client_id, client_secret, and auth_flow == 1. If Using OAuth and auth_flow == 1, parses the requires client_id and client_secret parameters.

To test, add to DATABRICKS_TEST_CONFIG_FILE environment variable json

   "uri": "...",
  "auth_type": "OAuth",
  "grant_type": "client_credentials",
  "client_id": "...,
  "client_secret": "dos...",

Follow-up: add tests for token expiration logic

@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch 4 times, most recently from 45f055c to c4a890a Compare May 1, 2025 22:37
@toddmeng-db toddmeng-db changed the title Toddmeng db/client credentials 2 feat(csharp/src/Drivers/Databricks) Integrate OAuthClientCredentialsProvider with Databricks Driver May 1, 2025
@toddmeng-db toddmeng-db changed the title feat(csharp/src/Drivers/Databricks) Integrate OAuthClientCredentialsProvider with Databricks Driver feat(csharp/src/Drivers/Databricks): Integrate OAuthClientCredentialsProvider with Databricks Driver May 1, 2025
@toddmeng-db toddmeng-db marked this pull request as ready for review May 1, 2025 23:04
@github-actions github-actions bot modified the milestone: ADBC Libraries 18 May 1, 2025
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch 3 times, most recently from 0dae349 to 266566b Compare May 1, 2025 23:28
Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I left some questions to think about and some requests for small changes.

Properties.TryGetValue(AdbcOptions.Password, out string? password);
Properties.TryGetValue(SparkParameters.AuthType, out string? authType);
Properties.TryGetValue(SparkParameters.AccessToken, out string? access_token);
Properties.TryGetValue(SparkParameters.AuthFlow, out string? authFlow);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a number like "1", is there a descriptive name this could be given e.g. "client_credentials"? Would it make sense to name this "grant_type" instead of "auth_flow"? Is it plausible that OAuth will someday be standardized enough to be implemented by the generic Spark connector or does it make more sense to move these properties to DatabricksParameters?

Please add these to the Markdown documentation for either the Spark properties or the Databricks properties, as appropriate.

Please include the added values in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll move the values to an enum since it's a bit cleaner. To the second point, I'll move it to DatabricksParameter for now, since it seems like not all Spark deployments will support OAuth natively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved grant_type parsing to DatabricksOAuthGrantTypeParser, which handles all parsing logic in Databricks driver.


case SparkAuthType.OAuth:
if (string.IsNullOrWhiteSpace(access_token))
if (authFlow == "1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether "1" or "client_credentials" or some other kind of value, consider either lifting to a constant or creating an enum for this purpose.

using System.Threading.Tasks;
using Apache.Arrow.Adbc.Drivers.Databricks.Auth;

internal class OAuthDelegatingHandler : DelegatingHandler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put into the Apache.Arrow.Adbc.Drivers.Databricks.Auth namespace.

{
public class DatabricksTestConfiguration : SparkTestConfiguration
{

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove blank line

@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch from 6f1a9c3 to f663471 Compare May 5, 2025 17:57
This reverts commit 57265cd.

Delegating handler

AuthFlow parameters

default -

reimpl

Fixes

comment

parse fix

fixes

remove non-async getAccessToken

use string for authflow

trium trailing whitespace

remove sparkparameter validation in databricks connection

parameter handling cleanup

re-add retry handler
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch 2 times, most recently from 46422fb to 3a9a894 Compare May 5, 2025 19:21
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch 3 times, most recently from d5cdc3f to 3e6a057 Compare May 5, 2025 20:31
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch 2 times, most recently from a8425c2 to e467fb3 Compare May 5, 2025 20:40
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch 12 times, most recently from 13efb8e to caaa1e1 Compare May 5, 2025 21:11
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-credentials-2 branch from caaa1e1 to 2012a3f Compare May 5, 2025 21:19
new KeyValuePair<string, string>("grant_type", "client_credentials"),
new KeyValuePair<string, string>("scope", "all-apis")
new KeyValuePair<string, string>("scope", "sql")
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline to use sql auth-scope

Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@CurtHagenlocher CurtHagenlocher merged commit f9b26cb into apache:main May 6, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants