feat(csharp/src/Drivers/Databricks): Integrate OAuthClientCredentialsProvider with Databricks Driver #2762
Conversation
CurtHagenlocher
left a comment
Thanks! I left some questions to think about and some requests for small changes.
```csharp
Properties.TryGetValue(AdbcOptions.Password, out string? password);
Properties.TryGetValue(SparkParameters.AuthType, out string? authType);
Properties.TryGetValue(SparkParameters.AccessToken, out string? access_token);
Properties.TryGetValue(SparkParameters.AuthFlow, out string? authFlow);
```
Instead of a number like "1", is there a descriptive name this could be given e.g. "client_credentials"? Would it make sense to name this "grant_type" instead of "auth_flow"? Is it plausible that OAuth will someday be standardized enough to be implemented by the generic Spark connector or does it make more sense to move these properties to DatabricksParameters?
Please add these to the Markdown documentation for either the Spark properties or the Databricks properties, as appropriate.
Please include the added values in
Thanks, I'll move the values to an enum since it's a bit cleaner. On the second point, I'll move these to DatabricksParameters for now, since it seems that not all Spark deployments will support OAuth natively.
Moved grant_type parsing to DatabricksOAuthGrantTypeParser, which handles all of the parsing logic in the Databricks driver.
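To make the suggestion concrete, a parser along these lines would accept both a descriptive name and the legacy numeric value. This is a hedged sketch only; the enum name, values, and method shape of the actual DatabricksOAuthGrantTypeParser in the PR may differ:

```csharp
// Illustrative sketch, not the actual PR implementation.
internal enum DatabricksOAuthGrantType
{
    AccessToken,        // pre-generated OAuth access token
    ClientCredentials,  // M2M client-credentials flow
}

internal static class DatabricksOAuthGrantTypeParser
{
    // Accepts either the descriptive name or the legacy numeric value ("1").
    public static bool TryParse(string? value, out DatabricksOAuthGrantType grantType)
    {
        switch (value)
        {
            case "client_credentials":
            case "1":
                grantType = DatabricksOAuthGrantType.ClientCredentials;
                return true;
            case null:
            case "":
                grantType = DatabricksOAuthGrantType.AccessToken;
                return true;
            default:
                grantType = default;
                return false;
        }
    }
}
```

Mapping both spellings to one enum value lets the connection code switch on the enum while keeping the connection string backward compatible.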
```csharp
case SparkAuthType.OAuth:
    if (string.IsNullOrWhiteSpace(access_token))
    if (authFlow == "1")
```
Whether "1" or "client_credentials" or some other kind of value, consider either lifting to a constant or creating an enum for this purpose.
```csharp
using System.Threading.Tasks;
using Apache.Arrow.Adbc.Drivers.Databricks.Auth;

internal class OAuthDelegatingHandler : DelegatingHandler
```
Please put into the Apache.Arrow.Adbc.Drivers.Databricks.Auth namespace.
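For readers unfamiliar with the pattern, a minimal sketch of such a handler is below. The token-provider interface is a stand-in for illustration; the real OAuthClientCredentialsProvider API in the PR may differ:

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

namespace Apache.Arrow.Adbc.Drivers.Databricks.Auth
{
    // Stand-in for the real OAuthClientCredentialsProvider; its API may differ.
    internal interface ITokenProvider
    {
        Task<string> GetAccessTokenAsync(CancellationToken cancellationToken);
    }

    // Sketch: inject a valid Bearer token into every outgoing HTTP request.
    internal class OAuthDelegatingHandler : DelegatingHandler
    {
        private readonly ITokenProvider _tokenProvider;

        public OAuthDelegatingHandler(HttpMessageHandler innerHandler, ITokenProvider tokenProvider)
            : base(innerHandler)
        {
            _tokenProvider = tokenProvider;
        }

        protected override async Task<HttpResponseMessage> SendAsync(
            HttpRequestMessage request, CancellationToken cancellationToken)
        {
            // Ask the provider for a cached-or-refreshed token, then attach it.
            string token = await _tokenProvider.GetAccessTokenAsync(cancellationToken);
            request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
            return await base.SendAsync(request, cancellationToken);
        }
    }
}
```

Because the handler wraps the inner `HttpMessageHandler`, every request made through the `HttpClient` it is attached to picks up the header with no changes to the Thrift transport code.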
```csharp
{
    public class DatabricksTestConfiguration : SparkTestConfiguration
    {

```
nit: remove blank line
Squashed commit messages:
- This reverts commit 57265cd.
- Delegating handler
- AuthFlow parameters default - reimpl
- Fixes comment
- parse fix
- fixes
- remove non-async getAccessToken
- use string for authflow
- trim trailing whitespace
- remove sparkparameter validation in databricks connection parameter handling
- cleanup
- re-add retry handler
```diff
  new KeyValuePair<string, string>("grant_type", "client_credentials"),
- new KeyValuePair<string, string>("scope", "all-apis")
+ new KeyValuePair<string, string>("scope", "sql")
  });
```
Discussed offline to use sql auth-scope
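In context, the request body being changed above might be built like this. This is a sketch assuming the provider uses `FormUrlEncodedContent` as the diff fragment suggests; the surrounding method and names are assumptions:

```csharp
using System.Collections.Generic;
using System.Net.Http;

// Sketch: build the client-credentials token request body, using the
// narrower "sql" scope agreed on in the review (rather than "all-apis").
static HttpContent BuildTokenRequestBody() =>
    new FormUrlEncodedContent(new[]
    {
        new KeyValuePair<string, string>("grant_type", "client_credentials"),
        new KeyValuePair<string, string>("scope", "sql"),
    });
```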
CurtHagenlocher
left a comment
Thanks!
Integrates the Client Credentials M2M auth flow with the rest of the driver.

Layering: `DatabricksConnection`/`SparkConnection` -> `TProtocol` -> `TTransport` -> `OAuthDelegatingHandler` -> `HttpClient`

Each `DatabricksConnection` or `SparkConnection` instance creates an `OAuthClientCredentialsProvider`, which manages access token retrieval and expiration.

This provider is used by an `OAuthDelegatingHandler`, a custom `HttpMessageHandler` that overrides `SendAsync` to inject a valid OAuth token into each HTTP request header.

The `OAuthDelegatingHandler` is then attached to an `HttpClient`, which is used by the Thrift `TTransport` and `TProtocol` stack for communication. This setup ensures that every outgoing Thrift-over-HTTP request from the connection includes a valid OAuth token.

This should be thread-safe: each connection has its own `OAuthClientCredentialsProvider` managing a single non-expired access token, and the `OAuthClientCredentialsProvider` uses semaphore locking to prevent race conditions from populating the cache with stale tokens when multiple threads share a connection.

New parameters: when using OAuth with `auth_flow == 1` (client credentials), the driver requires and parses the `client_id` and `client_secret` parameters.
To test, add the new values to the JSON config file referenced by the `DATABRICKS_TEST_CONFIG_FILE` environment variable.
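A hypothetical shape for that config entry is shown below. The key names here are illustrative guesses only, not the actual schema; check `DatabricksTestConfiguration` for the real property names:

```json
{
  "uri": "<databricks-workspace-endpoint>",
  "auth_type": "oauth",
  "auth_flow": "1",
  "client_id": "<service-principal-client-id>",
  "client_secret": "<service-principal-secret>"
}
```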
Follow-up: add tests for the token expiration logic.