Skip to content

Conversation

@davidhcoe
Copy link
Contributor

@davidhcoe davidhcoe commented Mar 29, 2025

  • Adds support for users to login with their Entra / Azure AD account
  • Adds a retry concept to the driver that will check whether a token needs to be refreshed and then invoke a delegate so an outside caller can perform the token update. Will only go this path if the user has defined a handler for UpdateToken.
  • Includes long running tests to demonstrate the concept:
    image

@davidhcoe davidhcoe changed the title feat(csharp/src/drivers/BigQuery): Add support for AAD/Entra authentication feat(csharp/src/Drivers/BigQuery): Add support for AAD/Entra authentication Apr 8, 2025
@davidhcoe davidhcoe changed the title feat(csharp/src/Drivers/BigQuery): Add support for AAD/Entra authentication feat(csharp/src/Drivers/BigQuery): Add support for AAD/Entra authentication Apr 8, 2025
@davidhcoe
Copy link
Contributor Author

Updated test results after refactoring the tests:

image

@davidhcoe
Copy link
Contributor Author

davidhcoe commented Apr 14, 2025

This can wait until #2698 closes and then I will incorporate those changes in this PR.

@davidhcoe davidhcoe marked this pull request as ready for review April 20, 2025 15:22
@github-actions github-actions bot added this to the ADBC Libraries 18 milestone Apr 20, 2025
Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! It sure would be nice to define a semantic contract for this behavior as part of the spec, but if/when that happens we can always change the implementation to conform to it.

My biggest concern around this change is a lack of clarity about concurrency issues. The API clearly incorporates a Task (which implies that concurrency is possible) and the implementations frequently use Tasks (if only because of the Google APIs) but there's a lot of shared state and no clear picture for where concurrency might end up happening or steps taken to guard shared data structures against it.

/// </summary>
/// <param name="ex">The exception that occurs.</param>
/// <returns>True/False indicating a refresh is needed.</returns>
public bool TokenRequiresUpdate(Exception ex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is client code expected to use this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like this class is serving as both the user-facing API and an internal-facing API. My vague expectation would be that there be one interface which allows the UpdateToken handler to be set and a different interface which generic credential-handling code uses to implement credential logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this to be an internal only API, but I don't think it can be split into two different APIs just because RetryManager looks at the ITokenProtectedResource to determine if the token needs to be updated then invokes the delegate for the update to occur.

@CurtHagenlocher
Copy link
Contributor

My biggest concern around this change is a lack of clarity about concurrency issues.

I had been specifically concerned about a token expiring while we were downloading chunks in parallel, but it turns out the connector is not doing that.

Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The delay doubling and the lack of retry on CreateReadSession are my two main points of feedback.

}

private StructArray GetTableSchemas(
StructArray GetTableSchemas(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a documented "preferred coding conventions". Much of the other C# code in this overall repo is explicitly using "private", and getting rid of it here has added a bunch of unrelated churn to this PR.


ReadSession rs = new ReadSession { Table = table, DataFormat = DataFormat.Arrow };
ReadSession rrs = readClient.CreateReadSession("projects/" + results.TableReference.ProjectId, rs, maxStreamCount);
ReadSession rrs = clientMgr.ReadClient.CreateReadSession("projects/" + results.TableReference.ProjectId, rs, maxStreamCount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a retry? Given that there's a CreateReadSessionAsync I imagine it must be making a network request.

}

await Task.Delay(delay);
delay *= 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should stop doubling at some point. On the tenth iteration, we're waiting nearly 3.5 minutes. If it finishes around thirty seconds into the eleventh wait, then we still have to wait almost three more minutes before checking again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether the new code delay += delay is supposed to be different than delay *= 2. Did you mean delay += initialDelayMilliseconds? Can I suggest an alternative of something like delay = Math.Min(2 * delay, 5000)?

Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@CurtHagenlocher CurtHagenlocher merged commit 5aa9f1e into apache:main Apr 23, 2025
1 check passed
colin-rogers-dbt pushed a commit to dbt-labs/arrow-adbc that referenced this pull request Jun 10, 2025
…cation (apache#2655)

- Adds support for users to login with their Entra / Azure AD account
- Adds a retry concept to the driver that will check whether a token
needs to be refreshed and then invoke a delegate so an outside caller
can perform the token update. Will only go this path if the user has
defined a handler for UpdateToken.
- Includes long running tests to demonstrate the concept:

![image](https://github.com/user-attachments/assets/0d633848-052d-417e-994b-10138b5e30a7)

---------

Co-authored-by: David Coe <>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants