-
Notifications
You must be signed in to change notification settings - Fork 173
feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements #2737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements #2737
Conversation
… results for queries
c076812 to
56cd239
Compare
CurtHagenlocher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! We need to mollify the nullability checker, and I've added a few suggestions for fairly superficial changes.
| { | ||
| // The initial response has result data so we don't need to poll | ||
| metadata = DirectResults.ResultSetMetadata; | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } else { | |
| } | |
| else | |
| { |
|
|
||
| if (!string.IsNullOrEmpty(DirectResults.OperationStatus?.DisplayMessage)) | ||
| { | ||
| throw new HiveServer2Exception(DirectResults.OperationStatus.DisplayMessage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| throw new HiveServer2Exception(DirectResults.OperationStatus.DisplayMessage) | |
| throw new HiveServer2Exception(DirectResults.OperationStatus!.DisplayMessage) |
I think the null-forgiveness operator is required here because the nullability analysis doesn't know about string.IsNullOrEmpty.
| statement.QueryTimeout = QueryTimeoutSeconds; | ||
| } | ||
|
|
||
| public TSparkDirectResults? DirectResults { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public TSparkDirectResults? DirectResults { get; set; } | |
| protected TSparkDirectResults? DirectResults { get; set; } |
| private bool _applySSPWithQueries = false; | ||
| private bool _enableDirectResults = true; | ||
|
|
||
| internal static TSparkGetDirectResults defaultGetDirectResults = new(){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| internal static TSparkGetDirectResults defaultGetDirectResults = new(){ | |
| internal static TSparkGetDirectResults defaultGetDirectResults = new() | |
| { |
| /// Checks if direct results are available. | ||
| /// </summary> | ||
| /// <returns>True if direct results are available and contain result data, false otherwise.</returns> | ||
| public bool HasDirectResults() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public bool HasDirectResults() | |
| public bool HasDirectResults => DirectResults?.ResultSet != null && DirectResults?.ResultSetMetadata != null; |
(This is a good candidate to be a property instead of a method.)
| this.statement = statement; | ||
| this.schema = schema; | ||
| this.isLz4Compressed = isLz4Compressed; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, maybe start a new direct result reader. make sure doing one thing in one class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe not, after a 2nd look at the code
|
@CurtHagenlocher |
Now that #2678 has been checked in there's a merge conflict that will need to be resolved. |
| if (_statement.HasDirectResults && _statement.DirectResults?.ResultSet?.Results?.ResultLinks?.Count > 0) | ||
| { | ||
| // Yield execution so the download queue doesn't get blocked before downloader is started | ||
| await Task.Yield(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this line, there's a deadlock because the _downloadQueue is full, and there's no queue consumer since the downloader hasn't started yet
| return baseHandler; | ||
| } | ||
|
|
||
| protected internal override bool AreResultsAvailableDirectly() => _enableDirectResults; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another good place for a property:
| protected internal override bool AreResultsAvailableDirectly() => _enableDirectResults; | |
| protected internal override bool AreResultsAvailableDirectly => _enableDirectResults; |
Sorry; I missed this in my previous review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is overriding AreResultsAvailableDirectly in HiveServer2Connection/SparkConnection, where it is a function. Should I update the field in both parent classes to be a property as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please!
CurtHagenlocher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again! Sorry; I missed another method that should be a property in my previous review. Looks like we're good after that change.
| get { return _directResults; } | ||
| } | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra blank line
CurtHagenlocher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
…t results for statements (apache#2737) - Add option to set EnableDirectResults, which sends getDirectResults in the Thrift execute statement request - If getDirectResults is set in the request, then directResults is set on the response containing initial results (equivalent to the server calling GetOperationStatus, GetResultSetMetadata, FetchResults, and CloseOperation) - If directResults is set on the response, don't poll for the operation status - We already set getDirectResults on requests for metadata commands, just not the execute statement request Tested E2E using `dotnet test --filter CloudFetchE2ETest` ``` [xUnit.net 00:00:00.11] Starting: Apache.Arrow.Adbc.Tests.Drivers.Databricks [xUnit.net 00:01:27.27] Finished: Apache.Arrow.Adbc.Tests.Drivers.Databricks Apache.Arrow.Adbc.Tests.Drivers.Databricks test net8.0 succeeded (87.7s) Test summary: total: 8, failed: 0, succeeded: 8, skipped: 0, duration: 87.7s Build succeeded in 89.1s ```
Tested E2E using
dotnet test --filter CloudFetchE2ETest