ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL#12254
ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL#12254rafael-telles wants to merge 1499 commits intoapache:mainfrom
Conversation
|
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename pull request title in the following format? or See also: |
|
|
lidavidm
left a comment
There was a problem hiding this comment.
Thanks. I've done an initial pass but haven't looked at things too deeply.
Does having benchmarks make sense? (Maybe for particular components like accessor/transformer?)
java/pom.xml
Outdated
| <artifactId>spotbugs-maven-plugin</artifactId> | ||
| <version>4.2.3</version> | ||
| </plugin> | ||
|
|
There was a problem hiding this comment.
Is it possible to keep these plugins/dependencies in the driver's pom.xml? (Though it would be nice to enable static analyzers project-wide, that should be done separately.)
| @@ -0,0 +1,295 @@ | |||
| /* | |||
There was a problem hiding this comment.
I'm going to assume any changes under flight-sql are from the other PRs…let me know if that's not the case.
There was a problem hiding this comment.
It's from the ColumnMetadata PR, we needed it to continue implementing some missing stuff like proper ResultSetMetaData for PreparedStatement metadata, and other metadata for ResultSet#getColumns.
| @@ -0,0 +1,29 @@ | |||
| -----BEGIN CERTIFICATE----- | |||
There was a problem hiding this comment.
We have certificates in /testing/data/flight, could those be used? (I guess we need to make a keystore for them.)
There was a problem hiding this comment.
We could use them, we just need to get it in flight-jdbc-driver, and see how we can make a keystore for them.
| return new ArrowFlightJdbcNullVectorAccessor(setCursorWasNull); | ||
| } | ||
|
|
||
| throw new UnsupportedOperationException(); |
There was a problem hiding this comment.
Maybe include the type in the exception message?
There was a problem hiding this comment.
This doesn't seem to have changed
| } else if (vector instanceof DenseUnionVector) { | ||
| return new ArrowFlightJdbcDenseUnionVectorAccessor((DenseUnionVector) vector, getCurrentRow, | ||
| setCursorWasNull); | ||
| } else if (vector instanceof NullVector || vector == null) { |
There was a problem hiding this comment.
It could occur if the Flight SQL Server Endpoint defines a column with it. It could represent java.sql.Types.NULL.
There was a problem hiding this comment.
I mean specifically null, not NullVector, sorry
| /** | ||
| * Auxiliary wrapper class for {@link Connection}, used on {@link ArrowFlightJdbcPooledConnection}. | ||
| */ | ||
| public class ConnectionWrapper implements Connection { |
There was a problem hiding this comment.
Since this appears to be a pure proxy, it might be worth noting that fact (aka this is purely to serve as a base class and factor out the boilerplate elsewhere)
There was a problem hiding this comment.
Do you want us to document that this is to remove boilerplate?
| * | ||
| * @return the exception. | ||
| */ | ||
| public static UnsupportedOperationException getOperationNotSupported(final Class<?> type) { |
There was a problem hiding this comment.
It seems this is used purely in the base Accessor class, maybe it could be a private static method of that class?
| * @deprecated not updatable to match dinamic server allocation. | ||
| */ | ||
| @Deprecated | ||
| public enum UrlSample { |
There was a problem hiding this comment.
Why are we adding a new deprecated class?
There was a problem hiding this comment.
We're cleaning up deprecated classes/methods.
| * @see FlightServerTestRule | ||
| * @deprecated not updatable to match dinamic server allocation. | ||
| */ | ||
| @Deprecated |
There was a problem hiding this comment.
Ditto here, why add a class only to immediately deprecate it? (I guess things are in the middle of a refactoring here?)
There was a problem hiding this comment.
Yes! This is still work in progress and we will have to clean up
| try { | ||
| queue.next(); | ||
| } catch (final IllegalStateException e) { | ||
| expectedExceptionOnNextIfClosed = Optional.of(e); |
There was a problem hiding this comment.
It seems other parts of the codebase use assertThrows for this type of check
|
Thank you so much @lidavidm ! We will be working on your feedback soon |
|
I just wanted to echo my comment from the mailing list that I think it would be best if we held a code donation vote and IP clearance for this piece of work since it is substantial and was developed outside of the Arrow community. I don't expect there will be any issues but it is good due diligence. As a reminder we welcome the development of large new projects in feature branches, where the development process otherwise is conducted according to the Apache Way (public discussions, issues, PRs, etc.) I believe this will involve a Software Grant from Dremio (correct me if I'm wrong) for the work and ICLAs for the individual contributors. |
|
Thanks for chiming in Wes - I'll look into the process and follow up on the mailing list @rafael-telles. |
|
BTW, please let us know when you think this is ready for a full review |
| * @return a {@code FlightStream} of results. | ||
| */ | ||
| public List<FlightStream> getStreams(final FlightInfo flightInfo) { | ||
| return flightInfo.getEndpoints().stream() |
There was a problem hiding this comment.
This isn't correct. This will try to get every ticket using the current client. However each endpoint is a pair <Ticket, List>.
What we actually need to do is for each endpoint, create a FlightClient at each location and call getStream() with the associated ticket. See:
Note that if the list of locations is empty, it means the data is only available at the current location:
Can optimize this to some extent if the current location is used once, use the current client, though that limits the driver to only running one Statement at a time.
We should consider pooling FlightClients. I think Apache Pools is a good choice here, and use KeyedObjectPool to model the problem: https://commons.apache.org/proper/commons-pool/apidocs/org/apache/commons/pool2/KeyedObjectPool.html
|
This can be rebased against the accepted proposals now. |
…tor the code to use it
…d refactor tests to use it
…ods to use this interface
Implement toString method for JdbcArray class
Fix struct type on jdbc
* Add toString to Time obj in Time#toString * Improve Time toString * Fix maven plugins * Revert "Update java/flight/flight-jdbc-driver/src/test/java/org/apache/arrow/driver/jdbc/accessor/impl/calendar/ArrowFlightJdbcTimeStampVectorAccessorTest.java" This reverts commit 00808c0. * Revert "Merge pull request #29 from rafael-telles/Timestamp_fix" This reverts commit 7924e7b, reversing changes made to f6ac593. * Fix DateTime for negative epoch * Remove unwanted change * Fix negative timestamp shift * Fix coverage * Refator DateTimeUtilsTest
…ormat Change the behavior to return the correct format for day and year intervals.
Fix query exception message
…etadata-type-name [Java] [JDBC] Flight jdbc driver create metadata type name.
|
@rafael-telles It seems this needs rebasing on git master and then fixing conflicts, would you like to do that? |
Fix attributes as headers bug and token value parsing error
Implement case insensitive for the property keys
|
We should probably close this PR in favor of a new PR based on the cleared code |
|
Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍 |
This implements a JDBC driver able to communicate to Flight SQL sources.
So far this covers:
Yet to be done: