Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Nov 25, 2022

This is the more bold fix to common.sql and related providers. It implements more comprehensive tests and describes the intended behaviours in much more explicit way, also it simplifies the notion of "scalar" behaviour by introducing a new method to check on when scalar return is expected.

Comprehensive suite of tests have been added (the original tests were converted to Pytest test and parameterized was used to test all the combinations of parameters and returned values.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Nov 25, 2022

cc: @alexott

@potiuk potiuk force-pushed the fix-sql-behaviours branch from c60d7f4 to 9f43079 Compare November 25, 2022 11:19
@potiuk
Copy link
Member Author

potiuk commented Nov 25, 2022

cc: @alexott @kazanzhy - I have two important doubts about processing of ; (and sql stripping in general).

  1. (Less important if it works) Stripping ";"

I followed original implementations I think and I left I think intact original behviour of stripping in both implementationations.

Look at the parameterized tests - and the "DBApiHook" behaves now differently than Databricks Hook in this regard - the Databricks hook always splits the ;

https://github.com/apache/airflow/pull/27912/files#diff-db2c1495ba766a1bad71474f36caeb3e8b4228522571e3229c1c64cede967027R99

Where the standard DBApiHook does not:

https://github.com/apache/airflow/pull/27912/files#diff-718d368660d07715b383dd4d94344bc41b835189da39bb95fcecaa3a423e6e47R110

  1. The behaviour of cursor and connection opening/closing is different in both cases:
  1. The comiit strategy is different between the two (and does not work in Databricks IMHO) after last statement in Databricks Hook. The DBApi hook performs commit after all statements when autocommit is off https://github.com/apache/airflow/pull/27912/files#diff-6e1b2f961cb951d05d66d2d814ef5f6d8f8bf8f43c40fb5d40e27a031fed8dd7R307 but the DAtabricks one has very weird behaviour - when autocommit is off, I think each statement result will be rolled back automatically - so it basically makes no sense to set autocommit to off, because there is no way to manually commit (COMMIT is a separate statement and previous statement is rolled back when it is executed).

https://peps.python.org/pep-0249/#Connection.close

Note that closing a connection without committing the changes first will cause an implicit rollback to be performed.

I am no sure what was the reasoning behind those differences - but from what I see those differences now between those two "run" methods. Question is - whether we have good reasons to behave differently for those cases?

@potiuk potiuk force-pushed the fix-sql-behaviours branch from 033ea0e to 71c4e13 Compare November 25, 2022 12:43
@potiuk potiuk force-pushed the fix-sql-behaviours branch 3 times, most recently from 25e215b to 6f39bb6 Compare November 25, 2022 16:00
@potiuk
Copy link
Member Author

potiuk commented Nov 25, 2022

OK. There was some permission issue - but I think this one shoudl be ready to go. I added some descriptions and explanations about the behaviours and 🤞 it will pass all the docs and tests

@alexott
Copy link
Contributor

alexott commented Nov 25, 2022

Let me do a checkout in my environment and test.

@potiuk potiuk force-pushed the fix-sql-behaviours branch from 8bf2f52 to 8068404 Compare November 25, 2022 18:48
Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much work on that PR! Nothing to say on my side, very solid! Thanks for adding all these tests (on top of the other changes), it adds a lot of robustness!

@potiuk
Copy link
Member Author

potiuk commented Nov 25, 2022

So much work on that PR! Nothing to say on my side, very solid! Thanks for adding all these tests (on top of the other changes), it adds a lot of robustness!

Indeed. I added already few small changes/clarifications after the initial push and adding all tests and it felt soooooooo reassuring to see green checkmark for all tests after my changes.

@potiuk
Copy link
Member Author

potiuk commented Nov 25, 2022

BTW. I REALLY ❤️ Pytest style of tests vs. the "classic" unit.TestCase.

Partucularrly fixtures - when you get a grasp of them, parameterization and the integration of those running tests within IDEs are just great.

@potiuk
Copy link
Member Author

potiuk commented Nov 25, 2022

All looks good :)

@potiuk potiuk requested review from alexott, kaxil and vincbeck November 26, 2022 00:39
This is the more bold fix to common.sql and related providers.
It implements more comprehensive tests and describes the intended
behaviours in much more explicit way, also it simplifies the notion
of "scalar" behaviour by introducing a new method to check on when
scalar return is expected.

Comprehensive suite of tests have been added (the original tests
were converted to Pytest test and parameterized was used to
test all the combinations of parameters and returned values.
@potiuk potiuk force-pushed the fix-sql-behaviours branch from 8068404 to 7bae5c4 Compare November 26, 2022 00:42
@dstandish
Copy link
Contributor

It's holiday in us and I'm traveling, won't be able to review until next week

@potiuk potiuk merged commit db5375b into apache:main Nov 26, 2022
@potiuk potiuk deleted the fix-sql-behaviours branch November 26, 2022 09:26
Copy link
Contributor

@alexott alexott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you again!

can also modify the returned (and stored in XCom) values in the operators that derive from the
``SQLExecuteQueryOperator``).
* last description of the cursor whether to return scalar values are now stored in DBApiHook
* descriptions of all returned results are stored as descriptions property in the DBApiHook
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(could be done later) maybe it would be a bit easier to read if descriptions and last_description are formatted as code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. We can update it later :)

:param sql: the SQL code or string pointing to a template file to be executed (templated).
File must have a '.sql' extensions.
When implementing a specific Operator, you can also implement `_process_output` method in the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use double backquotes in docstrings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely :). I think it will work anyway in docstring (we'll see in the API docs) but yeah - we can fix it as follow-up

@potiuk
Copy link
Member Author

potiuk commented Nov 26, 2022

BTW. Release of rc3 is coming .

@BillCM
Copy link

BillCM commented Nov 28, 2022

Interesting note: the 3.3.0 DatabricksSQLOperator had a parameter for control writing to xcom. It appears to be removed in 4.0.0. This is causing the dump of a lot of PII into XCOM for my use case.

@alexott
Copy link
Contributor

alexott commented Nov 28, 2022

@BillCM do_xcom_push should be still supported as it's argument to the BaseOperator - can you check it?

@potiuk potiuk linked an issue Nov 28, 2022 that may be closed by this pull request
2 tasks
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 29, 2022
The change apache#27912 fixed and unified behaviour of DBApiHooks
across the board, but it missed two places where sql was mis-used
and overridden in exasol and snowflake hooks.

The check for "sql" type did not use the original sql parameter
value but the one that was overridden later in the run method
implementation.

The fix is the same as applied in Databricks Hook and DBAPI
generic run methods - using consistent typing and separate
variable to convert the sql string into sql list.

Related: astronomer/astro-sdk#1324
potiuk added a commit that referenced this pull request Nov 30, 2022
…27997)

The change #27912 fixed and unified behaviour of DBApiHooks
across the board, but it missed two places where sql was mis-used
and overridden in exasol and snowflake hooks.

The check for "sql" type did not use the original sql parameter
value but the one that was overridden later in the run method
implementation.

The fix is the same as applied in Databricks Hook and DBAPI
generic run methods - using consistent typing and separate
variable to convert the sql string into sql list.

Related: astronomer/astro-sdk#1324
jrggggg pushed a commit to jrggggg/airflow that referenced this pull request Dec 1, 2022
…pache#27997)

The change apache#27912 fixed and unified behaviour of DBApiHooks
across the board, but it missed two places where sql was mis-used
and overridden in exasol and snowflake hooks.

The check for "sql" type did not use the original sql parameter
value but the one that was overridden later in the run method
implementation.

The fix is the same as applied in Databricks Hook and DBAPI
generic run methods - using consistent typing and separate
variable to convert the sql string into sql list.

Related: astronomer/astro-sdk#1324
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DatabricksSQLOperator ValueError “too many values to unpack”

8 participants