Allow execution of multiple sql statements in SnowflakeHook #11350
Conversation
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.
65396aa to 23f85b4
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.
if isinstance(sql, str):
    with closing(self.get_conn()) as conn:
        if self.supports_autocommit:
            self.set_autocommit(conn, autocommit)

        conn.execute_string(sql, parameters)
else:
    super().run(sql, autocommit, parameters)
Can you add a test for this @JavierLopezT?
Hi @kaxil. I think I have to test that, depending on whether sql is a list or a string, it calls conn.execute_string or super().run, but I have no idea how to do that. I have started defining two functions and two different SQL statements, but I can't get any further. Could you help me, please? Thank you very much in advance.
Hey @JavierLopezT -- Have a look at https://github.com/apache/airflow/blob/master/tests/providers/exasol/hooks/test_exasol.py for some examples.
Let me know if you need help though
@JavierLopezT what you want to do is use mock.patch to verify that, in the run method, the right underlying method is called given the right input. Google for basic examples of mock.patch.
In this case you would want to mock the get_conn method so that its return_value is a mock object. Then, after run finishes, you can verify that execute_string was called on the mock object with your value for sql.
There are lots of examples in the repo; just try to find something similar.
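A minimal sketch of the kind of test described above. Since this is a sketch, it uses a hypothetical stand-in class that mirrors the dispatch logic from the diff rather than the real SnowflakeHook, and `_dbapi_run` stands in for the `super().run` call; the mock.patch technique is the same either way.

```python
from contextlib import closing
from unittest import mock


class FakeSnowflakeHook:
    """Stand-in mirroring the dispatch logic in the PR diff (not the real hook)."""

    supports_autocommit = True

    def get_conn(self):
        raise NotImplementedError  # patched out in the tests

    def set_autocommit(self, conn, autocommit):
        conn.autocommit(autocommit)

    def _dbapi_run(self, sql, autocommit, parameters):
        """Stands in for DbApiHook.run (the super().run branch)."""

    def run(self, sql, autocommit=False, parameters=None):
        if isinstance(sql, str):
            with closing(self.get_conn()) as conn:
                if self.supports_autocommit:
                    self.set_autocommit(conn, autocommit)
                conn.execute_string(sql, parameters)
        else:
            self._dbapi_run(sql, autocommit, parameters)


def test_run_string_uses_execute_string():
    hook = FakeSnowflakeHook()
    conn = mock.MagicMock()
    # Patch get_conn so it hands back our mock connection.
    with mock.patch.object(hook, "get_conn", return_value=conn):
        hook.run("SELECT 1; SELECT 2;", autocommit=True)
    conn.execute_string.assert_called_once_with("SELECT 1; SELECT 2;", None)


def test_run_list_falls_back_to_dbapi_run():
    hook = FakeSnowflakeHook()
    # Patch the fallback path and check it receives the list unchanged.
    with mock.patch.object(hook, "_dbapi_run") as dbapi_run:
        hook.run(["SELECT 1"], autocommit=False)
    dbapi_run.assert_called_once_with(["SELECT 1"], False, None)
```

For the real hook, the same pattern applies with `mock.patch.object(SnowflakeHook, "get_conn", ...)` and patching the parent class's run.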
dstandish left a comment
this is a worthwhile change @JavierLopezT 👍
@potiuk Hello! Is it worth keeping on with this MR, or shall we wait for a new version of the Snowflake Python connector?
Certainly. I don't own the connector :). Not sure when/if the new version comes out. And when it does, I think it will be backwards compatible, so there is no reason why this should block anyone.
queries = [item[0] for item in split_statements(StringIO(sql))]
for query in queries:
    super().run(query, autocommit, parameters)
The problem with this @JavierLopezT is that it will reconnect for every statement.
This can be a nightmare when you use Okta :) (though it's possible to cache the creds).
But more importantly, the way you've implemented it, it's not possible to use temp tables (which is pretty important); after the reconnect the table will be gone.
Instead you should connect only once and use the same connection for the whole series of statements, e.g.

with closing(hook.get_conn()) as cnx:
    cur = cnx.cursor()
    for query in queries:
        cur.execute(query)
        for row in cur.fetchall():
            print(row)

(You want to print fetchall because this is how you get statement results, and the operator is not meant to be used for a bare "select" statement, so no worry about writing a billion rows to the log :) )
Thanks for the example. However, I don't get why we would want to print fetchall.
After you execute a statement, Snowflake returns information.
For example it might say rows affected, or, after a COPY statement, the result of the load for all files.
The log should print it.
I also recommend using DictCursor if you're gonna print it out (it's a snowflake class).
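To illustrate the DictCursor suggestion, here is a hedged sketch of a result-logging helper. The function and its fakes are hypothetical (not from the PR); in real use you would pass `snowflake.connector.DictCursor` as `cursor_class`, so each fetched row comes back as a column-name-to-value mapping that reads well in a task log.

```python
def log_statement_results(cnx, queries, cursor_class=None, log=print):
    """Run each query on ONE connection and log whatever Snowflake returns
    (rows affected, COPY load results per file, etc.).

    With a real connection, pass snowflake.connector.DictCursor as
    cursor_class so rows are dicts keyed by column name.
    """
    cur = cnx.cursor(cursor_class) if cursor_class else cnx.cursor()
    try:
        for query in queries:
            cur.execute(query)
            for row in cur.fetchall():
                log(row)  # with DictCursor, each row is a dict
    finally:
        cur.close()
```

Because the connection is created once by the caller and reused, temp tables created by one statement remain visible to the next.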
The PR is likely OK to be merged with just a subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.
def run(self, sql, autocommit=False, parameters=None):
    """
    Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
    call. So for allowing to pass files or strings with several queries this method is coded,
@JavierLopezT does this actually support running a file? It looks like it must be a string here.
I think we definitely should make it so sql can be a path or sql (i.e. Union[Path, str], and if str, check whether it's a path to a file that exists), though it doesn't have to be this PR. I just want to suggest you make sure the docstring is consistent with the behavior.
Sorry, missed this before.
And a small nit: "relies on run from DBApiHook" is no longer true.
Co-authored-by: Kaxil Naik <[email protected]>
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
from contextlib import closing
Static check fails with:
airflow/providers/snowflake/hooks/snowflake.py:18:1: F401 'contextlib.closing' imported but unused
Closing it to open a new one with more features, as suggested here.
Any reason why this one needs to be abandoned? I can imagine circumstances where you just want to start over and go in a different direction. I'm not sure what the conventions are, and maybe it doesn't matter. But I think in general it's good to have the history for a PR contained in one place. Anyway, just a thought.
The Snowflake connector doesn't natively allow the execution of multiple SQL statements in the same call. So, to allow passing files or strings with several queries, a new method is included in the hook.
Does this require tests?
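Putting the thread's suggestions together, a hedged sketch of what such a run method could look like: split the string once, reuse a single connection for every statement (so temp tables survive), and log what each statement returns. The `naive_split_statements` helper here is a simplified, hypothetical stand-in for `snowflake.connector.util_text.split_statements` used in the PR, and `run_all` is a sketch, not the merged implementation.

```python
from contextlib import closing
from io import StringIO


def naive_split_statements(buf):
    """Hypothetical stand-in for snowflake.connector.util_text.split_statements.

    Yields (statement, is_put_or_get) tuples like the real helper, but just
    splits on semicolons; it does NOT handle comments or string literals.
    """
    for stmt in buf.read().split(";"):
        stmt = stmt.strip()
        if stmt:
            yield (stmt + ";", False)


def run_all(get_conn, sql, log=print):
    """Execute every statement in ``sql`` on ONE connection.

    Reusing a single connection means temp tables created by an early
    statement are still visible to later ones, and each statement's
    results (rows affected, COPY load info, ...) get logged.
    """
    queries = [item[0] for item in naive_split_statements(StringIO(sql))]
    with closing(get_conn()) as cnx:
        cur = cnx.cursor()
        try:
            for query in queries:
                cur.execute(query)
                for row in cur.fetchall():
                    log(row)
        finally:
            cur.close()
```

The reviewer-requested tests then reduce to mocking `get_conn` and asserting that every split statement was executed on the same cursor.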