Skip to content

Comments

Use DbApiHook.run for DbApiHook.get_records and DbApiHook.get_first#26944

Merged
potiuk merged 1 commit intoapache:mainfrom
kazanzhy:use_DbApiHook.run_for_execute
Oct 31, 2022
Merged

Use DbApiHook.run for DbApiHook.get_records and DbApiHook.get_first#26944
potiuk merged 1 commit intoapache:mainfrom
kazanzhy:use_DbApiHook.run_for_execute

Conversation

@kazanzhy
Copy link
Contributor

@kazanzhy kazanzhy commented Oct 7, 2022

Currently DbApiHook.get_records and DbApiHook.get_first invokes cursor.execute method.
This PR is a step to possibly use a single method DbApiHook.run for query execution.

closes: #9957

@kazanzhy kazanzhy force-pushed the use_DbApiHook.run_for_execute branch 3 times, most recently from f020eda to a1a3296 Compare October 10, 2022 15:52
@kazanzhy kazanzhy marked this pull request as ready for review October 10, 2022 21:48
@kazanzhy kazanzhy requested a review from eladkal as a code owner October 10, 2022 21:48
@kazanzhy kazanzhy force-pushed the use_DbApiHook.run_for_execute branch 2 times, most recently from 196fbe7 to b07bff8 Compare October 22, 2022 21:03
@kazanzhy kazanzhy force-pushed the use_DbApiHook.run_for_execute branch 2 times, most recently from 22e1744 to 81cb45f Compare October 25, 2022 22:25
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Any | list[Any] == Any ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should also remove the # type: ignore[override] in subcleases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed back from sql: str = "" to sql: str | list[str] = "".
It seems strange but without it, I can't remove # type: ignore[override]

@kazanzhy kazanzhy force-pushed the use_DbApiHook.run_for_execute branch from 81cb45f to bb0b5bf Compare October 27, 2022 22:34
@kazanzhy kazanzhy force-pushed the use_DbApiHook.run_for_execute branch from bb0b5bf to 4acb783 Compare October 29, 2022 18:45
@shlomiken
Copy link

shlomiken commented Dec 6, 2023

@potiuk - looks like this PR breaks get_first and get_records return type for snowflake for example , so instead of tuple we get a Dict.
this also breaks SqlSensor as it expects this first_cell = records[0][0]

this code - was getting the default cursor from the connection (SnowFlakeCursor) - but now is in the hands of the specific hook , which in snowflake case return a DictCursor

 with closing(conn.cursor()) as cur:
                results = []
                for sql_statement in sql:
                    self._run_command(cur, sql_statement, parameters)

                    if handler is not None:
                        result = handler(cur)
                        results.append(result)

@potiuk
Copy link
Member

potiuk commented Dec 6, 2023

Use latest version of both Airflow and Snowflake provider. There were several fixes implemented after that one.

Just make sure to upgrade everything to latest versions as they contain latest fixes.

@potiuk
Copy link
Member

potiuk commented Dec 6, 2023

And make sure to fix any deprecation warnings you see and look at the release notes - of Snowflake provider. There were a few breaking changes that you might want to fix (some of the things could be one of them).

@shlomiken
Copy link

@potiuk - thanks , i kinda figure that out after trying to play with different versions of common-sql and provider.
what surprised me is that we used to get a tuple, then after upgrading some of the providers we got dict.
and now latest version basically support both if a hook parameter is passed for returnning dictionary - meaning default is back to tuple. - what a mess:-)

@potiuk
Copy link
Member

potiuk commented Dec 6, 2023

back to tuple. - what a mess:-)

No. It is actually common right now. We use to have mess in the PAST . Now it's all straightforward. And the "other" options (like Dictionary) are only for those who want to have backwards compatibility. Mess is mainly because user want to stick to old ways even if we make things better organized and prefer to stay with the messy way it was before.

@potiuk
Copy link
Member

potiuk commented Dec 6, 2023

So if you want to avoid the mess - convert your code to use tuples and you will be good and everything will be straightend u for you (as is for users who are starting now and everything is the same for all DBAPi Hooks)

Also if you want to know why - you can read more of the architectural decisions made here: #36015

They explain how we straightened up the mess.

@shlomiken
Copy link

@potiuk - i think i came out wrong - i really appreciate what have being done (sticking with standard DBAPI)
the thing is meant is - we used to have tuple, then upgraded and got dict , now we understand tuple is the desired.

anyway - i'm trying to use the latest and greatest , in addition i'm using astronomer async operators - but now
i get this weird error which i cant understand how its possible .

super().__init__(conn_id=snowflake_conn_id, **kwargs)
E       TypeError: airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator.__init__() got multiple values for keyword argument 'conn_id'

@potiuk
Copy link
Member

potiuk commented Dec 6, 2023

Not suure - might be problem with the async operators

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dbapi_hook get_records doesn't accept list.

4 participants