Description
Apache Airflow version
main (development)
What happened
The common.sql `fetch_all_handler` modified by #19313 uses `cursor.returns_rows`, which is not dbapi2 compliant but sqlalchemy specific.
The `DbApiHook.run` method does not use sqlalchemy, since it uses `get_conn`, and most (if not all) driver cursors do not have the `returns_rows` attribute (e.g. pymssql and jaydebeapi do not have it: see below).
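A quick sketch of the problem (sqlite3 is my stand-in driver here, not one mentioned in the issue): even the stdlib's dbapi2-compliant driver exposes no `returns_rows` on its cursor, since that attribute belongs to sqlalchemy's `CursorResult`, not to PEP 249:

```python
# Sketch (sqlite3 as an example driver, my assumption): a dbapi2
# compliant cursor has `description` (mandated by PEP 249) but no
# `returns_rows`, which is a sqlalchemy-specific attribute.
import sqlite3

cur = sqlite3.connect(":memory:").cursor()
print(hasattr(cur, "returns_rows"))  # False -- sqlalchemy-only
print(hasattr(cur, "description"))   # True -- required by dbapi2
```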
Why did the tests in https://github.com/apache/airflow/actions/runs/2712108854 pass:
- the jdbc operator's test asserts that the run method is called with `fetch_all_handler`, which is not itself tested.
- the databricks hook's test cursor mock has the `returns_rows` attribute.
- in all tests, the mock used makes `hasattr(cursor, "returns_rows")` resolve to `True`, which is why the tests passed.
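The mock behavior above can be demonstrated in isolation (a sketch, not code from the issue; sqlite3 stands in for a real driver cursor):

```python
# Sketch (not from the issue): a bare MagicMock fabricates any attribute
# on access, so hasattr() is always True and the tests never exercise
# the missing `returns_rows` path that real cursors hit in production.
import sqlite3
from unittest import mock

loose = mock.MagicMock()
print(hasattr(loose, "returns_rows"))   # True -- attribute invented on access

# Speccing the mock against a real dbapi2 cursor class would have
# surfaced the bug in the test suite:
strict = mock.MagicMock(spec=sqlite3.Cursor)
print(hasattr(strict, "returns_rows"))  # False -- matches real cursors
```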
jaydebeapi's cursor attributes:
>>> dir(cur)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_close_last', '_connection', '_converters', '_description', '_meta', '_prep', '_rs', '_set_stmt_parms', 'arraysize', 'close', 'description', 'execute', 'executemany', 'fetchall', 'fetchmany', 'fetchone', 'rowcount', 'setinputsizes', 'setoutputsize']
pymssql's cursor attributes:
>>> dir(cur)
['__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__pyx_vtable__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '_source', 'callproc', 'close', 'connection', 'description', 'execute', 'executemany', 'fetchall', 'fetchmany', 'fetchone', 'lastrowid', 'nextset', 'returnvalue', 'rowcount', 'rownumber', 'setinputsizes', 'setoutputsize']
What you think should happen instead
For the `fetch_all_handler` common handler we should use `cursor.description is not None` instead.
#25412 also suggested a rollback, so we should take this into account.
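The portability of that check can be sketched as follows (sqlite3 is my example driver, not part of the issue): PEP 249 requires `cursor.description` to be `None` for operations that return no result set, which makes it a driver-agnostic test for rows:

```python
# Sketch (sqlite3 as a stand-in driver, my assumption): per dbapi2,
# cursor.description is None when the last operation produced no result
# set, and non-None (column metadata) when rows can be fetched.
import sqlite3

cur = sqlite3.connect(":memory:").cursor()

cur.execute("CREATE TABLE t (x INTEGER)")
print(cur.description)              # None -- DDL yields no result set

cur.execute("SELECT 1 AS one")
print(cur.description is not None)  # True -- rows are available
```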
How to reproduce
Import a sql hook (any sql hook inheriting DbApiHook.run should do):
>>> from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
>>> hk = MsSqlHook("localhost-mssql-docker-test")
>>> hk.run("SELECT SUSER_SNAME();", handler=lambda c: c.fetchall())
[('sa',)]
Import the new `fetch_all_handler` and get an error:
>>> from airflow.providers.common.sql.hooks.sql import fetch_all_handler
>>> hk.run("SELECT SUSER_SNAME();", handler=fetch_all_handler)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/gebo/.virtualenvs/airflow2.1/lib/python3.6/site-packages/airflow/hooks/dbapi.py", line 207, in run
result = handler(cur)
File "<stdin>", line 2, in fetch_all_handler
AttributeError: 'pymssql._pymssql.Cursor' object has no attribute 'returns_rows'
Operating System
Ubuntu 20.04
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
The new version would look like:
from typing import List, Optional, Tuple

def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
    """Handler for DbApiHook.run() to return results"""
    # dbapi2 compliant way to check for cursor results:
    if cursor.description is not None:
        return cursor.fetchall()
    else:
        return None
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct