Skip to content

common sql fetch_all_handler bug #25429

@FanatoniQ

Description

@FanatoniQ

Apache Airflow version

main (development)

What happened

The common sql fetch_all_handler modified by #19313 uses cursor.returns_rows which is not dbapi2 compliant but sqlalchemy specific.

The DBApiHook.run method does not use sqlalchemy since it uses get_conn and most (if not all) drivers cursors do not have the returns_rows attribute (ex: pymssql and jaydebeapi do not have it: see below).

Why did the tests in https://github.com/apache/airflow/actions/runs/2712108854 passed:

  • the jdbc's operator test assert that the run method is called with fetch_all_handler which is not tested.
  • the databricks's hook test cursor mock has the returns_rows attribute.
  • in all tests, the mock used will make hasattr(cursor, "returns_rows") resolves to True which is why tests passed

jaydebeapi's cursor attributes:

>>> dir(cur)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_close_last', '_connection', '_converters', '_description', '_meta', '_prep', '_rs', '_set_stmt_parms', 'arraysize', 'close', 'description', 'execute', 'executemany', 'fetchall', 'fetchmany', 'fetchone', 'rowcount', 'setinputsizes', 'setoutputsize']

pymssql's cursor attributes:

>>> dir(cur)
['__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__pyx_vtable__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '_source', 'callproc', 'close', 'connection', 'description', 'execute', 'executemany', 'fetchall', 'fetchmany', 'fetchone', 'lastrowid', 'nextset', 'returnvalue', 'rowcount', 'rownumber', 'setinputsizes', 'setoutputsize']

What you think should happen instead

For the fetch_all_handler common handler we should be using cursor.description is not None instead.
#25412 also suggested a rollback so we should take this into account.

Ref: https://peps.python.org/pep-0249/#description

How to reproduce

Import a sql hook (any sql hook inheriting DbApiHook.run should do):

>>> from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
>>> hk = MsSqlHook("localhost-mssql-docker-test")
>>> hk.run("SELECT SUSER_SNAME();", handler=lambda c: c.fetchall())
[('sa',)]

Import the new fetch_all_handler and get an error

>>> from airflow.providers.common.sql.hooks.sql import fetch_all_handler
>>> hk.run("SELECT SUSER_SNAME();", handler=fetch_all_handler)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gebo/.virtualenvs/airflow2.1/lib/python3.6/site-packages/airflow/hooks/dbapi.py", line 207, in run
    result = handler(cur)
  File "<stdin>", line 2, in fetch_all_handler
AttributeError: 'pymssql._pymssql.Cursor' object has no attribute 'returns_rows'

Operating System

Ubuntu 20.04

Versions of Apache Airflow Providers

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

The new version would look like:

def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
    """Handler for DbApiHook.run() to return results"""
    # dbapi2 compliant way to check for cursor results:
    if cursor.description is not None:
        return cursor.fetchall()
    else:
        return None

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions