Closed
Labels
affected_version:2.10, area:UI, area:core, kind:bug
Description
Apache Airflow version
2.10.0rc1
If "Other Airflow 2 version" selected, which one?
No response
What happened?
In a degenerate case, get_tree_view can use a lot of memory.
For a DAG like this:
with DAG("aaa_big_get_tree_view", schedule=None) as dag:
    first_set = [LongEmptyOperator(task_id=f"hello_{i}_{'a' * 230}") for i in range(900)]
    chain(*first_set)
    last_task_in_first_set = first_set[-1]
    chain(
        last_task_in_first_set, [LongEmptyOperator(task_id=f"world_{i}_{'a' * 230}") for i in range(900)]
    )
    chain(
        last_task_in_first_set, [LongEmptyOperator(task_id=f"this_{i}_{'a' * 230}") for i in range(900)]
    )
    chain(last_task_in_first_set, [LongEmptyOperator(task_id=f"is_{i}_{'a' * 230}") for i in range(900)])
    chain(
        last_task_in_first_set, [LongEmptyOperator(task_id=f"silly_{i}_{'a' * 230}") for i in range(900)]
    )
    chain(
        last_task_in_first_set, [LongEmptyOperator(task_id=f"stuff_{i}_{'a' * 230}") for i in range(900)]
    )
serializing it can take 2.7 GiB of memory in get_tree_view alone:
root@a24bae3584cb:/opt/airflow# pytest --memray tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag
=========================================================================================================================================================================== test session starts ============================================================================================================================================================================
platform linux -- Python 3.12.5, pytest-8.3.2, pluggy-1.5.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow
configfile: pyproject.toml
plugins: memray-1.7.0, timeouts-1.2.1, icdiff-0.9, mock-3.14.0, rerunfailures-14.0, requests-mock-1.12.1, xdist-3.6.1, asyncio-0.23.8, anyio-4.4.0, instafail-0.5.0, cov-5.0.0, time-machine-2.15.0, custom-exit-code-0.3.0
asyncio: mode=Mode.STRICT
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 1 item
tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag PASSED [100%]
============================================================================================================================================================================== MEMRAY REPORT ===============================================================================================================================================================================
Allocation results for tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag at the high watermark
📦 Total memory allocated: 5.4GiB
📏 Total allocations: 23
📊 Histogram of allocation sizes: |▁▁█ |
🥇 Biggest allocating functions:
- _safe_get_dag_tree_view:/opt/airflow/airflow/providers/openlineage/utils/utils.py:446 -> 2.7GiB
- get_tree_view:/opt/airflow/airflow/models/dag.py:2445 -> 2.7GiB
- __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
- __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
- __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
=================================================================================================================================================================== Warning summary. Total: 3, Unique: 3 ===================================================================================================================================================================
airflow: total 1, unique 1
collect: total 1, unique 1
other: total 2, unique 2
collect: total 2, unique 2
Warnings saved into /opt/airflow/tests/warnings.txt file.
============================================================================================================================================================================ 1 passed in 8.60s =============================================================================================================================================================================
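To give a feel for where the size comes from, here is a back-of-envelope sketch of how many characters an indented tree view would need for the DAG shape above. It does not call Airflow. The indentation width (`INDENT = 4`) and the idea that each line is just the task id plus a newline are assumptions made for illustration, and the sketch only estimates the final string, not the 2.7 GiB peak reported by memray.

```python
# Rough size of an indented tree view for the DAG shape in this issue.
# Assumptions (illustrative, not taken from the Airflow source): every
# nesting level adds INDENT spaces, and a task renders as task_id + "\n".

INDENT = 4        # assumed spaces per nesting level
CHAIN_LEN = 900   # tasks in the initial linear chain
SUFFIX = "a" * 230


def line_length(task_id: str, depth: int) -> tuple[int, int]:
    """Return (whitespace_chars, label_chars) for one rendered line."""
    return INDENT * depth, len(task_id) + 1  # +1 for the newline


whitespace = 0
labels = 0

# The chain: task i sits at depth i.
for i in range(CHAIN_LEN):
    ws, lb = line_length(f"hello_{i}_{SUFFIX}", i)
    whitespace += ws
    labels += lb

# The fan-out: every task sits one level below the last chain task.
for prefix in ("world", "this", "is", "silly", "stuff"):
    for i in range(900):
        ws, lb = line_length(f"{prefix}_{i}_{SUFFIX}", CHAIN_LEN)
        whitespace += ws
        labels += lb

total = whitespace + labels
print(f"whitespace: {whitespace:,} chars")
print(f"labels:     {labels:,} chars")
print(f"whitespace share: {whitespace / total:.1%}")
```

Under these assumptions, most of the output is indentation: the 4,500 fan-out tasks each carry 3,600 leading spaces before a task id of roughly 240 characters.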
What you think should happen instead?
I think the tree_view format should be changed to one that does not require an extraordinary amount of whitespace in deeply nested cases.
It would be good to know where it is currently used, though.
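As one possible direction, here is a minimal sketch of a flat adjacency format whose size grows with the number of tasks and edges rather than with nesting depth. The `Task` class and the `adjacency_view` helper are hypothetical stand-ins written for this example; they are not Airflow APIs and not necessarily the format that should be adopted.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for an operator: an id plus downstream tasks."""

    task_id: str
    downstream: list["Task"] = field(default_factory=list)


def adjacency_view(roots: list[Task]) -> dict[str, list[str]]:
    """Map each reachable task_id to the sorted ids of its direct downstream tasks."""
    view: dict[str, list[str]] = {}
    stack = list(roots)
    while stack:  # iterative walk, so deep chains do not hit the recursion limit
        task = stack.pop()
        if task.task_id in view:
            continue  # already visited (a task can have several upstreams)
        view[task.task_id] = sorted(t.task_id for t in task.downstream)
        stack.extend(task.downstream)
    return view


# Small version of the DAG shape from this issue: a chain, then a fan-out.
chain_tasks = [Task(f"hello_{i}") for i in range(3)]
for upstream, downstream in zip(chain_tasks, chain_tasks[1:]):
    upstream.downstream.append(downstream)
chain_tasks[-1].downstream.extend(Task(f"world_{i}") for i in range(2))

view = adjacency_view([chain_tasks[0]])
for task_id, downstream_ids in view.items():
    print(task_id, "->", downstream_ids)
```

Each task id appears once as a key and once per incoming edge, so no line needs padding proportional to its depth. The trade-off is that the output is less readable as a tree at a glance.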
How to reproduce
Use the DAG above.
Operating System
Docker/Breeze on macOS
Versions of Apache Airflow Providers
No response
Deployment
Other
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct