Skip to content

Invalid log order with ElasticsearchTaskHandler #17512

@suhanovv

Description

@suhanovv

Apache Airflow version: 2.*

Apache Airflow Provider versions (please include all providers that are relevant to your bug): providers-elasticsearch==1./2.

Kubernetes version (if you are using kubernetes) (use kubectl version): 1.17

What happened:
When using the ElasticsearchTaskHandler I see the broken order of the logs in the interface.
IMAGE 2021-08-09 18:05:13

raw log entries:
{"asctime": "2021-07-16 09:37:22,278", "filename": "pod_launcher_deprecated.py", "lineno": 131, "levelname": "WARNING", "message": "Pod not yet started: a3ca15e631a7f67697e10b23bae82644.6314229ca0c84d7ba29c870edc39a268", "exc_text": null, "dag_id": "dag_name_AS10_A_A_A_A", "task_id": "k8spod_make_upload_dir_1_2", "execution_date": "2021_07_16T09_10_00_000000", "try_number": "1", "log_id": "dag_name_AS10_A_A_A_A-k8spod_make_upload_dir_1_2-2021_07_16T09_10_00_000000-1", "offset": 1626427744958776832}

{"asctime": "2021-07-16 09:37:26,485", "filename": "taskinstance.py", "lineno": 1191, "levelname": "INFO", "message": "Marking task as SUCCESS. dag_id=pod_gprs_raw_from_nfs_AS10_A_A_A_A, task_id=k8spod_make_upload_dir_1_2, execution_date=20210716T091000, start_date=20210716T092905, end_date=20210716T093726", "exc_text": null, "dag_id": "dag_name_AS10_A_A_A_A", "task_id": "k8spod_make_upload_dir_1_2", "execution_date": "2021_07_16T09_10_00_000000", "try_number": "1", "log_id": "dag_name_AS10_A_A_A_A-k8spod_make_upload_dir_1_2-2021_07_16T09_10_00_000000-1", "offset": 1626427744958776832}

What you expected to happen:

The problem lies in the method set_context that is set on the instance of the class and then all entries in the log go with the same offset, which is used to select for display in the interface. When we redefined method emit and put the offset to the record, the problem disappeared

How to reproduce it:
Run a long-lived task that generates logs, in our case it is a spark task launched from a docker container

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind:bugThis is a clearly a bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions