Closed
Labels
area:Scheduler (including HA (high availability) scheduler)
kind:bug (This is a clearly a bug)
telemetry (Telemetry-related issues)
Description
What
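The integration test `test_scheduler_change_after_the_first_task_finishes` in `airflow-core/tests/integration/otel/test_otel.py` fails intermittently.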
For example: https://github.com/apache/airflow/actions/runs/21358582533/job/61472786065?pr=60989
Traceback Logs
summary:
FAILED airflow-core/tests/integration/otel/test_otel.py::TestOtelIntegration::test_scheduler_change_after_the_first_task_finishes - AssertionError: Span name 'task2' wasn't found in children span names. It's not a child of span 'otel_test_dag_with_pause_between_tasks'.
===================== 1 failed, 7 passed, 10 skipped, 1 warning in 417.55s (0:06:57) =====================
full traceback:
Traceback (most recent call last):
File "/usr/python/bin/airflow", line 10, in <module>
sys.exit(main())
File "/opt/airflow/airflow-core/src/airflow/__main__.py", line 55, in main
args.func(args)
File "/opt/airflow/airflow-core/src/airflow/cli/cli_config.py", line 49, in command
return func(*args, **kwargs)
File "/opt/airflow/airflow-core/src/airflow/utils/cli.py", line 113, in wrapper
return f(*args, **kwargs)
File "/opt/airflow/airflow-core/src/airflow/utils/providers_configuration_loader.py", line 54, in wrapped_function
return func(*args, **kwargs)
File "/opt/airflow/airflow-core/src/airflow/cli/commands/api_server_command.py", line 120, in wrapper
return func(args)
File "/opt/airflow/airflow-core/src/airflow/cli/commands/api_server_command.py", line 164, in api_server
run_command_with_daemon_option(
File "/opt/airflow/airflow-core/src/airflow/cli/commands/daemon_utils.py", line 58, in run_command_with_daemon_option
check_if_pidfile_process_is_running(pid_file=pid, process_name=process_name)
File "/opt/airflow/airflow-core/src/airflow/utils/process_utils.py", line 370, in check_if_pidfile_process_is_running
raise AirflowException(f"The {process_name} is already running under PID {pid}.")
airflow.sdk.exceptions.AirflowException: The api_server is already running under PID 124.
2026-01-26T13:32:15.730154Z [error ] Exception while exporting metrics HTTPConnectionPool(host='breeze-otel-collector', port=4318): Max retries exceeded with url: /v1/metrics (Caused by NameResolutionError("HTTPConnection(host='breeze-otel-collector', port=4318): Failed to resolve 'breeze-otel-collector' ([Errno -3] Temporary failure in name resolution)")) [opentelemetry.sdk.metrics._internal.export] loc=__init__.py:545
Traceback (most recent call last):
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 204, in _new_conn
sock = connection.create_connection(
File "/usr/python/lib/python3.10/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/python/lib/python3.10/socket.py", line 967, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/python/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
response = self._make_request(
File "/usr/python/lib/python3.10/site-packages/urllib3/connectionpool.py", line 493, in _make_request
conn.request(
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 500, in request
self.endheaders()
File "/usr/python/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/python/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/python/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 331, in connect
self.sock = self._new_conn()
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 211, in _new_conn
raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: HTTPConnection(host='breeze-otel-collector', port=4318): Failed to resolve 'breeze-otel-collector' ([Errno -3] Temporary failure in name resolution)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/python/lib/python3.10/site-packages/requests/adapters.py", line 644, in send
resp = conn.urlopen(
File "/usr/python/lib/python3.10/site-packages/urllib3/connectionpool.py", line 841, in urlopen
retries = retries.increment(
File "/usr/python/lib/python3.10/site-packages/urllib3/util/retry.py", line 535, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='breeze-otel-collector', port=4318): Max retries exceeded with url: /v1/metrics (Caused by NameResolutionError("HTTPConnection(host='breeze-otel-collector', port=4318): Failed to resolve 'breeze-otel-collector' ([Errno -3] Temporary failure in name resolution)"))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/python/lib/python3.10/site-packages/opentelemetry/sdk/metrics/_internal/export/__init__.py", line 541, in _receive_metrics
self._exporter.export(
File "/usr/python/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/http/metric_exporter/__init__.py", line 203, in export
resp = self._export(serialized_data.SerializeToString())
File "/usr/python/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/http/metric_exporter/__init__.py", line 173, in _export
return self._session.post(
File "/usr/python/lib/python3.10/site-packages/requests/sessions.py", line 637, in post
return self.request("POST", url, data=data, json=json, **kwargs)
File "/usr/python/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/usr/python/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/usr/python/lib/python3.10/site-packages/requests/adapters.py", line 677, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='breeze-otel-collector', port=4318): Max retries exceeded with url: /v1/metrics (Caused by NameResolutionError("HTTPConnection(host='breeze-otel-collector', port=4318): Failed to resolve 'breeze-otel-collector' ([Errno -3] Temporary failure in name resolution)"))
/usr/python/lib/python3.10/site-packages/celery/platforms.py:841 SecurityWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!
Please specify a different user using the --uid option.
User information: uid=0 euid=0 gid=0 egid=0
2026-01-26T13:32:17.299274Z [info ] Connected to redis://redis:6379/0 [celery.worker.consumer.connection] loc=connection.py:22
2026-01-26T13:32:17.305338Z [info ] mingle: searching for neighbors [celery.worker.consumer.mingle] loc=mingle.py:40
2026-01-26T13:32:18.316065Z [info ] mingle: all alone [celery.worker.consumer.mingle] loc=mingle.py:49
2026-01-26T13:32:18.330157Z [info ] celery@753b8cbd20cb ready. [celery.apps.worker] loc=worker.py:176
2026-01-26T13:32:19.434136Z [error ] Exception while exporting metrics HTTPConnectionPool(host='breeze-otel-collector', port=4318): Max retries exceeded with url: /v1/metrics (Caused by NameResolutionError("HTTPConnection(host='breeze-otel-collector', port=4318): Failed to resolve 'breeze-otel-collector' ([Errno -3] Temporary failure in name resolution)")) [opentelemetry.sdk.metrics._internal.export] loc=__init__.py:545
Traceback (most recent call last):
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 204, in _new_conn
sock = connection.create_connection(
File "/usr/python/lib/python3.10/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/python/lib/python3.10/socket.py", line 967, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/python/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
response = self._make_request(
File "/usr/python/lib/python3.10/site-packages/urllib3/connectionpool.py", line 493, in _make_request
conn.request(
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 500, in request
self.endheaders()
File "/usr/python/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/python/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/python/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 331, in connect
self.sock = self._new_conn()
File "/usr/python/lib/python3.10/site-packages/urllib3/connection.py", line 211, in _new_conn
raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: HTTPConnection(host='breeze-otel-collector', port=4318): Failed to resolve 'breeze-otel-collector' ([Errno -3] Temporary failure in name resolution)
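The logs above repeatedly show `socket.gaierror: [Errno -3] Temporary failure in name resolution` for `breeze-otel-collector` on port 4318. Whether that is related to the missing `task2` span is not established in this report. As a diagnostic aid only, here is a minimal sketch of polling until a hostname resolves. The helper name `wait_for_host` and its `timeout`, `interval`, and `resolver` parameters are hypothetical and are not part of Airflow or the test suite.

```python
import socket
import time


def wait_for_host(host, port, timeout=30.0, interval=1.0, resolver=socket.getaddrinfo):
    """Return True once `host` resolves, or False if `timeout` seconds pass first.

    Hypothetical helper for debugging; `resolver` is injectable so the retry
    logic can be exercised without real DNS.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Same lookup urllib3 performs before opening the connection.
            resolver(host, port, type=socket.SOCK_STREAM)
            return True
        except socket.gaierror:
            # Resolution failed; give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_host("breeze-otel-collector", 4318)` from inside the test container would indicate whether the collector hostname becomes resolvable within 30 seconds.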