Description
Apache Airflow version
2.1.1
Operating System
Ubuntu
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
I tried to run a simple Spark application using the SparkKubernetesOperator and SparkKubernetesSensor.
In the YAML file for the Spark Operator, I added a sidecar container to the driver pod.
When the job runs in Airflow, the SparkKubernetesSensor step fails with the following error:
[2021-09-23 13:24:21,547] {spark_kubernetes.py:92} WARNING - Could not read logs for pod pyspark-pi-driver. It may have been disposed.
Make sure timeToLiveSeconds is set on your SparkApplication spec.
underlying exception: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Thu, 23 Sep 2021 13:24:21 GMT', 'Content-Length': '233'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"a container name must be specified for pod pyspark-pi-driver, choose one of: [spark-kubernetes-driver logging-sidecar]","reason":"BadRequest","code":400}\n'
My YAML file does not set timeToLiveSeconds, so the driver pod still exists when the job finishes and fetching its logs should not be a problem.
I believe the error occurs because the call to get_pod_logs from within SparkKubernetesSensor._log_driver passes only the driver pod name and no container name. That call works when the driver container is the only container in the pod, but the Kubernetes API rejects it with a 400 when the pod has multiple containers.
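To illustrate the suspected cause, here is a minimal sketch of a log fetch that names the container explicitly. It is not the provider's actual code: read_driver_logs and FakeCoreV1 are hypothetical names introduced for this example, and the default container name is simply the one listed in the error message above. The only real API assumed is read_namespaced_pod_log on the Kubernetes Python client's CoreV1Api, which accepts an optional container argument.

```python
from typing import Any, Optional

# Driver container name, taken from the error message in this report.
DRIVER_CONTAINER = "spark-kubernetes-driver"


def read_driver_logs(
    core_v1: Any,
    pod_name: str,
    namespace: str,
    container: Optional[str] = DRIVER_CONTAINER,
) -> str:
    """Fetch the logs of one container in a pod.

    core_v1 is expected to behave like kubernetes.client.CoreV1Api.
    Passing container=None reproduces the call shape that fails on
    multi-container pods.
    """
    kwargs = {"name": pod_name, "namespace": namespace}
    if container is not None:
        kwargs["container"] = container
    return core_v1.read_namespaced_pod_log(**kwargs)


class FakeCoreV1:
    """Hypothetical stand-in for CoreV1Api, for illustration only.

    The real client raises kubernetes.client.exceptions.ApiException
    with status 400; ValueError is used here to stay dependency-free.
    """

    def __init__(self, containers: dict):
        self.containers = containers  # container name -> log text

    def read_namespaced_pod_log(self, name, namespace, container=None, **_):
        if container is None:
            if len(self.containers) != 1:
                raise ValueError(
                    f"a container name must be specified for pod {name}, "
                    f"choose one of: [{' '.join(self.containers)}]"
                )
            container = next(iter(self.containers))
        return self.containers[container]
```

With a real cluster, the same helper could be called with a CoreV1Api instance in place of FakeCoreV1. Hard-coding the container name is a simplification; a fix in the sensor might instead make it configurable.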
I'm attaching my DAG and yaml files.
spark-py-pi-dag-and-yaml.tar.gz
What you expected to happen
The SparkKubernetesSensor should be able to get the driver container logs even if there are sidecar containers running alongside the driver.
How to reproduce
The attached YAML and DAG definition can be used to reproduce the issue.
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct