Closed
Labels
area:providers, good first issue, kind:bug, provider:databricks
Description
Apache Airflow Provider(s)
databricks
Versions of Apache Airflow Providers
We use DatabricksSubmitRunOperator to run a multi-task Databricks job, and this happens across essentially all provider versions we have tried. When the Databricks job fails, DatabricksSubmitRunOperator raises the error below. The cause is that the operator calls the runs get-output API with the job run id instead of the individual task run ids, and get-output does not support multi-task job runs (see the sketch right after this description for how per-task outputs can be retrieved).
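For context, here is a minimal sketch of fetching per-task outputs via the Databricks Jobs REST API 2.1 (runs/get returns a `tasks` array with one run id per task, and runs/get-output only accepts those per-task run ids). The host, token, and run id values are hypothetical placeholders, not taken from the logs below:

```python
import requests

# Placeholder values for illustration only.
DATABRICKS_HOST = "https://<workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"
JOB_RUN_ID = 123456  # the multi-task job run id the operator stores

headers = {"Authorization": f"Bearer {TOKEN}"}

# runs/get on the parent job run returns a `tasks` array,
# each entry carrying its own per-task run_id.
run = requests.get(
    f"{DATABRICKS_HOST}/api/2.1/jobs/runs/get",
    headers=headers,
    params={"run_id": JOB_RUN_ID},
).json()

# runs/get-output must be called once per task run; calling it with the
# parent job run id produces the INVALID_PARAMETER_VALUE error shown below.
for task in run.get("tasks", []):
    output = requests.get(
        f"{DATABRICKS_HOST}/api/2.1/jobs/runs/get-output",
        headers=headers,
        params={"run_id": task["run_id"]},
    ).json()
    print(task["task_key"], output.get("error"))
```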
Error
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 355, in _do_api_call
for attempt in self._get_retry_object():
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __iter__
do = self.iter(retry_state=retry_state)
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/tenacity/__init__.py", line 349, in iter
return fut.result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 365, in _do_api_call
response.raise_for_status()
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/requests/models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/providers/databricks/operators/databricks.py", line 375, in execute
_handle_databricks_operator_execution(self, hook, self.log, context)
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/providers/databricks/operators/databricks.py", line 90, in _handle_databricks_operator_execution
run_output = hook.get_run_output(operator.run_id)
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/providers/databricks/hooks/databricks.py", line 280, in get_run_output
run_output = self._do_api_call(OUTPUT_RUNS_JOB_ENDPOINT, json)
File "/home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 371, in _do_api_call
raise AirflowException(
airflow.exceptions.AirflowException: Response: b'{"error_code":"INVALID_PARAMETER_VALUE","message":"Retrieving the output of runs with multiple tasks is not supported. Please retrieve the output of each individual task run instead."}', Status Code: 400
[2023-01-10, 05:15:12 IST] {taskinstance.py} INFO - Marking task as FAILED. dag_id=experiment_metrics_store_experiment_4, task_id=, execution_date=20230109T180804, start_date=20230109T180810, end_date=20230109T181512
[2023-01-10, 05:15:13 IST] {warnings.py} WARNING - /home/ubuntu/.venv/airflow/lib/python3.8/site-packages/airflow/utils/email.py:119: PendingDeprecationWarning: Fetching SMTP credentials from configuration variables will be deprecated in a future release. Please set credentials using a connection instead.
send_mime_email(e_from=mail_from, e_to=recipients, mime_msg=msg, conn_id=conn_id, dryrun=dryrun)
Apache Airflow version
2.3.2
Operating System
macOS
Deployment
Other
Deployment details
No response
What happened
No response
What you think should happen instead
No response
How to reproduce
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct