Introduce ObjectFailure with step, object_type and object_id fields to find the causes of failures quicker for different stages of the workflow #445
Closed
Labels
cloud/azure, enhancement, migrate/clusters, step/assessment, tech debt
Description
For example, there is currently a failure with cluster policy retrieval:
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/framework/tasks.py", line 143, in trigger
    current_task.fn(cfg)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/runtime.py", line 153, in assess_azure_service_principals
    crawler.snapshot()
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/crawlers.py", line 369, in snapshot
    return self._snapshot(self._try_fetch, self._crawl)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/framework/crawlers.py", line 244, in _snapshot
    loaded_records = list(loader())
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/crawlers.py", line 182, in _crawl
    all_relevant_service_principals = self._get_relevant_service_principals()
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/crawlers.py", line 277, in _get_relevant_service_principals
    temp_list = self._list_all_jobs_with_spn_in_spark_conf()
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/crawlers.py", line 297, in _list_all_jobs_with_spn_in_spark_conf
    policy = self._ws.cluster_policies.get(cluster_config.policy_id)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/sdk/service/compute.py", line 3589, in get
    res = self._api.do('GET', '/api/2.0/policies/clusters/get', query=query, headers=headers)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/sdk/core.py", line 1061, in do
    return retryable(self._perform)(method,
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/sdk/retries.py", line 47, in wrapper
    raise err
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/sdk/retries.py", line 29, in wrapper
    return func(*args, **kwargs)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/sdk/core.py", line 1150, in _perform
    raise self._make_nicer_error(response=response, **payload) from None
databricks.sdk.core.DatabricksError: Can't find a cluster policy with id: XXXXX.
However, the error does not show whether the missing policy is attached to a cluster, a job, or a pipeline, and it does not identify which specific object is affected. We need to refactor how exceptions are raised so that #406 has a good effect.
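A minimal sketch of what such an exception could look like, using the three fields named in the title (`step`, `object_type`, `object_id`). Everything else here is an assumption for illustration: `get_policy` is a hypothetical stand-in for the SDK lookup and raises a plain `LookupError` instead of `DatabricksError`, and `check_job_policy` is a made-up helper, not existing ucx code.

```python
class ObjectFailure(Exception):
    """Failure tied to a specific workspace object (illustrative sketch only)."""

    def __init__(self, step: str, object_type: str, object_id: str, message: str = ""):
        super().__init__(message)
        self.step = step                # workflow stage, e.g. "assessment"
        self.object_type = object_type  # e.g. "cluster", "job", "pipeline"
        self.object_id = object_id      # identifier of the failing object
        self.message = message

    def __str__(self) -> str:
        return f"[{self.step}] {self.object_type} {self.object_id}: {self.message}"


def get_policy(policy_id: str) -> dict:
    # Hypothetical stand-in for the real policy lookup; always fails here.
    raise LookupError(f"Can't find a cluster policy with id: {policy_id}.")


def check_job_policy(job_id: str, policy_id: str) -> dict:
    # Hypothetical helper: wrap the low-level error with object context
    # and keep the original exception as the cause.
    try:
        return get_policy(policy_id)
    except LookupError as err:
        raise ObjectFailure("assessment", "job", job_id, str(err)) from err
```

With this shape, the caller catching `ObjectFailure` can log or persist the step, object type, and object id alongside the original message, while `raise ... from err` preserves the underlying traceback.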