Assessment for RunSubmit API usages #266
Closed
Labels
cloud/azure, enhancement, feat/crawler, step/assessment, step/assign metastore
Description
Problem statement
Existing processes that create RunSubmit jobs (so-called ephemeral jobs) may break when UC is enabled in a given workspace (that is, when the workspace is assigned to a metastore) if the jobs have the following properties:
- DBR 11.X+
- [AWS-specific] Uses instance_profile
- Has no data_security_mode property specified in the job creation request
- [Azure-specific] Some configurations in the spark_conf (to be confirmed)
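The properties above can be expressed as a simple check over a cluster specification. The following is a minimal sketch, not the project's implementation: it assumes the cluster spec is available as a plain dict shaped like the new_cluster object of a job creation request, and the helper names (dbr_major, may_break_under_uc, uses_instance_profile) are hypothetical. The Azure-specific spark_conf condition is left out because it is still to be confirmed.

```python
import re
from typing import Optional


def dbr_major(spark_version: str) -> Optional[int]:
    """Extract the major DBR version from a string such as '11.3.x-scala2.12'."""
    match = re.match(r"(\d+)\.", spark_version)
    return int(match.group(1)) if match else None


def uses_instance_profile(new_cluster: dict) -> bool:
    """AWS-specific: True if the cluster spec references an instance profile."""
    return bool(new_cluster.get("aws_attributes", {}).get("instance_profile_arn"))


def may_break_under_uc(new_cluster: dict) -> bool:
    """True if the spec targets DBR 11+ and omits data_security_mode.

    Assumption: 'DBR 11.X+' is read as major version >= 11.
    """
    major = dbr_major(new_cluster.get("spark_version", ""))
    if major is None or major < 11:
        return False
    return "data_security_mode" not in new_cluster
```

On AWS, a spec for which both may_break_under_uc and uses_instance_profile return True would be the most likely candidate for a closer look.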
Why does this happen?
When UC is enabled, all RunSubmit jobs with DBR 11.X+ are UC-enabled by default. If there is a conflict in permissions between UC and the service principal (e.g. instance profile), the job will fail.
The reason and the change are described here.
How can we identify identical RunSubmit jobs?
This requires additional internal discussion.
TODO:
- get a list of all (persisted) jobs
- get a list of all job runs
- find job runs that have no persisted job (out of workflow)
- group all job runs from non-persisted jobs to approximate the number of unique Airflow/Azure Data Factory DAGs
- identify job runs that do not include data_security_mode in the job creation request and are run against 11.x compute
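The TODO steps could be sketched roughly as follows. This is only an illustration under stated assumptions: jobs and runs are assumed to be already fetched and represented as plain dicts with job_id, run_id and tasks keys, and all function names are hypothetical. The grouping key in particular is a guess (a hash of the task definitions), since how to identify identical RunSubmit jobs still requires discussion.

```python
import hashlib
import json
import re
from collections import defaultdict


def ephemeral_runs(persisted_job_ids, runs):
    """Keep only runs whose job_id does not belong to a persisted job."""
    persisted = set(persisted_job_ids)
    return [run for run in runs if run.get("job_id") not in persisted]


def fingerprint(run):
    """Hypothetical grouping key: short hash of the run's task definitions."""
    payload = json.dumps(run.get("tasks", []), sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]


def group_by_fingerprint(runs):
    """Group run IDs by fingerprint to approximate the number of unique DAGs."""
    groups = defaultdict(list)
    for run in runs:
        groups[fingerprint(run)].append(run["run_id"])
    return dict(groups)


def lacks_security_mode_on_dbr11(run):
    """True if any task cluster is on DBR 11+ without data_security_mode.

    Assumption: '11.x compute' is read as major version >= 11.
    """
    for task in run.get("tasks", []):
        cluster = task.get("new_cluster")
        if not cluster:
            continue
        match = re.match(r"(\d+)\.", cluster.get("spark_version", ""))
        if match and int(match.group(1)) >= 11 and "data_security_mode" not in cluster:
            return True
    return False
```

The number of keys returned by group_by_fingerprint gives a rough count of distinct callers, and filtering the ephemeral runs with lacks_security_mode_on_dbr11 yields the runs to report in the assessment.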