Fixed `databricks labs ucx repair-run` command to execute correctly #801
Codecov Report
All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##            main    #801    +/-
+ Coverage  84.07%  84.13%  +0.05%
  Files     39      39
  Lines     4872    4890    +18
  Branches  913     916     +3
+ Hits      4096    4114    +18
  Misses    564     564
  Partials  212     212
nfx left a comment:
Rewrite to retried decorator.
src/databricks/labs/ucx/install.py (outdated)
    while not state.result_state and (time.time() - start_time < timeout):
        logger.info("Waiting for the result_state to update the state")
        time.sleep(10)
This is not unit testable; see how we use the retried() decorator in the workspace access package (dbsql permissions, secrets ACLs, etc.).
@nfx Updated the code with retried logic.
src/databricks/labs/ucx/install.py (outdated)
    latest_job_run = job_runs[0]
    state = latest_job_run.state

    while not state.result_state and (time.time() - start_time < timeout):
Refactor this into a private method and decorate it with retried.
@nfx Refactored the same with retried logic.
    def _get_result_state(self, job_id):
        job_runs = list(self._ws.jobs.list_runs(job_id=job_id, limit=1))
        latest_job_run = job_runs[0]
        if not latest_job_run.state.result_state:
            logger.info("Waiting for the result_state to update the state")
            time.sleep(10)
        job_state = latest_job_run.state.result_state.value
        return job_state
Suggested change:
    def _get_result_state(self, job_id):
        job_runs = list(self._ws.jobs.list_runs(job_id=job_id, limit=1))
        if len(job_runs) == 0:
            raise AttributeError("no job runs found")
        latest_job_run = job_runs[0]
        if not latest_job_run.state.result_state:
            raise AttributeError("no result state in job run")
        job_state = latest_job_run.state.result_state.value
        return job_state
You have `retried(on=[AttributeError], ...)`, but don't throw it anywhere.
If latest_job_run.state is None, then latest_job_run.state.result_state.value will throw AttributeError. But I have now rewritten it to raise the exception explicitly.
For job runs, we exit immediately at the initial stage, with a proper message, if there is no job run for the job_id.
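The polling pattern discussed in this thread can be sketched as below. This is a minimal, self-contained illustration and not the code merged in the PR: the `retried` function here is a hypothetical stand-in written for this example (it is not the decorator the reviewer refers to), and `ResultState`, `RunState`, `Run`, `FakeJobs` and `get_result_state` are assumed names. It shows why raising `AttributeError` instead of sleeping inline makes the wait unit testable: the sleep function is injected, so a test can replace it.

```python
import functools
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResultState(Enum):
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


@dataclass
class RunState:
    result_state: Optional[ResultState] = None


@dataclass
class Run:
    state: RunState


def retried(*, on, timeout_seconds=20.0, delay_seconds=1.0,
            sleep=time.sleep, clock=time.monotonic):
    """Hypothetical stand-in: call the function again while it raises one of `on`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = clock() + timeout_seconds
            while True:
                try:
                    return func(*args, **kwargs)
                except tuple(on) as err:
                    # Give up once the deadline has passed.
                    if clock() >= deadline:
                        raise TimeoutError(f"timed out after {timeout_seconds}s") from err
                    sleep(delay_seconds)
        return wrapper
    return decorator


class FakeJobs:
    """Test double: every list_runs call returns the next canned response."""
    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = 0

    def list_runs(self, job_id, limit):
        self.calls += 1
        return iter(self._responses.pop(0))


def get_result_state(jobs, job_id, sleep=time.sleep):
    @retried(on=[AttributeError], timeout_seconds=20, sleep=sleep)
    def poll():
        job_runs = list(jobs.list_runs(job_id=job_id, limit=1))
        if len(job_runs) == 0:
            raise AttributeError("no job runs found")
        latest_job_run = job_runs[0]
        if not latest_job_run.state.result_state:
            # Not finished yet: raising triggers another attempt.
            raise AttributeError("no result state in job run")
        return latest_job_run.state.result_state.value
    return poll()


# Usage: two pending responses, then a finished run; no real sleeping happens.
pending = [Run(RunState())]
done = [Run(RunState(ResultState.FAILED))]
jobs = FakeJobs([pending, pending, done])
naps = []
state = get_result_state(jobs, job_id=123, sleep=naps.append)
```

Because the delay is passed in rather than hard-coded as `time.sleep(10)`, the test double records the requested pauses instead of waiting, which is the property the inline `while` loop lacked.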
# Conflicts: # src/databricks/labs/ucx/install.py
* Added `databricks labs ucx validate-groups-membership` command to validate groups to see if they have the same membership across account and workspace level ([#772](#772)).
* Added baseline for getting Azure Resource Role Assignments ([#764](#764)).
* Added issue and pull request templates ([#791](#791)).
* Added linked issues to PR template ([#793](#793)).
* Added optional `debug_truncate_bytes` parameter to the config and extend the default log truncation limit ([#782](#782)).
* Added support for crawling grants and applying Hive Metastore UDF ACLs ([#812](#812)).
* Changed Python requirement from 3.10.6 to 3.10 ([#805](#805)).
* Extend error handling of delta issues in crawlers and hive metastore ([#795](#795)).
* Fixed `databricks labs ucx repair-run` command to execute correctly ([#801](#801)).
* Fixed handling of `DELTASHARING` table format ([#802](#802)).
* Fixed listing of workflows via CLI ([#811](#811)).
* Fixed logger import path for DEBUG notebook ([#792](#792)).
* Fixed move table command to delete table/view regardless if permissions are present, skipping corrupted tables when crawling table size and making existing tests more stable ([#777](#777)).
* Fixed the issue of `databricks labs ucx installations` and `databricks labs ucx manual-workspace-info` ([#814](#814)).
* Increase the unit test coverage for cli.py ([#800](#800)).
* Mount Point crawler lists /Volume with four variations which is confusing ([#779](#779)).
* Updated README.md to remove mention of deprecated install.sh ([#781](#781)).
* Updated `bug` issue template ([#797](#797)).
* Fixed writing log readme in multiprocess safe way ([#794](#794)).
Changes
Fixes the `databricks labs ucx repair-run` CLI command. When the CLI tried to repair a job run before the run's response JSON had been updated to either FAILED or SUCCESS, it failed with a NoneType exception. Added a check in `repair_run` inside `install.py` that checks the status of the response and waits for 20 seconds for it to be updated. Enhanced the code so that an already repaired job run can be repaired again.
Linked issues
Resolves #787
Functionality
`databricks labs ucx repair-run`, which was failing in regression testing
Tests
`test_repair_run_result_state` in `test_install.py`