Crawler for Externally Orchestrated Jobs with Failing Configuration #395

Closed

zpappa wants to merge 2 commits into main from feature/external-orchestrator-job-run-crawler

zpappa commented Oct 6, 2023

Resolves #266

Added ExternallyOrchestratedJobsWithFailingConfigCrawler
Added a crawler that inspects JobRuns from the SDK and determines which of the job runs were submitted through the RunsSubmit API

Added Unit Tests
Added tests to cover basic logic and some edge cases

Integration Tests Pending

zpappa requested review from a team

October 6, 2023 16:40

zpappa mentioned this pull request

Crawler for RunSubmit API usages from External Orchestrators (ADF/Airflow) #366

Closed

codecov bot commented Oct 6, 2023 (edited)

Codecov Report

Attention: 52 lines in your changes are missing coverage. Please review.

Comparison is base (606dd72) 85.67% compared to head (02dfe85) 82.61%.
Report is 1 commit behind head on main.

❗ Current head 02dfe85 differs from pull request most recent head 7801cc8. Consider uploading reports for the commit 7801cc8 to get more accurate results

Files	Patch %	Lines
src/databricks/labs/ucx/assessment/crawlers.py	55.17%	34 Missing and 18 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
- Coverage   85.67%   82.61%   -3.07%     
==========================================
  Files          42       30      -12     
  Lines        5311     2490    -2821     
  Branches      969      445     -524     
==========================================
- Hits         4550     2057    -2493     
+ Misses        542      326     -216     
+ Partials      219      107     -112

nfx requested changes

Collaborator

nfx left a comment

Needs a passing integration test. Please attach a screenshot once you get it to work locally.

pohlposition added the step/assessment label

pohlposition added this to the 1 week milestone

FastLee requested changes

src/databricks/labs/ucx/assessment/crawlers.py Outdated

		failures: str


		@dataclass

Contributor

FastLee Oct 9, 2023

Do we want to capture only the failing ones, or all of them with the failing ones annotated?

src/databricks/labs/ucx/assessment/crawlers.py Outdated

+              def is_custom_image(version_string: str):
+                  """
+                  Is this a custom version?
+                  """

Contributor

FastLee Oct 9, 2023

Does it require implementation?

src/databricks/labs/ucx/assessment/crawlers.py Outdated

+                  pattern = r"^(?P<major>\d+)?\.(?P<minor>\d+)?\.(?P<patch>[\dx]+)?.*"
+                  lvg = re.match(pattern, left_version)
+                  rvg = re.match(pattern, right_version)
+                  left = (int(lvg.group("major")), int(lvg.group("minor")))

Contributor

FastLee Oct 9, 2023

Do we have a proper unit test for that?
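One way to make this logic testable is to isolate the parsing step and assert on it directly. The sketch below reuses the pattern from the diff; the helper names `parse_major_minor` and `is_older` are hypothetical and not taken from the PR. It also guards against two cases the excerpt does not handle: `re.match` returning `None`, and an optional group matching nothing, either of which would make `int(...)` raise.

```python
import re
from typing import Optional

# Same pattern as in the diff under review.
PATTERN = r"^(?P<major>\d+)?\.(?P<minor>\d+)?\.(?P<patch>[\dx]+)?.*"


def parse_major_minor(version: str) -> Optional[tuple[int, int]]:
    """Return (major, minor), or None when the string cannot be parsed.

    Hypothetical helper, not part of the PR.
    """
    match = re.match(PATTERN, version)
    if match is None:
        return None
    major, minor = match.group("major"), match.group("minor")
    if major is None or minor is None:
        # Both groups are optional in the pattern, so they can be missing.
        return None
    return int(major), int(minor)


def is_older(left_version: str, right_version: str) -> bool:
    """True when left is strictly older than right; False if either is unparseable."""
    left = parse_major_minor(left_version)
    right = parse_major_minor(right_version)
    if left is None or right is None:
        return False
    return left < right  # tuple comparison is numeric, so 9.1 < 10.4
```

Returning `False` for unparseable input is an assumption made for this sketch; the crawler might instead want to record such versions as a separate failure.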

src/databricks/labs/ucx/assessment/crawlers.py Outdated



		def get_job_cluster_from_task(
		task: RunTask, job_run: BaseRun, all_clusters: dict[str, ClusterDetails]

Contributor

FastLee Oct 9, 2023

Refactor job cluster to use the same mechanism

src/databricks/labs/ucx/assessment/crawlers.py Outdated

+                      self._ws = ws
+                  def _crawl(self) -> list[ExternallyOrchestratedJobRunWithFailingConfiguration]:
+                      no_of_days_back = datetime.timedelta(days=30)  # todo make configurable in yaml?

Contributor

FastLee Oct 9, 2023

Make timedelta externally configurable
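A minimal sketch of what this could look like, assuming the look-back window is read from a configuration object rather than hard-coded. `CrawlerConfig`, `lookback_days`, and `start_time_from_ms` are hypothetical names, and returning epoch milliseconds is likewise an assumption about what the caller needs.

```python
import datetime
from dataclasses import dataclass


@dataclass
class CrawlerConfig:
    # Hypothetical setting; 30 matches the hard-coded value in the diff.
    lookback_days: int = 30


def start_time_from_ms(config: CrawlerConfig, now: datetime.datetime) -> int:
    """Earliest run start time to include, as epoch milliseconds."""
    cutoff = now - datetime.timedelta(days=config.lookback_days)
    return int(cutoff.timestamp() * 1000)
```

Passing `now` in as an argument keeps the function deterministic, which makes the window easy to cover in a unit test.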

nfx assigned zpappa

nfx added the step/assign metastore label

zpappa force-pushed the feature/external-orchestrator-job-run-crawler branch from 02dfe85 to ea103b4

October 20, 2023 13:02

zpappa had a problem deploying to account-admin with GitHub Actions (Failure)

October 20, 2023 13:03


          Checking in changes for feature

14daf86

zpappa force-pushed the feature/external-orchestrator-job-run-crawler branch from ea103b4 to 14daf86

October 20, 2023 13:03

zpappa had a problem deploying to account-admin with GitHub Actions (Failure)

October 20, 2023 13:03

CLAassistant commented Nov 27, 2023 (edited)

All committers have signed the CLA.

dipankarkush-db assigned dipankarkush-db and unassigned zpappa

nfx assigned renardeinside and unassigned dipankarkush-db

FastLee closed this

nfx reopened this


          Merge remote-tracking branch 'origin/main' into feature/external-orchestrator-job-run-crawler

7801cc8

# Conflicts:
#	src/databricks/labs/ucx/assessment/crawlers.py
#	tests/unit/assessment/test_assessment.py

renardeinside had a problem deploying to account-admin with GitHub Actions (Failure)

January 29, 2024 16:59

nfx mentioned this pull request

Added assessment for the incompatible RunSubmit API usages #849

Merged

nfx closed this

databrickslabs locked and limited conversation to collaborators

nfx deleted the feature/external-orchestrator-job-run-crawler branch

April 4, 2024 22:32

Labels

step/assessment step/assign metastore