Skip to content

Crawler for Externally Orchestrated Jobs with Failing Configuration#395

Closed
zpappa wants to merge 2 commits intomainfrom
feature/external-orchestrator-job-run-crawler
Closed

Crawler for Externally Orchestrated Jobs with Failing Configuration#395
zpappa wants to merge 2 commits intomainfrom
feature/external-orchestrator-job-run-crawler

Conversation

@zpappa
Copy link
Copy Markdown

@zpappa zpappa commented Oct 6, 2023

Resolves #266

Added ExternallyOrchestratedJobsWithFailingConfigCrawler
Added a crawler to look at JobRuns from the SDK and determine which of the job runs are from the RunsSubmit API

Added Unit Tests
Added tests to cover basic logic and some edge cases

Integration Tests Pending

@codecov
Copy link
Copy Markdown

codecov bot commented Oct 6, 2023

Codecov Report

Attention: 52 lines in your changes are missing coverage. Please review.

Comparison is base (606dd72) 85.67% compared to head (02dfe85) 82.61%.
Report is 1 commits behind head on main.

❗ Current head 02dfe85 differs from pull request most recent head 7801cc8. Consider uploading reports for the commit 7801cc8 to get more accurate results

Files Patch % Lines
src/databricks/labs/ucx/assessment/crawlers.py 55.17% 34 Missing and 18 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
- Coverage   85.67%   82.61%   -3.07%     
==========================================
  Files          42       30      -12     
  Lines        5311     2490    -2821     
  Branches      969      445     -524     
==========================================
- Hits         4550     2057    -2493     
+ Misses        542      326     -216     
+ Partials      219      107     -112     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Collaborator

@nfx nfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a passing integration test. please attach a screenshot once you get it to work locally

@pohlposition pohlposition added the step/assessment go/uc/upgrade - Assessment Step label Oct 6, 2023
@pohlposition pohlposition added this to the 1 week milestone Oct 6, 2023
failures: str


@dataclass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to capture only the failing ones or all of them with the failing one anotated?

def is_custom_image(version_string: str):
"""
Is this a custom version?
"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it require implementation?

pattern = r"^(?P<major>\d+)?\.(?P<minor>\d+)?\.(?P<patch>[\dx]+)?.*"
lvg = re.match(pattern, left_version)
rvg = re.match(pattern, right_version)
left = (int(lvg.group("major")), int(lvg.group("minor")))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a proper unit test for that?



def get_job_cluster_from_task(
task: RunTask, job_run: BaseRun, all_clusters: dict[str, ClusterDetails]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re factor job cluster to use the same mechanism

self._ws = ws

def _crawl(self) -> list[ExternallyOrchestratedJobRunWithFailingConfiguration]:
no_of_days_back = datetime.timedelta(days=30) # todo make configurable in yaml?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make timedelta externally configurable

@nfx nfx added the step/assign metastore go/uc/upgrade Assign Metastore label Oct 19, 2023
@zpappa zpappa force-pushed the feature/external-orchestrator-job-run-crawler branch from 02dfe85 to ea103b4 Compare October 20, 2023 13:02
@zpappa zpappa force-pushed the feature/external-orchestrator-job-run-crawler branch from ea103b4 to 14daf86 Compare October 20, 2023 13:03
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Nov 27, 2023

CLA assistant check
All committers have signed the CLA.

@nfx nfx assigned renardeinside and unassigned dipankarkush-db Dec 14, 2023
@FastLee FastLee closed this Jan 26, 2024
@nfx nfx reopened this Jan 29, 2024
…estrator-job-run-crawler

# Conflicts:
#	src/databricks/labs/ucx/assessment/crawlers.py
#	tests/unit/assessment/test_assessment.py
@nfx nfx closed this Jan 30, 2024
@databrickslabs databrickslabs locked and limited conversation to collaborators Jan 30, 2024
@nfx nfx deleted the feature/external-orchestrator-job-run-crawler branch April 4, 2024 22:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

step/assessment go/uc/upgrade - Assessment Step step/assign metastore go/uc/upgrade Assign Metastore

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Assessment for RunSubmit API usages

7 participants