[EPIC]: Track migration process and display it in a dashboard #2074
Open
Labels
- EPIC
- feat/viz: vizualizing UCX progress as a redash/lakeview dashboard
- step/assessment: go/uc/upgrade - Assessment Step
Description
Is there an existing issue for this?
- I have searched the existing issues
Problem statement
There's no easy way to know what still needs to be migrated to UC within a given workspace
Proposed Solution
- [FEATURE] Create migration-progress scheduled daily workflow #2578
- [FEATURE] Support rerunning crawlers #2576
- [FEATURE] Update the inventory with the migration-progress crawled objects #2574
- [FEATURE] Persist the migration-progress crawled objects in the ucx.history table #2573
- [FEATURE] Create a ucx catalog via a cli-command #2571
- [FEATURE] Create a ucx.history table #2572
- [FEATURE] Create a ucx.workflow_runs table #2600
- [FEATURE] Create a ucx.errors table #2603
- [BUG]: Dashboard and workflow refresh during the migration-progress-experimental job do not update the historical log #3237
- [TODO] bring back self.tables_migrator.index() #3153
- [FEATURE] Create a
- [FEATURE] Skip the migration-progress run when the ucx catalog does not exist #2577
- [FEATURE] Skip the migration-progress run when the assessment job did not run yet #2816
- [FEATURE] Set RemoveAfter property on ucx catalogs in integration test #2594
  - Pre-requisite: Change in watchdog
- [FEATURE] Run the workflow static analysis as part of the migration progress workflow #2595
- [FEATURE] Visualize migration process in dashboard #2596
- [FEATURE] Encode dataclasses to a history log entry #3064
- [FEATURE]: History log encoder for clusters #3057
- [FEATURE]: History log encoder for grants #3058
- [FEATURE]: History log encoder for jobs #3059
- [FEATURE]: History log encoder for pipelines #3060
- [FEATURE]: History log encoder for tables #3061
- [FEATURE]: History log encoder for udfs #3062
- [FEATURE]: History log encoder for cluster policies #3063
- Dashboards
  - UsedTables
    - Add used tables to table/views as failure when referencing non-migrated table
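The history log encoder items above (#3064 and the per-object encoders) could look roughly like the following sketch. It is an illustration only: `TableRecord`, `HistoryEntry`, `encode`, and every field name are assumptions made for this example, not the actual UCX classes or the `ucx.history` schema.

```python
import dataclasses
import datetime as dt
import json
from typing import Any


@dataclasses.dataclass(frozen=True)
class TableRecord:
    """Hypothetical crawled object; real inventory classes carry more fields."""

    catalog: str
    database: str
    name: str
    failures: list[str] = dataclasses.field(default_factory=list)


@dataclasses.dataclass(frozen=True)
class HistoryEntry:
    """Hypothetical history row layout (an assumption, not the UCX schema)."""

    workspace_id: int
    run_id: int
    object_type: str
    object_id: list[str]
    data: dict[str, str]
    failures: list[str]
    snapshot_at: dt.datetime


def encode(record: Any, *, workspace_id: int, run_id: int,
           id_fields: list[str], snapshot_at: dt.datetime) -> HistoryEntry:
    """Flatten any dataclass instance into a HistoryEntry."""
    if not dataclasses.is_dataclass(record) or isinstance(record, type):
        raise TypeError("expected a dataclass instance")
    raw = dataclasses.asdict(record)
    # Failures get their own column so dashboards can filter on them.
    failures = list(raw.pop("failures", []))
    object_id = [str(raw[field]) for field in id_fields]
    # Store values as strings so one table can hold every object type.
    data = {key: value if isinstance(value, str) else json.dumps(value)
            for key, value in raw.items() if value is not None}
    return HistoryEntry(
        workspace_id=workspace_id,
        run_id=run_id,
        object_type=type(record).__name__,
        object_id=object_id,
        data=data,
        failures=failures,
        snapshot_at=snapshot_at,
    )


entry = encode(
    TableRecord("hive_metastore", "sales", "orders", ["not migrated"]),
    workspace_id=1,
    run_id=42,
    id_fields=["catalog", "database", "name"],
    snapshot_at=dt.datetime(2024, 9, 1, tzinfo=dt.timezone.utc),
)
```

Keeping the object-specific fields in a generic string map is one possible design: it lets clusters, grants, jobs, pipelines, tables, udfs, and cluster policies share a single table at the cost of typed columns.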
Potential challenges
- [FEATURE] Let crawlers support append to tables #2597
- For migration progress purposes, however, we need to either overwrite the tables, or add another column with a timestamp and modify the "fetch latest" queries to fetch the latest timestamp of the snapshot. Fetching the latest timestamp from the snapshot would let us build a bar-chart widget to see how fast the migration progresses, but we don't really care about that. If we do fetch-latest-timestamp, all our views and dashboards would become a bit more complicated, but that's fine. --> Decision is made to keep the current ucx inventory and store history in a separate table (see proposed solution above).
- Let's keep the status of migration progress in HMS (for now), but we can change this decision in a few weeks. --> Decision is made to store the migration process in a ucx catalog.
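For reference, the "fetch latest timestamp" idea discussed above can be sketched in plain Python. The row shape (`snapshot_at`, `name`, `migrated`) and both function names are assumptions for illustration; real queries would run as SQL against the history table.

```python
def latest_snapshot(rows: list[dict]) -> list[dict]:
    """Return only the rows that belong to the most recent snapshot."""
    if not rows:
        return []
    latest = max(row["snapshot_at"] for row in rows)
    return [row for row in rows if row["snapshot_at"] == latest]


def pending_per_snapshot(rows: list[dict]) -> dict[str, int]:
    """Count not-yet-migrated objects per snapshot, e.g. for a bar chart."""
    # Start every snapshot at zero so fully migrated snapshots still show up.
    counts = {row["snapshot_at"]: 0 for row in rows}
    for row in rows:
        if not row["migrated"]:
            counts[row["snapshot_at"]] += 1
    return dict(sorted(counts.items()))


# Hypothetical append-only history: ISO dates sort correctly as strings.
history = [
    {"snapshot_at": "2024-09-01", "name": "orders", "migrated": False},
    {"snapshot_at": "2024-09-01", "name": "customers", "migrated": False},
    {"snapshot_at": "2024-09-02", "name": "orders", "migrated": True},
    {"snapshot_at": "2024-09-02", "name": "customers", "migrated": False},
]
```

The extra filtering step in `latest_snapshot` is the added complexity the discussion refers to: every view over an append-only table needs it, whereas an overwritten table does not.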
Migration process crawlers
Assessment tasks that make sense to re-run on migration-progress workflow:
- crawl_tables
- assess_jobs - potentially harden the code there as well
- assess_clusters - potentially harden as well
- assess_pipelines - potentially harden as well
- crawl_cluster_policies
- assess_global_init_scripts
Not to be re-run:
- crawl_mounts - we already pre-created external locations
- setup_tacl - we don't need to crawl grants
- crawl_grants - no need to, I think
- estimate_table_size_for_migration - most likely not necessary
- guess_external_locations - we already migrated external locations by this point
- assess_incompatible_submit_runs - not going to be necessary in September
- workspace_listing - we are going to analyse only those notebooks that are part of jobs in the scope of static analysis
- crawl_permissions - we expect permissions to already be migrated
- crawl_groups - we expect groups to already be migrated
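One way to make this split explicit is an allow-list keyed by task name, sketched below. The task names come from the lists above; `RERUN_TASKS`, `SKIPPED_TASKS`, and `select_tasks` are hypothetical names for this example and do not describe how the UCX workflows are actually wired.

```python
# Tasks worth re-running in the migration-progress workflow.
RERUN_TASKS = (
    "crawl_tables",
    "assess_jobs",
    "assess_clusters",
    "assess_pipelines",
    "crawl_cluster_policies",
    "assess_global_init_scripts",
)

# Tasks deliberately left out, with the reason recorded next to each.
SKIPPED_TASKS = {
    "crawl_mounts": "external locations are already pre-created",
    "setup_tacl": "no need to crawl grants",
    "crawl_grants": "no need to crawl grants",
    "estimate_table_size_for_migration": "most likely not necessary",
    "guess_external_locations": "external locations are already migrated",
    "assess_incompatible_submit_runs": "not going to be necessary",
    "workspace_listing": "only notebooks that are part of jobs are analysed",
    "crawl_permissions": "permissions are expected to be migrated",
    "crawl_groups": "groups are expected to be migrated",
}


def select_tasks(assessment_tasks: list[str]) -> list[str]:
    """Keep the re-runnable tasks, preserving the original order."""
    unknown = [task for task in assessment_tasks
               if task not in RERUN_TASKS and task not in SKIPPED_TASKS]
    if unknown:
        # Force an explicit decision whenever a new assessment task appears.
        raise ValueError(f"unclassified assessment tasks: {unknown}")
    return [task for task in assessment_tasks if task in RERUN_TASKS]
```

Raising on unclassified names is a design choice in this sketch: a newly added assessment task cannot silently fall into either bucket.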
Additional Context
Projects
Status
Todo