Added databricks labs ucx validate-external-locations command for cli #715

Merged
nfx merged 19 commits into main from feature/external_location_validations
Dec 21, 2023

Conversation

Contributor

@HariGS-DB HariGS-DB commented Dec 20, 2023

Description:
This change adds a step that validates the external locations referenced by external tables against the external locations already defined in Unity Catalog. For locations that already exist, it prints the count of tables that can be migrated. For locations that are missing, it lists the external locations that need to be created and generates a tf file for them.

Changes:
labs.yml - adds the new command
cli.py - adds the code that invokes the new command
locations.py - adds a field with the count of tables using each external location; also adds logic to identify duplicates for JDBC connection objects, and functionality to compare locations, print the details, and generate the tf file
Updated the test cases for locations
Added new test cases for mapping

@HariGS-DB HariGS-DB requested a review from a team December 20, 2023 16:03
@HariGS-DB HariGS-DB linked an issue Dec 20, 2023 that may be closed by this pull request
@HariGS-DB HariGS-DB requested review from nfx and priyal-c and removed request for priyal-c December 20, 2023 16:03

codecov bot commented Dec 20, 2023

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (2168f20) 79.52% compared to head (84a3f6e) 79.73%.

❗ Current head 84a3f6e differs from pull request most recent head 5674241. Consider uploading reports for the commit 5674241 to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| src/databricks/labs/ucx/cli.py | 18.18% | 9 Missing ⚠️ |
| ...rc/databricks/labs/ucx/hive_metastore/locations.py | 94.59% | 2 Missing and 2 partials ⚠️ |
| src/databricks/labs/ucx/install.py | 50.00% | 0 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #715      +/-   ##
==========================================
+ Coverage   79.52%   79.73%   +0.20%     
==========================================
  Files          42       42              
  Lines        4294     4376      +82     
  Branches      790      807      +17     
==========================================
+ Hits         3415     3489      +74     
- Misses        675      679       +4     
- Partials      204      208       +4     
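
The figures in the coverage diff above are consistent with coverage being hits divided by lines, truncated (not rounded) to two decimals. That reading is inferred from the numbers in this report, not taken from Codecov documentation, and the helper name `truncated_pct` is made up for the example. A short check:

```python
from fractions import Fraction
from math import floor


def truncated_pct(ratio: Fraction) -> float:
    """Express a ratio as a percentage truncated to two decimals."""
    return floor(ratio * 10000) / 100


# Hits, misses, partials, and lines taken from the report above.
base = {"hits": 3415, "misses": 675, "partials": 204, "lines": 4294}
head = {"hits": 3489, "misses": 679, "partials": 208, "lines": 4376}

for side in (base, head):
    # Every line is counted as exactly one of hit, miss, or partial.
    assert side["hits"] + side["misses"] + side["partials"] == side["lines"]

base_ratio = Fraction(base["hits"], base["lines"])
head_ratio = Fraction(head["hits"], head["lines"])

base_pct = truncated_pct(base_ratio)               # 79.52
head_pct = truncated_pct(head_ratio)               # 79.73
delta_pct = truncated_pct(head_ratio - base_ratio) # 0.2
```

The delta is taken from the untruncated ratios, which is why the report shows +0.20 although the two displayed percentages differ by 0.21.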

@HariGS-DB HariGS-DB changed the title Validate External Location Add validate_external_location cli command to validate matching external location and generate tf script for missing locations Dec 20, 2023
Collaborator

@nfx nfx left a comment

Fix bugs and change the PR title to mention the full command as users would invoke it.

@HariGS-DB HariGS-DB changed the title Add validate_external_location cli command to validate matching external location and generate tf script for missing locations Added databricks labs ucx validate-external-locations commands for cli Dec 21, 2023
@HariGS-DB HariGS-DB changed the title Added databricks labs ucx validate-external-locations commands for cli Added databricks labs ucx validate-external-locations commands for cli Dec 21, 2023
@nfx nfx changed the title Added databricks labs ucx validate-external-locations commands for cli Added databricks labs ucx validate-external-locations command for cli Dec 21, 2023
Collaborator

@nfx nfx left a comment

lgtm

@nfx nfx merged commit 46ea0f1 into main Dec 21, 2023
@nfx nfx deleted the feature/external_location_validations branch December 21, 2023 15:47
nfx added a commit that referenced this pull request Dec 21, 2023
* Added `databricks labs ucx create-table-mapping` and `databricks labs ucx manual-workspace-info` commands for CLI ([#682](#682)).
* Added `databricks labs ucx installations` command ([#679](#679)).
* Added `databricks labs ucx skip --schema ... --table ...` command to mark table/schema for skipping in the table migration process ([#680](#680)).
* Added `databricks labs ucx validate-external-locations` command for cli ([#715](#715)).
* Added `workspace_group_name` and `account_group_name` to `make_ucx_group` fixture ([#664](#664)).
* Added capturing `ANY FILE` and `ANONYMOUS FUNCTION` grants ([#653](#653)).
* Added cluster override and handle case of write protected DBFS ([#610](#610)).
* Added cluster policy selector in the installer ([#655](#655)).
* Added detailed UCX pre-requisite information to README.md ([#689](#689)).
* Added filters to run only cloud specific task ([#681](#681)).
* Added interactive wizard for `databricks labs uninstall ucx` command ([#657](#657)).
* Added more granular error retry logic ([#704](#704)).
* Added parallel fetching of registered model identifiers to speed-up assessment workflow ([#691](#691)).
* Added retry on workspace listing ([#659](#659)).
* Added support for mapping workspace group to account group by prefix/suffix/regex/external id ([#650](#650)).
* Changed cluster security mode from NONE to LEGACY_SINGLE_USER, as `crawl_tables` was failing when run on non-UC Workspace in No Isolation mode with unable to access the config file ([#661](#661)).
* Changed the fields of the table "Tables" to lower case ([#684](#684)).
* Cleanup README.md ([#695](#695)).
* Cleanup installer framework and speed up test execution ([#711](#711)).
* Decouple group configuration from `install.py` ([#714](#714)).
* Enabled integration tests for `EXTERNAL` table migrations ([#677](#677)).
* Enforced `mypy` validation ([#713](#713)).
* Filtering out inventory database from loading into tables and filtering out the same from grant detail view ([#705](#705)).
* Fixed documentation for `reflect_account_groups_on_workspace` task and updated `CONTRIBUTING.md` guide ([#654](#654)).
* Fixed for secret scope apply task always raises ValueError ([#683](#683)).
* Fixed some flaky integration tests ([#700](#700)).
* More `mypy` chores ([#697](#697)).
* Moved `ExternalLocations` and `Mounts` to `locations` module ([#692](#692)).
* New CLI command for workspace mapping ([#678](#678)).
* No more `mypy` chores after this ([#699](#699)).
* Reduce server load for getting workspace groups and their members ([#666](#666)).
* Some `mypy` chores ([#696](#696)).
* Throwing ManyError on migrate-groups tasks ([#710](#710)).
* Updated installation documentation to use Databricks CLI ([#686](#686)).

Dependency updates:

 * Updated databricks-sdk requirement from ~=0.13.0 to ~=0.14.0 ([#651](#651)).
 * Updated databricks-sdk requirement from ~=0.14.0 to ~=0.15.0 ([#687](#687)).
 * Updated databricks-sdk requirement from ~=0.15.0 to ~=0.16.0 ([#712](#712)).
@nfx nfx mentioned this pull request Dec 21, 2023
nfx added a commit that referenced this pull request Dec 21, 2023
* Added `databricks labs ucx create-table-mapping` and `databricks labs ucx manual-workspace-info` commands for CLI ([#682](#682)).
* Added `databricks labs ucx ensure-assessment-run` to CLI commands ([#708](#708)).
* Added `databricks labs ucx installations` command ([#679](#679)).
* Added `databricks labs ucx skip --schema ... --table ...` command to mark table/schema for skipping in the table migration process ([#680](#680)).
* Added `databricks labs ucx validate-external-locations` command for cli ([#715](#715)).
* Added capturing `ANY FILE` and `ANONYMOUS FUNCTION` grants ([#653](#653)).
* Added cluster override and handle case of write protected DBFS ([#610](#610)).
* Added cluster policy selector in the installer ([#655](#655)).
* Added detailed UCX pre-requisite information to README.md ([#689](#689)).
* Added interactive wizard for `databricks labs uninstall ucx` command ([#657](#657)).
* Added more granular error retry logic ([#704](#704)).
* Added parallel fetching of registered model identifiers to speed-up assessment workflow ([#691](#691)).
* Added retry on workspace listing ([#659](#659)).
* Added support for mapping workspace group to account group by prefix/suffix/regex/external id ([#650](#650)).
* Changed cluster security mode from NONE to LEGACY_SINGLE_USER, as `crawl_tables` was failing when run on non-UC Workspace in No Isolation mode with unable to access the config file ([#661](#661)).
* Changed the fields of the table "Tables" to lower case ([#684](#684)).
* Enabled integration tests for `EXTERNAL` table migrations ([#677](#677)).
* Enforced `mypy` validation ([#713](#713)).
* Filtering out inventory database from loading into tables and filtering out the same from grant detail view ([#705](#705)).
* Fixed documentation for `reflect_account_groups_on_workspace` task and updated `CONTRIBUTING.md` guide ([#654](#654)).
* Fixed secret scope apply task to raise ValueError ([#683](#683)).
* Fixed legacy table ACL ownership migration and other integration testing issues ([#722](#722)).
* Fixed some flaky integration tests ([#700](#700)).
* New CLI command for workspace mapping ([#678](#678)).
* Reduce server load for getting workspace groups and their members ([#666](#666)).
* Throwing ManyError on migrate-groups tasks ([#710](#710)).
* Updated installation documentation to use Databricks CLI ([#686](#686)).

Dependency updates:

 * Updated databricks-sdk requirement from ~=0.13.0 to ~=0.14.0 ([#651](#651)).
 * Updated databricks-sdk requirement from ~=0.14.0 to ~=0.15.0 ([#687](#687)).
 * Updated databricks-sdk requirement from ~=0.15.0 to ~=0.16.0 ([#712](#712)).
@nfx nfx mentioned this pull request Dec 21, 2023
nfx added a commit that referenced this pull request Dec 21, 2023
FastLee pushed a commit that referenced this pull request Jan 19, 2024
…li (#715)

FastLee pushed a commit that referenced this pull request Jan 19, 2024
Development

Successfully merging this pull request may close these issues.

Create EXTERNAL LOCATIONs to map to External Tables (Azure)

2 participants