Added w.config.account_host to get the relevant account host from a workspace client#390
Added w.config.account_host to get the relevant account host from a workspace client#390
w.config.account_host to get the relevant account host from a workspace client#390Conversation
Codecov ReportAll modified lines are covered by tests ✅
📢 Thoughts on this report? Let us know!. |
databricks/sdk/azure.py
Outdated
| active_directory_endpoint="https://login.microsoftonline.de/"), | ||
| GERMAN=AzureEnvironment( | ||
| name="AzureGermanCloud", | ||
| databricks_hostname_suffix='.databricks.azure.de', # TODO: i'm not sure about this one |
There was a problem hiding this comment.
I think this is OK because we don't run Databricks in this cloud (and it is being deprecated: https://learn.microsoft.com/en-us/previous-versions/azure/germany/germany-welcome)
| return self.host.startswith("https://accounts.") or self.host.startswith("https://accounts-dod.") | ||
|
|
||
| @property | ||
| def account_host(self) -> str: |
There was a problem hiding this comment.
This won't work for dev/staging, right? We occasionally get internal issues where the SDK doesn't work well with dev/staging.
Also, can we label this somehow as experimental? In the future, if we introduce a workspace-level metadata service, the accounts endpoint can be exposed there, and we can remove this property.
There was a problem hiding this comment.
can you suggest on the pattern here to make it work for dev/staging?
btw, once we introduce workspace-level metadata service, we should just print warnings from within this call.
There was a problem hiding this comment.
One idea: have a list to lookup cloud/env/account host name based on domain suffix, from most-specific to least-specific:
# Most specific to least specific
[
('.dev.databricks.com', 'AWS', 'DEV', 'PUBLIC', 'accounts.dev.databricks.com'),
('.staging.cloud.databricks.com', 'AWS', 'STAGING', 'PUBLIC', 'accounts.staging.cloud.databricks.com'),
('.cloud.databricks.com', 'AWS', 'PROD', 'PUBLIC', 'accounts.cloud.databricks.com'),
('.cloud.databricks.us', 'AWS', 'PROD', 'GOV', 'accounts.cloud.databricks.us'),
('.dev.azuredatabricks.net', 'AZURE', 'DEV', 'PUBLIC', 'accounts.dev.azuredatabricks.net')
('.staging.azuredatabricks.net', 'AZURE', 'STAGING', 'PUBLIC', 'accounts.staging.azuredatabricks.net'),
('.azuredatabricks.net', 'AZURE', 'PROD', 'PUBLIC', 'accounts.azuredatabricks.net'),
('.databricks.azure.us', 'AZURE', 'PROD', 'GOV', 'accounts.databricks.azure.us'),
('.dev.gcp.databricks.com', 'GCP', 'DEV', 'PUBLIC', 'accounts.dev.gcp.databricks.com'),
('.staging.gcp.databricks.com', 'GCP', 'STAGING', 'PUBLIC', 'accounts.staging.gcp.databricks.com'),
('.gcp.databricks.com', 'GCP', 'PROD', 'PUBLIC', 'accounts.gcp.databricks.com'),
]
The simpler version is to just replace the most specific zone with accounts, which works everywhere today.
There was a problem hiding this comment.
@mgyucht do we want to expose dev/staging?... if we do - then why don't we expose the azure_login_app_id along with it?
There was a problem hiding this comment.
That's definitely fine by me!
|
One nit to make this work for dev/staging & label as experimental, but otherwise this seems reasonable to me if it unblocks you/customers. |
|
all the info here is fine to be public ( and already public) |
mgyucht
left a comment
There was a problem hiding this comment.
This is great, thank you for contributing! Couple nits/suggestions, otherwise this looks good to me.
| return self.host.startswith("https://accounts.") or self.host.startswith("https://accounts-dod.") | ||
|
|
||
| @property | ||
| def account_host(self) -> str: |
There was a problem hiding this comment.
That's definitely fine by me!
databricks/sdk/core.py
Outdated
| cloud: Cloud | ||
|
|
||
| # domain suffixes are not very secret: https://crt.sh/?q=databricks | ||
| tld: str |
There was a problem hiding this comment.
nit: let's call this dns_zone (I think TLD would traditionally refer to com. or net.)
databricks/sdk/core.py
Outdated
| def deployment(self, deployment_name: str) -> str: | ||
| return f'https://{deployment_name}{self.tld}' |
There was a problem hiding this comment.
This is maybe just a bit abstract. It won't work in Azure the way a user might expect, where the DNS name can't be determined from the workspace ID alone. Given we're only using this for account host, I suggest we just inline it there.
There was a problem hiding this comment.
it's already part of our response:
for now, we work it around in https://github.com/databrickslabs/ucx/blob/main/src/databricks/labs/ucx/account/workspaces.py#L101-L126, but @iaroslav-db fixed the mapping so that we don't have to work it around soon ;)
I need this function to create a workspace client from an account client. See:
- https://github.com/databrickslabs/watchdog/blob/main/scan/scan.go#L144-L154
- https://github.com/databrickslabs/ucx/blob/main/src/databricks/labs/ucx/account/workspaces.py#L122-L126
- https://github.com/databricks/terraform-provider-databricks/blob/baa61c451055271e14e7a9bc3db43537dcfda8ea/mws/resource_mws_workspaces.go#L372-L377
I'll do a follow-up PR to account client once azure changes are rolled out to prod
There was a problem hiding this comment.
Oh perfect, I actually didn't realize those workspace APIs worked in azure yet.
… workspace client This simplifies downstream integration testing as well as configuration introduce `DatabricksEnvironment` ..
bea774f to
cb830fc
Compare
|
this needs a bit more adjustments to work |
|
Are similar PRs for the other SDKs tracked some place? |
|
@pietern i'll have one for GoLang the following week - once i figure out the way to have it friendly for unit tests, need the same thing for two projects |
…ere IMDS doesn't give host environment information (#700) ## Changes This PR allows determining Azure Environment from a Databricks account or workspace hostname, removing the need for a separate configuration/environment variable and complexities related to Azure MSI from within ACR. Similar functionality in Python SDK: - databricks/databricks-sdk-py#390 Stacked on top of: - #699 ## Tests <!-- How is this tested? Please see the checklist below and also describe any other relevant tests --> - [x] `make test` passing - [x] `make fmt` applied - [ ] relevant integration tests applied
|
Thanks for this contribution. We're closing this PR as part of regular housekeeping since it has been inactive for over a year. This is not a reflection on the quality of the work -- if the changes are still relevant, feel free to re-open the PR or open a new one and we'll be happy to review it. |
Changes
This simplifies downstream integration testing as well as configuration
Tests
make testrun locallymake fmtapplied