[ISSUE] Authentication fails with auth-type=azure-cli and managed identity from 0.30.0 #742

@gidiLuke

Description

We are encountering an issue with managed identities when using the Databricks SDK for Python. Specifically, the problem seems related to the handling of the tenant_id when retrieving the OAuth token using a managed identity account. It appears that the following PR, released in version 0.30.0, introduced the issue: Infer Azure tenant ID if not set (#638).

When retrieving the OAuth token with a managed identity (and possibly also with service principal authentication), the tenant_id should not be passed to the az account get-access-token command. However, it currently is passed as --tenant, which leads to an authentication error (see the stack trace under Debug Logs).

Reproduction

The error is specific to the Python SDK. In the Databricks CLI, the tenant ID is not inferred and authentication works.

To trigger the error, initialise a client such as WorkspaceClient with auth-type "azure-cli" on a VM whose Azure CLI session is logged in with a managed identity:

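from databricks.sdk import WorkspaceClient

# The workspace host is read from the environment or config (e.g. DATABRICKS_HOST).
WorkspaceClient(auth_type="azure-cli")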

Expected behavior

The tenant_id should not be included in the az account get-access-token command when a managed identity is used for authentication. Omitting it would prevent the error shown under Debug Logs.
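A minimal sketch of the expected behaviour, not the SDK's actual code. The helper name build_get_token_cmd and the is_managed_identity flag are hypothetical; only the az arguments are taken from the failing command in the debug logs. How the SDK would detect a managed identity login is left open here.

```python
from typing import List, Optional


def build_get_token_cmd(
    resource: str,
    tenant_id: Optional[str] = None,
    is_managed_identity: bool = False,  # hypothetical flag, not an SDK setting
) -> List[str]:
    """Build the `az account get-access-token` argument list.

    `--tenant` is appended only when a tenant ID is known AND the Azure CLI
    session is not a managed identity login, because the CLI rejects
    `--tenant` for managed identity accounts.
    """
    cmd = ["az", "account", "get-access-token", "--resource", resource, "--output", "json"]
    if tenant_id and not is_managed_identity:
        cmd += ["--tenant", tenant_id]
    return cmd
```

With is_managed_identity=True the returned list never contains --tenant, even if a tenant ID was inferred.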

Is it a regression?

Regression: Yes. Authentication worked prior to 0.30.0; the failure was introduced by the following change: Infer Azure tenant ID if not set (#638).

Debug Logs

subprocess.CalledProcessError: Command '['az', 'account', 'get-access-token', '--resource', '<databricks_application_id>', '--output', 'json', '--tenant', '<our tenant id>']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/databricks/sdk/credentials_provider.py", line 505, in azure_cli
    token_source = AzureCliTokenSource.for_resource(cfg, cfg.effective_azure_login_app_id)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/databricks/sdk/credentials_provider.py", line 483, in for_resource
    token_source.token()
  File "/usr/local/lib/python3.11/site-packages/databricks/sdk/oauth.py", line 183, in token
    self._token = self.refresh()
                  ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/databricks/sdk/credentials_provider.py", line 429, in refresh
    raise IOError(f'cannot get access token: {message}') from e
OSError: cannot get access token: ERROR: Tenant shouldn't be specified for managed identity account

Other Information

  • OS: [e.g., macOS, Ubuntu]
  • Version: [e.g., Databricks SDK 0.30.0, 0.31.0]

Might be related to issue #726.

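Workaround

As a temporary workaround, you can log in with the managed identity yourself, fetch a short-lived Azure access token for the Databricks resource, and pass it to the SDK through DATABRICKS_TOKEN, where it is used like a PAT. The token expires, so it has to be refreshed for long-running processes: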

az login --identity --user f6dbab5f-d722-48e6-9ddd-b87be9d568d2
export DATABRICKS_TOKEN=$(az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d --query accessToken -o tsv)

Optionally, you can also set the auth-type to pat:

export DATABRICKS_AUTH_TYPE=pat
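The same workaround can be sketched in Python, assuming az login --identity has already been run on the VM. The function name fetch_azure_token and its run parameter are hypothetical; run exists only so the subprocess call can be swapped out. The resource ID is the one used in the shell commands above, and the accessToken field is what the Azure CLI returns with --output json.

```python
import json
import subprocess

# Resource ID taken from the shell workaround above.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def fetch_azure_token(resource: str = DATABRICKS_RESOURCE_ID, run=subprocess.run) -> str:
    """Return an access token from the Azure CLI without passing --tenant."""
    cmd = ["az", "account", "get-access-token", "--resource", resource, "--output", "json"]
    # check=True raises CalledProcessError if az exits with a non-zero status.
    result = run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)["accessToken"]


# Usage (requires databricks-sdk and a workspace URL; shown as a comment so
# the sketch stays self-contained):
#
#   from databricks.sdk import WorkspaceClient
#   w = WorkspaceClient(host="<workspace url>", token=fetch_azure_token(), auth_type="pat")
```

Because the token is fetched once, a long-running process would need to call fetch_azure_token again and rebuild the client before the token expires.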
