Esoteric Bug with regards to authentication via requests headers= argument being overwritten by .netrc #337

@ewengillies

Describe the bug

It's an odd one that set me back around a day: I hope to save people a similar amount of frustration.

When running a Python model on an all-purpose cluster, I would get this error:

10:54:28    Error getting status of cluster.
10:54:28     b'{"error_code": "403", "message": "Invalid access token."}'

despite putting a brand-new token into the profiles.yml target corresponding to that all-purpose cluster.

Turns out that my .netrc file was to blame: it had an expired token for the same host. Also turns out that, when no auth= argument is given, the requests library overrides authentication passed through the headers= argument with the contents of the .netrc file, as stated here: https://requests.readthedocs.io/en/latest/user/authentication/#netrc-authentication
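For anyone who wants to see the override without touching a real workspace, here is a minimal offline sketch. It only prepares a request and never sends it. The host name, tokens, and the helper `prepared_auth_header` are made up for illustration, and it assumes a requests version that honours the `NETRC` environment variable, so it does not read your real ~/.netrc.

```python
import os
import tempfile

import requests

# Hypothetical values, for illustration only.
HOST = "example.cloud.databricks.com"
URL = f"https://{HOST}/api/2.0/clusters/get"
STALE_NETRC = f"machine {HOST}\nlogin token\npassword expired-token\n"


def prepared_auth_header(url, token, netrc_text):
    """Return the Authorization header requests would send for `url`
    when the token is passed via headers= and `netrc_text` is the netrc file."""
    with tempfile.TemporaryDirectory() as tmp:
        netrc_path = os.path.join(tmp, "netrc")
        with open(netrc_path, "w") as fh:
            fh.write(netrc_text)

        # Point requests at the temporary netrc file instead of ~/.netrc.
        previous = os.environ.get("NETRC")
        os.environ["NETRC"] = netrc_path
        try:
            session = requests.Session()
            request = requests.Request(
                "GET", url, headers={"Authorization": f"Bearer {token}"}
            )
            # prepare_request applies the netrc lookup; nothing is sent.
            return session.prepare_request(request).headers["Authorization"]
        finally:
            if previous is None:
                os.environ.pop("NETRC", None)
            else:
                os.environ["NETRC"] = previous


# With a matching netrc entry, the Bearer header is replaced by Basic auth
# built from the netrc login and password.
print(prepared_auth_header(URL, "valid-token", STALE_NETRC))

# With no matching entry, the header passed via headers= is kept.
print(prepared_auth_header(URL, "valid-token", ""))
```

The first call should yield a `Basic ...` header built from the stale credentials, which is what the cluster API then rejects.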

To be 100% clear, my .netrc file looked like this (replacing sensitive info below):

.netrc:

machine <my-workspace>.cloud.databricks.com
login token
password <expired_token>

My profiles.yml contained a target like:

all_purpose:
  type: databricks
  host: <my-workspace>.cloud.databricks.com
  http_path: <sql_path_for_all_purpose_cluster>
  catalog: <catalog>
  schema: <schema>
  token: <valid_token>
  threads: 1

Then the weird part:

Steps To Reproduce

Fairly easy to trigger:

Expected behavior

I'd expect the token provided in my profiles.yml to take precedence.

Screenshots and log output

Not needed IMO, already know the cause.

System information

Relevant versions are:

  • dbt-databricks==1.5.0
  • requests==2.28.1

Additional context

I would say the solution is to ensure the dbt-databricks code passes the token through the auth= argument instead of headers=. It seems a "bearer token" auth method doesn't come included in requests (which seems odd but okay). There is a good SO solution for this: https://stackoverflow.com/a/58055668
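Roughly, the approach could look like the sketch below. This is not the dbt-databricks code, just an illustration of a custom auth class built on `requests.auth.AuthBase`. The class name `BearerAuth`, the host, and the token are placeholders. Because an auth object is passed explicitly, requests should skip the .netrc lookup for that request.

```python
import requests
from requests.auth import AuthBase


class BearerAuth(AuthBase):
    """Attach a bearer token through the auth= hook (illustrative name)."""

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        # Called by requests while preparing the request.
        r.headers["Authorization"] = f"Bearer {self.token}"
        return r


# Hypothetical host and token; the request is prepared but never sent.
url = "https://example.cloud.databricks.com/api/2.0/clusters/get"
session = requests.Session()
prepared = session.prepare_request(
    requests.Request("GET", url, auth=BearerAuth("valid-token"))
)
print(prepared.headers["Authorization"])
```

In real use this would be something like `requests.get(url, auth=BearerAuth(token))` in place of `requests.get(url, headers={"Authorization": ...})`.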

I would also highlight that while this feels like it might only be a me problem, Databricks has lots of docs around using a .netrc file (see the Databricks API documentation). It doesn't seem too far-fetched that someone else will fall into this trap, and it's a hard one to get out of.

Labels

bug (Something isn't working)
