Additional GCP authentication#3541
Additional GCP authentication#3541tustvold merged 4 commits intoapache:masterfrom winding-lines:additional-gcp-authentication
Conversation
|
I'm confused workload identity federation is different from workload identity? The ticket is for the former but this PR references the latter? I agree workload identity federation is highly complex, and may warrant a different approach, but I'm also not sure if this is something we need to support? Workload identity is for when accessing GCP resources from within GCP, workload identity federation is when accessing GCP resources from a non-GCP environment |
|
Thanks for your reply. I am not using workload identity in my daily work, I am just trying to help move forward a great project. Could you be more specific about what the library should support? I also looked in the azure and aws implementations but jt is still not clear to me. Many thanks! |
|
I think a good place to start would be just supporting workload identity (#3490) as documented here. This is very similar to the AWS InstanceCredentialProvider. We can then look to extend this to support application default credentials as a follow up |
|
Ok, so the intent is just to fetch the token from the local metadata server when running on GKE and similar. Thanks for the clarification 🙏🏻 |
|
@tustvold here is the code to implement
I got an instance on Google Cloud and tested this code. Looking forward to your feedback :) |
|
I intend to review this first thing tomorrow |
|
Great! I work on this mostly in the weekends, but of course I don't expect weekend reviewers. |
tustvold
left a comment
There was a problem hiding this comment.
This looks really good, mostly just some minor nits. I will also endeavour to give this a final test before merge
| } | ||
| } | ||
|
|
||
| /// A deserialized `service-account-********.json`-file. |
There was a problem hiding this comment.
This is moved into credentials
object_store/src/gcp/credential.rs
Outdated
| ) -> Result<TemporaryToken<String>> { | ||
| println!("fetching token from metadata server"); | ||
| const TOKEN_URL: &str = | ||
| "http://metadata/computeMetadata/v1/instance/service-accounts/default/token"; |
There was a problem hiding this comment.
| "http://metadata/computeMetadata/v1/instance/service-accounts/default/token"; | |
| "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"; |
There was a problem hiding this comment.
Google's documentation specifically has only the hostname without the google.internal domain. I was able to test with the hostname only. I see the value of using the full domain here but I prefer to stick with the documenation.
There was a problem hiding this comment.
understood, I think Google just have inconsistent documents. Both work for me.
ps: I was looking at https://cloud.google.com/kubernetes-engine/docs/concepts/workload-identity#metadata_server and it mentions the full hostname metadata.google.internal and IP 169.254.169.254. It also reference https://cloud.google.com/compute/docs/metadata/overview
I agree that https://cloud.google.com/docs/authentication/get-id-token#metadata-server only show metadata
(sorry, I can not close this comment, maybe no permission. Feel free to ignore)
|
I think I addressed the current round of feedback, let me know what you think :) Also tested with Application Default Credentials and Google cloud. |
tustvold
left a comment
There was a problem hiding this comment.
Just some minor nits and I think this is ready to go, thank you
|
Thanks again |
|
Thanks a lot @tustvold, looking forward to contributing more in the near future. |
|
Benchmark runs are scheduled for baseline = 73ce076 and contender = 42b2d55. 42b2d55 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
|
I was looking for this functionality just yesterday and saw this commit. Thank you @winding-lines and @tustvold for getting this out! |
* Implement authentication with instance and application credentials * Fix link in documentation * Address feedback * Instantiate InstanceCredentialsProvider client just once
Which issue does this PR close?
Closes apache/arrow-rs-object-store#199.
Rationale for this change
What changes are included in this PR?
This is WIP. I may be missing something, however workload identity federation is big. https://google.aip.dev/auth/4117. Definitely more than one of my weekends :) Is this what you had in mind @tustvold ?
I am happy to keep on pushing on this but I am wondering if we could maybe land the Application Default Credentials in the mean time because that would unblock some end-user scenarios.
Are there any user-facing changes?