-
Notifications
You must be signed in to change notification settings - Fork 294
[Authentication] Improve failure handling when token exchange fails [feature/ig4-authentication] #8612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Authentication] Improve failure handling when token exchange fails [feature/ig4-authentication] #8612
Conversation
liranbg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments and suggestions below
moranbental
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, added some comments
mlrun/auth/providers.py
Outdated
| mlrun.utils.helpers.run_with_retry( | ||
| retry_count=self._max_retries, | ||
| func=mlrun.secrets.sync_secret_tokens, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sync tokens should be called last. If self._token is None, it will raise an error, meaning you end up syncing something unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@moranbental not sure I agree here. In provider we work with a single default token, but this function syncs all of the tokens. So I think what user expects is that all the tokens are synced no matter what's wrong with the default one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, so it’s worth adding a comment on that here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@moranbental the question is do we want to raise an exception if sync didn't succeed or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rokatyy so I’m not sure we want to raise an exception here, since it could be spammy for the user.
@theSaarco / @quaark correct me if I'm wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@moranbental also not sure, because how else user will know that something is wrong with the tokens sync? is there supposed to be any other flow where it will be synced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
user can call mlrun.sync_secret_tokens directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke with @moranbental offline. I think there's been some confusion here. The sync tokens should not happen on every api call. Only when importing MLRun. Or if the user ran the method specifically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, removing it from Token provider
moranbental
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing missing is calling sync tokens when importing MLRun, meaning in the token provider initialization, after refreshing the token if needed. But, this can be done in a different pr
…feature/ig4-authentication] (mlrun#8612) ### 📝 Description Refresh Fallback Flow: 0. If the token refresh fails (i.e., `fetch_token()` raises an exception), retry the 1 and 2 steps up to 3 attempts: 1. Re-read the offline token from the configured file (to capture potential external updates such as rotation). 2. Retry exchanging the offline token at the token_endpoint. 3. If all refresh attempts fail, delegate handling to _post_fetch_hook: 3.1 If a cached access token is still valid (not fully expired), it will continue to be used. 3.2 If the token is expired, raise an error. --- ### ✅ Checklist - [ ] I updated the documentation (if applicable) - [x] I have tested the changes in this PR --- ### 🧪 Testing UT --- ### 🔗 References - Ticket link: https://iguazio.atlassian.net/browse/ML-10780 - Design docs links: - External links: --- ### 🚨 Breaking Changes? - [ ] Yes (explain below) - [x] No
### 📝 Description <!-- A short summary of what this PR does. --> <!-- Include any relevant context or background information. --> This PR introduces support for MLRun authentication with IG4. It rebases the `feature/ig4-authentication` branch onto `development` This PR includes the following PRs: 1. #8345 2. #8370 3. #8366 4. #8388 5. #8440 6. #8408 7. #8466 8. #8471 9. #8443 10. #8484 11. #8498 12. #8574 13. #8529 14. #8584 15. #8588 16. #8589 17. #8567 18. #8623 19. #8612 20. #8514 21. #8626 22. #8632 23. #8633 24. #8667 25. #8668 26. #8674 27. #8780 28. #8754 29. #8796 30. #8811 --- ### 🛠️ Changes Made <!-- - Key changes (e.g., added feature X, refactored Y, fixed Z) --> To enable IG4 project authorization, set the following configs in mlrun api: ``` MLRUN_HTTPDB__AUTHENTICATION__MODE: iguazio-v4 MLRUN_HTTPDB__AUTHENTICATION__IGUAZIO__SESSION_VERIFICATION_ENDPOINT: v1/identity/self MLRUN_IGUAZIO_API_URL: http://igz-api:8000 ``` Before importing MLRun, you must set: ``` MLRUN_AUTH_WITH_OAUTH_TOKEN__ENABLED=true MLRUN_AUTH_TOKEN_ENDPOINT="https://igz-api.<namespace>.<system-domain>/api/v1/refresh-access-token" ``` --- ### ✅ Checklist - [x] I updated the documentation (if applicable) - [x] I have tested the changes in this PR - [ ] If I introduced a deprecation: - [ ] I followed the [Deprecation Guidelines](./DEPRECATION.md) - [ ] I updated the relevant Jira ticket for documentation --- ### 🧪 Testing <!-- - How it was tested (unit tests, manual, integration) --> <!-- - Any special cases covered. --> Tested on IG4 system + unit tests --- ### 🔗 References - Ticket link: https://iguazio.atlassian.net/browse/ML-9683, https://iguazio.atlassian.net/browse/ML-9870, https://iguazio.atlassian.net/browse/ML-9998 - Design docs links: https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/399179866/Support+IG4+Authentication+in+MLRun+AuthVerifier+HLD, https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/411960071/Support+sdk-side+IG4+authentication+-+token+usage+and+management+HLD, https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/404521061/BE+Secret+Token+Support+HLD, - External links: https://iguazio.atlassian.net/wiki/spaces/ARC/pages/361103361/MLRun+Secret+Tokens+in+IG4 --- ### 🚨 Breaking Changes? - [x] Yes (explain below) - [] No Removed unused API endpoints `- POST /api/v1/user-secrets` which was not in used --- ### 🔍️ Additional Notes How to enable IG4 authentication - https://iguazio.atlassian.net/wiki/spaces/PLAT/pages/457671097/Enable+IG4+Authentication+in+MLRun --------- Co-authored-by: Katerina Molchanova <[email protected]> Co-authored-by: Amit Elbaz <[email protected]>
📝 Description
Refresh Fallback Flow:
fetch_token()raises an exception), retry the 1 and 2 steps up to 3 attempts:3.1 If a cached access token is still valid (not fully expired), it will continue to be used.
3.2 If the token is expired, raise an error.
✅ Checklist
🧪 Testing
UT
🔗 References
🚨 Breaking Changes?