[Authentication] Improve failure handling when token exchange fails [feature/ig4-authentication] #8612

rokatyy · 2025-09-12T08:42:17Z

📝 Description

Refresh Fallback Flow:

If the token refresh fails (i.e., fetch_token() raises an exception), retry the 1 and 2 steps up to 3 attempts:
Re-read the offline token from the configured file (to capture potential external updates such as rotation).
Retry exchanging the offline token at the token_endpoint.
If all refresh attempts fail, delegate handling to _post_fetch_hook:
3.1 If a cached access token is still valid (not fully expired), it will continue to be used.
3.2 If the token is expired, raise an error.

✅ Checklist

I updated the documentation (if applicable)
I have tested the changes in this PR

🧪 Testing

UT

🔗 References

Ticket link: https://iguazio.atlassian.net/browse/ML-10780
Design docs links:
External links:

🚨 Breaking Changes?

Yes (explain below)
No

liranbg

Few comments and suggestions below

mlrun/auth/providers.py

moranbental

Looks good, added some comments

mlrun/auth/providers.py

moranbental · 2025-09-16T13:11:18Z

mlrun/auth/providers.py

+        mlrun.utils.helpers.run_with_retry(
+            retry_count=self._max_retries,
+            func=mlrun.secrets.sync_secret_tokens,
+        )


sync tokens should be called last. If self._token is None, it will raise an error, meaning you end up syncing something unnecessary.

@moranbental not sure I agree here. In provider we work with a single default token, but this function syncs all of the tokens. So I think what user expects is that all the tokens are synced no matter what's wrong with the default one.

ok, so it’s worth adding a comment on that here

@moranbental the question is do we want to raise an exception if sync didn't succeed or not?

@rokatyy so I’m not sure we want to raise an exception here, since it could be spammy for the user.
@theSaarco / @quaark correct me if I'm wrong.

@moranbental also not sure, because how else user will know that something is wrong with the tokens sync? is there supposed to be any other flow where it will be synced?

user can call mlrun.sync_secret_tokens directly.

Spoke with @moranbental offline. I think there's been some confusion here. The sync tokens should not happen on every api call. Only when importing MLRun. Or if the user ran the method specifically

Make sense, removing it from Token provider

mlrun/auth/providers.py

moranbental

The only thing missing is calling sync tokens when importing MLRun, meaning in the token provider initialization, after refreshing the token if needed. But, this can be done in a different pr

…feature/ig4-authentication] (mlrun#8612) ### 📝 Description Refresh Fallback Flow: 0. If the token refresh fails (i.e., `fetch_token()` raises an exception), retry the 1 and 2 steps up to 3 attempts: 1. Re-read the offline token from the configured file (to capture potential external updates such as rotation). 2. Retry exchanging the offline token at the token_endpoint. 3. If all refresh attempts fail, delegate handling to _post_fetch_hook: 3.1 If a cached access token is still valid (not fully expired), it will continue to be used. 3.2 If the token is expired, raise an error. --- ### ✅ Checklist - [ ] I updated the documentation (if applicable) - [x] I have tested the changes in this PR --- ### 🧪 Testing UT --- ### 🔗 References - Ticket link: https://iguazio.atlassian.net/browse/ML-10780 - Design docs links: - External links: --- ### 🚨 Breaking Changes? - [ ] Yes (explain below) - [x] No

### 📝 Description   This PR introduces support for MLRun authentication with IG4. It rebases the `feature/ig4-authentication` branch onto `development` This PR includes the following PRs: 1. #8345 2. #8370 3. #8366 4. #8388 5. #8440 6. #8408 7. #8466 8. #8471 9. #8443 10. #8484 11. #8498 12. #8574 13. #8529 14. #8584 15. #8588 16. #8589 17. #8567 18. #8623 19. #8612 20. #8514 21. #8626 22. #8632 23. #8633 24. #8667 25. #8668 26. #8674 27. #8780 28. #8754 29. #8796 30. #8811 --- ### 🛠️ Changes Made  To enable IG4 project authorization, set the following configs in mlrun api: ``` MLRUN_HTTPDB__AUTHENTICATION__MODE: iguazio-v4 MLRUN_HTTPDB__AUTHENTICATION__IGUAZIO__SESSION_VERIFICATION_ENDPOINT: v1/identity/self MLRUN_IGUAZIO_API_URL: http://igz-api:8000 ``` Before importing MLRun, you must set: ``` MLRUN_AUTH_WITH_OAUTH_TOKEN__ENABLED=true MLRUN_AUTH_TOKEN_ENDPOINT="https://igz-api.<namespace>.<system-domain>/api/v1/refresh-access-token" ``` --- ### ✅ Checklist - [x] I updated the documentation (if applicable) - [x] I have tested the changes in this PR - [ ] If I introduced a deprecation: - [ ] I followed the [Deprecation Guidelines](./DEPRECATION.md) - [ ] I updated the relevant Jira ticket for documentation --- ### 🧪 Testing   Tested on IG4 system + unit tests --- ### 🔗 References - Ticket link: https://iguazio.atlassian.net/browse/ML-9683, https://iguazio.atlassian.net/browse/ML-9870, https://iguazio.atlassian.net/browse/ML-9998 - Design docs links: https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/399179866/Support+IG4+Authentication+in+MLRun+AuthVerifier+HLD, https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/411960071/Support+sdk-side+IG4+authentication+-+token+usage+and+management+HLD, https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/404521061/BE+Secret+Token+Support+HLD, - External links: https://iguazio.atlassian.net/wiki/spaces/ARC/pages/361103361/MLRun+Secret+Tokens+in+IG4 --- ### 🚨 Breaking Changes? - [x] Yes (explain below) - [] No Removed unused API endpoints `- POST /api/v1/user-secrets` which was not in used --- ### 🔍️ Additional Notes How to enable IG4 authentication - https://iguazio.atlassian.net/wiki/spaces/PLAT/pages/457671097/Enable+IG4+Authentication+in+MLRun --------- Co-authored-by: Katerina Molchanova <[email protected]> Co-authored-by: Amit Elbaz <[email protected]>

rokatyy added 3 commits September 12, 2025 09:32

[Authentication] Improve failure handling when token exchange fails

6e8605c

fix

5848683

fix

fdf7199

rokatyy marked this pull request as ready for review September 12, 2025 11:50

rokatyy requested a review from liranbg as a code owner September 12, 2025 11:50

liranbg reviewed Sep 12, 2025

View reviewed changes

mlrun/auth/providers.py Outdated Show resolved Hide resolved

mlrun/auth/providers.py Outdated Show resolved Hide resolved

mlrun/auth/providers.py Outdated Show resolved Hide resolved

mlrun/auth/providers.py Outdated Show resolved Hide resolved

Comments

ad9399e

rokatyy requested a review from liranbg September 15, 2025 09:35

fix test (aligned max_retries params to run_with_retry specific)

def6640

moranbental changed the title ~~[Authentication] Improve failure handling when token exchange fails~~ [Authentication] Improve failure handling when token exchange fails [feature/ig4-authentication] Sep 16, 2025

liranbg approved these changes Sep 16, 2025

View reviewed changes

moranbental requested changes Sep 16, 2025

View reviewed changes

Moran CR

fad320a

rokatyy requested a review from moranbental September 16, 2025 13:21

moranbental requested changes Sep 16, 2025

View reviewed changes

mlrun/auth/providers.py Show resolved Hide resolved

update comment

8c4b8c2

rokatyy requested a review from moranbental September 16, 2025 13:34

moranbental approved these changes Sep 16, 2025

View reviewed changes

delete syncing

e49d1fa

moranbental reviewed Sep 16, 2025

View reviewed changes

rokatyy merged commit ecd7060 into mlrun:feature/ig4-authentication Sep 16, 2025
13 checks passed

moranbental mentioned this pull request Oct 16, 2025

[Authentication] Allow authentication against IG4 #8724

Merged

6 tasks

[Authentication] Improve failure handling when token exchange fails [feature/ig4-authentication] #8612

[Authentication] Improve failure handling when token exchange fails [feature/ig4-authentication] #8612

Uh oh!

Conversation

rokatyy commented Sep 12, 2025

📝 Description

✅ Checklist

🧪 Testing

🔗 References

🚨 Breaking Changes?

Uh oh!

liranbg left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

moranbental left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

moranbental Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

rokatyy Sep 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

moranbental Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

rokatyy Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

moranbental Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

rokatyy Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

moranbental Sep 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

quaark Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

rokatyy Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

moranbental left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

rokatyy Sep 16, 2025 •

edited

Loading

moranbental Sep 16, 2025 •

edited

Loading