Skip to content

fix(sagemaker): honor aws_role_name/aws_session_name in embedding()#22663

Open
gustipardo wants to merge 2 commits intoBerriAI:mainfrom
gustipardo:fix/sagemaker-embedding-assumerole-22210
Open

fix(sagemaker): honor aws_role_name/aws_session_name in embedding()#22663
gustipardo wants to merge 2 commits intoBerriAI:mainfrom
gustipardo:fix/sagemaker-embedding-assumerole-22210

Conversation

@gustipardo
Copy link
Copy Markdown
Contributor

Summary

  • use _load_credentials(optional_params) in SageMaker embedding() before creating the runtime client
  • ensure aws_role_name / aws_session_name are passed into credential resolution (AssumeRole path)
  • keep auth params out of inference payload by relying on existing _load_credentials pop behavior
  • add a regression test validating role/session propagation and credentialed boto3.client(...) initialization
  • make the regression test portable by stubbing boto3 and botocore.credentials via sys.modules

Validation

  • venv/bin/pytest tests/test_litellm/llms/sagemaker/test_sagemaker_embedding_auth.py -q
  • result: 1 passed

Closes #22210

Use _load_credentials() in embedding() so aws_role_name/aws_session_name are honored and boto3 runtime client is created with assumed-role credentials. Add regression test for role/session propagation and credentialed client init.
Avoid hard dependency on local boto3/botocore installs by patching sys.modules for boto3 and botocore.credentials. Keeps regression test deterministic across CI and local environments.
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Error Error Mar 3, 2026 2:56pm

Request Review

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR fixes SageMaker embedding() to honor aws_role_name / aws_session_name by replacing the manual boto3 credential handling with the shared _load_credentials(optional_params) method — the same approach already used in completion(). This ensures that AssumeRole, web identity tokens, profiles, and other AWS auth strategies all work consistently for embeddings.

  • Unified auth path: The old embedding code only supported explicit access keys or env-var-based auth. The new code delegates to _load_credentials()get_credentials(), which supports all AWS auth strategies (role assumption, web identity, profiles, session tokens, env vars).
  • Auth params stripped from payload: _load_credentials pops all aws_* keys from optional_params, preventing them from leaking into the SageMaker inference payload — matching the existing completion behavior.
  • Regression test added: A mock-only test in tests/test_litellm/llms/sagemaker/ verifies that get_credentials receives aws_role_name / aws_session_name, the boto3 client is initialized with assumed credentials, and auth params are removed from optional_params.

Confidence Score: 5/5

  • This PR is safe to merge — it aligns embedding auth with the existing, well-tested completion auth path.
  • The change is minimal (replacing ~25 lines of manual credential handling with 6 lines using the existing _load_credentials method) and follows the exact same pattern already used by completion(). All auth strategies supported by _load_credentials are now available for embeddings. The regression test is thorough and mock-only. No risk of behavioral regression since _load_credentials handles all the same auth paths the old code handled, plus more (role assumption, web identity, profiles).
  • No files require special attention

Important Files Changed

Filename Overview
litellm/llms/sagemaker/completion/handler.py Replaces manual credential handling in embedding() with _load_credentials(optional_params), matching the pattern already used by completion(). No issues found.
tests/test_litellm/llms/sagemaker/test_sagemaker_embedding_auth.py New mock-only regression test verifying role assumption flows through to boto3 client initialization and auth params are stripped from optional_params. No real network calls.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant SagemakerLLM as SagemakerLLM.embedding()
    participant LoadCreds as _load_credentials()
    participant GetCreds as get_credentials()
    participant STS as AWS STS (AssumeRole)
    participant Boto3 as boto3.client()
    participant SM as SageMaker Runtime

    Caller->>SagemakerLLM: embedding(optional_params={aws_role_name, aws_session_name, ...})
    SagemakerLLM->>LoadCreds: _load_credentials(optional_params)
    LoadCreds->>LoadCreds: Pop aws_* params from optional_params
    LoadCreds->>GetCreds: get_credentials(aws_role_name, aws_session_name, ...)
    GetCreds->>STS: sts.assume_role(RoleArn, SessionName)
    STS-->>GetCreds: Assumed credentials (access_key, secret_key, token)
    GetCreds-->>LoadCreds: Credentials object + region
    LoadCreds-->>SagemakerLLM: (credentials, aws_region_name)
    SagemakerLLM->>Boto3: client(sagemaker-runtime, access_key, secret_key, token, region)
    Boto3-->>SagemakerLLM: sagemaker_client
    SagemakerLLM->>SM: invoke_endpoint(model, data)
    SM-->>SagemakerLLM: embedding response
    SagemakerLLM-->>Caller: EmbeddingResponse
Loading

Last reviewed commit: 0fd8794

@jymmi
Copy link
Copy Markdown
Contributor

jymmi commented Mar 3, 2026

Hey @gustipardo — just a heads up, I opened #20435 about a month ago which fixes the same issue (#22210) with the same approach (replacing the manual boto3 client creation with _load_credentials()). Might be worth consolidating so we don't duplicate effort!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: SageMaker embedding endpoint ignores aws_role_name — cross-account role assumption broken

2 participants