EKS k8s client silently ignores stored AWS integration credentials (follow-up to #724) #969

@chaosreload

Context

PR #724 fixed the EKS source-detection gate: when the AWS integration is configured with stored IAM user credentials (access_key_id + secret_access_key, no role_arn), the EKS branch in detect_sources now activates instead of being skipped.

However, the follow-up path (build_k8s_clients) still does not know about those stored credentials. After #724, build_k8s_clients resolves credentials in this order:

  1. role_arn → STS AssumeRole (existing)
  2. ambient botocore chain: env / shared config / instance profile / IRSA (added in #724 as a fallback)

The stored IAM user credentials attached to the AWS integration (persisted by resolve_integrations and surfaced in _eks_int["credentials"]) are silently dropped. On hosts that also happen to have ambient credentials, the ambient chain can pick up unrelated credentials; on hosts without ambient credentials, the call fails entirely, even though the integration holds usable IAM user keys.

Expected

When the AWS integration has stored credentials, build_k8s_clients should use them explicitly. This should be the highest-priority resolution path, ahead of role_arn AssumeRole and ambient botocore, because the stored credentials are what the integration was explicitly configured with.

Actual

The stored credentials are not forwarded from detect_sources to build_k8s_clients at all. eks_params carries role_arn, external_id, cluster_names, etc., but not credentials. build_k8s_clients has no credentials parameter to receive them.

Proposal (Approach #2)

Forward the stored credentials through the existing eks_params pipeline:

  1. app/nodes/plan_actions/detect_sources.py: set eks_params["credentials"] = _eks_int.get("credentials") or None alongside the other EKS integration fields.
  2. app/services/eks/eks_k8s_client.py::build_k8s_clients: accept an optional credentials kwarg. Resolution priority becomes:
    1. explicit credentials kwarg (stored-integration, new)
    2. role_arn AssumeRole (existing production path)
    3. ambient botocore chain (#724 fallback, unchanged)

Both secondary fixes from #724 are preserved (empty SessionToken coerced to None, regional STS endpoint).
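
The forwarding step in item 1 could look roughly like the sketch below. The helper name build_eks_params and the sample record are hypothetical and exist only for illustration; eks_params, _eks_int, and the field names come from this issue, and the real code in detect_sources.py may be structured differently.

```python
from typing import Any


def build_eks_params(eks_int: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: collect the EKS fields passed on to build_k8s_clients."""
    return {
        "role_arn": eks_int.get("role_arn"),
        "external_id": eks_int.get("external_id"),
        "cluster_names": eks_int.get("cluster_names") or [],
        # New field: forward stored credentials. A missing key or an empty
        # dict both collapse to None, so downstream code has one falsy case.
        "credentials": eks_int.get("credentials") or None,
    }


# Illustrative integration record with placeholder values (not real keys).
_eks_int = {
    "cluster_names": ["demo-cluster"],
    "credentials": {
        "access_key_id": "AKIAEXAMPLE",
        "secret_access_key": "example-secret",
    },
}
eks_params = build_eks_params(_eks_int)
```

With this shape, build_k8s_clients would receive credentials as an ordinary optional kwarg and can ignore it when it is None.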

Why Approach #2

In the code review of #724, Greptile analyzed two candidate approaches for this exact scenario:

"Approach #2 is the right call for the follow-up. The ambient botocore resolver would silently miss stored-integration credentials exactly when the user is relying on them."

Ambient-only resolution would only work when the deployment environment happens to expose the same credentials twice (once in the integration, once in env / instance profile / etc.), which is fragile and not guaranteed. Explicit forwarding mirrors how every other integration (grafana, bitbucket, github, datadog) already uses the values that resolve_integrations persisted.

Regression safety

The role_arn (AssumeRole) and ambient-botocore branches are left unchanged. The new explicit branch only runs when credentials is truthy and has both access_key_id and secret_access_key, so deployments whose integration carries no stored key pair see no behaviour change.
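
A minimal sketch of that guard and the resulting priority order, kept free of AWS calls so it can be read in isolation. The function names select_credential_source and session_kwargs are hypothetical, as is the session_token key; the access_key_id and secret_access_key keys are the ones named in this issue. The returned keyword names match the parameters boto3.Session accepts.

```python
from typing import Any, Optional


def select_credential_source(
    credentials: Optional[dict[str, Any]] = None,
    role_arn: Optional[str] = None,
) -> str:
    """Return which branch would run: 'stored', 'assume_role', or 'ambient'."""
    # 1. Explicit stored credentials: only when both halves of the key pair exist.
    if credentials and credentials.get("access_key_id") and credentials.get("secret_access_key"):
        return "stored"
    # 2. Existing production path: STS AssumeRole.
    if role_arn:
        return "assume_role"
    # 3. Fallback from #724: ambient botocore chain.
    return "ambient"


def session_kwargs(credentials: dict[str, Any]) -> dict[str, Optional[str]]:
    """Map stored credentials onto boto3.Session keyword names."""
    return {
        "aws_access_key_id": credentials["access_key_id"],
        "aws_secret_access_key": credentials["secret_access_key"],
        # Assumed key name; an empty token is coerced to None, as in #724.
        "aws_session_token": credentials.get("session_token") or None,
    }
```

A partial key pair (for example, access_key_id without secret_access_key) falls through to the role_arn or ambient branch rather than raising, which keeps the new branch strictly opt-in.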
