fix(eks): forward stored AWS integration credentials into build_k8s_clients#990
fix(eks): forward stored AWS integration credentials into build_k8s_clients#9900xDevNinja wants to merge 6 commits intoTracer-Cloud:mainfrom
Conversation
Greptile SummaryThis PR plumbs stored IAM-user credentials (
Confidence Score: 4/5Not safe to merge as-is — All seven k8s-API tools are correctly fixed, the core
Important Files Changed
Sequence DiagramsequenceDiagram
participant DS as detect_sources
participant LC as list_eks_clusters
participant EKSCl as EKSClient
participant LT as list_eks_pods / 6 other k8s tools
participant BKC as build_k8s_clients
participant STS as STS AssumeRole
DS->>DS: eks_params[credentials] = stored creds or None
Note over LC,EKSCl: Gap: credentials discarded via **_kwargs
LC->>EKSCl: EKSClient(role_arn, external_id, region)
EKSCl->>STS: assume_role(RoleArn='') FAILS for IAM-user integrations
Note over LT,BKC: Fixed path
LT->>BKC: build_k8s_clients(..., credentials=creds)
alt stored credentials valid
BKC->>BKC: _stored_credentials_to_aws_creds(credentials)
else no/incomplete credentials
BKC->>STS: _assume_role(role_arn, external_id, ...)
end
|
|
Thanks for the catch — addressed in two follow-up commits:
Direct unit tests added in Local quality gate: |
Adds an optional `credentials` kwarg to `build_k8s_clients` and a `_stored_credentials_to_aws_creds` helper that converts the stored integration shape (`access_key_id` / `secret_access_key` / `session_token`) into the AssumeRole-shaped dict the rest of the module already consumes. Empty `session_token` is coerced to None — botocore rejects "" but accepts a missing token for IAM user keys. Resolution priority becomes: 1. explicit `credentials` kwarg (stored-integration, new) 2. `role_arn` AssumeRole (existing production path) The new branch only runs when `credentials` carries both required keys; missing/incomplete dicts fall through to AssumeRole. Existing callers that do not pass `credentials` are unaffected. Direct unit tests cover the helper (full dict, empty/missing session token, missing required keys) and the resolution priority in `build_k8s_clients` (stored credentials skip AssumeRole; missing or incomplete credentials fall back to AssumeRole). Refs Tracer-Cloud#969
Surfaces a `credentials` field in `_eks_creds` (the shared extract_params helper used by every k8s-API EKS tool) and threads an optional `credentials` kwarg through the seven tools that call `build_k8s_clients`: - list_eks_pods - get_eks_pod_logs - get_eks_deployment_status - get_eks_node_health - get_eks_events - list_eks_deployments - list_eks_namespaces Each tool now forwards `credentials` to `build_k8s_clients`, making the new explicit-credentials resolution path reachable end-to-end from the planner. Tools that do not receive credentials behave exactly as before (AssumeRole on the supplied role_arn). Tests cover that `_eks_creds` exposes the credentials field (and defaults it to None when the integration record has no stored keys) and that `list_eks_pods` threads the kwarg into `build_k8s_clients`. Refs Tracer-Cloud#969
Sets `eks_params["credentials"] = _eks_int.get("credentials") or None`
alongside the other EKS integration fields. This is the last link in
the chain that lets `build_k8s_clients` see stored IAM user keys when
the AWS integration was configured with `access_key_id` /
`secret_access_key` and no `role_arn`.
PR Tracer-Cloud#724 enabled the EKS branch in `detect_sources` for this
integration shape but did not forward the credentials themselves;
without this commit the stored keys are still silently dropped before
they reach `build_k8s_clients`.
Tests cover the populated and absent (`role_arn`-only) cases.
Fixes Tracer-Cloud#969
Mirrors the resolution-priority change made for `build_k8s_clients`: `EKSClient.__init__` now accepts an optional `credentials` kwarg and its `_build` method honours stored IAM user keys before falling back to STS `AssumeRole`. Without this change, the four EKS boto3-API tools that go through `EKSClient` (`list_eks_clusters`, `describe_eks_cluster`, `describe_eks_addon`, `eks_nodegroup_health`) still call `sts.assume_role(RoleArn="", ...)` for IAM-user-only integrations and fail with `ParamValidationError` — the very failure this PR set out to fix. `list_eks_clusters` is the EKS connection-verification path, so the bug effectively disables every EKS tool when it fires. The credentials helper is reused from `eks_k8s_client` to keep the normalisation rules (empty `session_token` → `None`, both required keys present) in one place. Refs Tracer-Cloud#969
Forwards the optional `credentials` kwarg from the four EKS tools that go through `EKSClient` (`list_eks_clusters`, `describe_eks_cluster`, `describe_eks_addon`, `get_eks_nodegroup_health`) into the constructor so the new stored-credentials resolution path is reachable end-to-end. `list_eks_clusters` is the connection-verification path for the EKS source — without this commit the previous fixes still leave EKS unusable for AWS integrations configured with IAM user keys, because `list_eks_clusters` would fail before any other EKS tool gets a chance to run. Tools that do not receive credentials behave exactly as before (AssumeRole on the supplied role_arn). Refs Tracer-Cloud#969
b02480b to
3386046
Compare
CodeQL (rule py/clear-text-logging-sensitive-data) flagged the stored credentials log message as logging sensitive data in clear text. Drop the AccessKeyId-prefix interpolation entirely; the bare "Using stored integration credentials" line is enough to confirm the priority-1 path was taken without leaking any portion of the key material. Refs Tracer-Cloud#969
its already WIP by diff author |
PR Tracer-Cloud#976 wired stored AWS-integration credentials end-to-end for the seven k8s-API EKS tools that route through `build_k8s_clients`, but missed the four EKS tools that still go through `EKSClient`: `list_eks_clusters`, `describe_eks_cluster`, `describe_eks_addon`, `get_eks_nodegroup_health`. Their `_eks_creds` helper already exposes `credentials`, but each tool absorbed it via `**_kwargs` and never forwarded it, so `EKSClient._build` unconditionally called `sts.assume_role(RoleArn="", ...)` and raised `ParamValidationError` for IAM-user-only integrations at the cluster-discovery step. Greptile flagged this exact gap on the (now-closed) parallel PR Tracer-Cloud#990. This commit closes it on Tracer-Cloud#976. Changes ------- - `EKSClient.__init__` takes an optional `credentials` kwarg. `_build` prefers stored IAM user keys over STS AssumeRole; if neither is provided it raises `ValueError` (no silent fallthrough). - New shared helper `_stored_credentials_to_aws_creds` normalises the stored dict into the AssumeRole-shaped triplet (empty `session_token` → None, both required keys present). `build_k8s_clients` now uses the same helper so both code paths stay in sync. - The four EKSClient-backed tools accept `credentials` in the signature, expose it in `input_schema`, and pass it into the constructor. Tests ----- - New `tests/services/test_eks_client.py` covers the helper (full triplet / empty session token / missing required keys / None / empty dict) and EKSClient credential resolution (stored skips AssumeRole, empty session token coerced, partial dict falls back to AssumeRole, no creds + no role_arn raises ValueError, AssumeRole external_id passthrough stays intact). - `test_eks_list_clusters_tool.py` grows two forwarding tests: the kwarg threads into EKSClient, and stays None by default for existing role-based callers. Quality gate ------------ make lint / format-check / typecheck / test-cov all green (3328 passed, 2 skipped, 1 xfailed). Related ------- - Extends Tracer-Cloud#976 - Fixes Tracer-Cloud#969 - Closes Greptile P1 on Tracer-Cloud#990 (Tracer-Cloud#990 (comment))
…ilder (#976) * fix(eks): forward stored AWS integration credentials to k8s client builder Fixes #969. Follow-up to #724. Problem ------- PR #724 let EKS be detected when the AWS integration is configured with stored IAM user credentials (access_key_id + secret_access_key, no role_arn), but build_k8s_clients still had no way to use those stored credentials. After #724 it supports: 1. role_arn -> STS AssumeRole 2. ambient botocore chain (env / shared config / instance profile / IRSA) The stored integration credentials (persisted by resolve_integrations and surfaced in _eks_int['credentials']) were silently dropped. On hosts that happen to have ambient credentials pointing elsewhere, this picks up the wrong principal; on hosts without ambient credentials the call fails entirely, even though the integration has perfectly usable keys. Fix (approach #2, pre-approved by Greptile in PR #724 review) ------------------------------------------------------------- Forward the stored credentials end-to-end through the EKS pipeline: 1. detect_sources.py: set eks_params['credentials'] = _eks_int.get( 'credentials') or None alongside the other EKS integration fields. 2. EKSListClustersTool._eks_creds (shared helper used by every EKS tool for its extract_params): include 'credentials' in the returned dict so it flows into each tool's kwargs. 3. Seven EKS tool functions accept a new `credentials` kwarg and forward it to build_k8s_clients (and surface it in their input_schema): EKSPodLogsTool, EKSListPodsTool, EKSEventsTool, EKSNodeHealthTool, EKSListDeploymentsTool, EKSDeploymentStatusTool, EKSListNamespacesTool. 4. eks_k8s_client.build_k8s_clients accepts an optional credentials kwarg. Resolution priority: 1. explicit credentials kwarg (stored-integration, new) 2. role_arn AssumeRole (existing production path) 3. ambient botocore chain (#724 fallback) The two secondary fixes from #724 are preserved (empty SessionToken coerced to None, regional STS endpoint). Addresses Greptile P1 --------------------- The initial commit only wired the two ends of the pipeline (detect_sources and build_k8s_clients); Greptile's review of #976 correctly pointed out that the shared _eks_creds helper and the seven tool call sites in between were never updated, leaving the new credentials branch unreachable. This revision wires credentials through every intermediate layer so each of the seven EKS tools passes credentials=credentials to build_k8s_clients. Test plan --------- - py_compile passes on all 10 changed files. - ruff check + ruff format clean on all 10 changed files. - Regression safety: the role_arn (AssumeRole) and ambient-botocore branches are unchanged; the new explicit branch only runs when the credentials kwarg is truthy and has access_key_id + secret_access_key. Related ------- - Fixes #969. - Follow-up to #724. - Greptile pre-approved this approach in #724 (comment) - Greptile P1 on #976 that drove the helper+tools fixups: #976 (comment) * chore: trigger Greptile re-review after end-to-end credentials wiring The amend (ac2d841) wired credentials through _eks_creds helper + 7 EKS tool call sites per Greptile P1 on the initial commit b03928d. Pushing an empty commit to ensure Greptile re-reviews the new diff (10 files, +114/-11 vs the original 2 files, +50/-4). * chore: second re-trigger Greptile re-review First re-trigger commit (61f15ed) did not produce a Greptile re-review after ~4h. Pushing another empty commit to kick Greptile again so the end-to-end credentials wiring (helper + 7 EKS tools in ac2d841) gets a fresh review. * fix(eks): thread stored credentials through EKSClient-backed tools PR #976 wired stored AWS-integration credentials end-to-end for the seven k8s-API EKS tools that route through `build_k8s_clients`, but missed the four EKS tools that still go through `EKSClient`: `list_eks_clusters`, `describe_eks_cluster`, `describe_eks_addon`, `get_eks_nodegroup_health`. Their `_eks_creds` helper already exposes `credentials`, but each tool absorbed it via `**_kwargs` and never forwarded it, so `EKSClient._build` unconditionally called `sts.assume_role(RoleArn="", ...)` and raised `ParamValidationError` for IAM-user-only integrations at the cluster-discovery step. Greptile flagged this exact gap on the (now-closed) parallel PR #990. This commit closes it on #976. Changes ------- - `EKSClient.__init__` takes an optional `credentials` kwarg. `_build` prefers stored IAM user keys over STS AssumeRole; if neither is provided it raises `ValueError` (no silent fallthrough). - New shared helper `_stored_credentials_to_aws_creds` normalises the stored dict into the AssumeRole-shaped triplet (empty `session_token` → None, both required keys present). `build_k8s_clients` now uses the same helper so both code paths stay in sync. - The four EKSClient-backed tools accept `credentials` in the signature, expose it in `input_schema`, and pass it into the constructor. Tests ----- - New `tests/services/test_eks_client.py` covers the helper (full triplet / empty session token / missing required keys / None / empty dict) and EKSClient credential resolution (stored skips AssumeRole, empty session token coerced, partial dict falls back to AssumeRole, no creds + no role_arn raises ValueError, AssumeRole external_id passthrough stays intact). - `test_eks_list_clusters_tool.py` grows two forwarding tests: the kwarg threads into EKSClient, and stays None by default for existing role-based callers. Quality gate ------------ make lint / format-check / typecheck / test-cov all green (3328 passed, 2 skipped, 1 xfailed). Related ------- - Extends #976 - Fixes #969 - Closes Greptile P1 on #990 (#990 (comment)) * refactor(eks): address maintainer review feedback Addresses all five inline comments from @muddlebee on #976: 1. **eks_client.py** — add `from __future__ import annotations` (project style convention). 2. **utils module, public helper** — move `_stored_credentials_to_aws_creds` out of `eks_client.py` into a new `app/services/eks/utils.py` and drop the leading underscore (`stored_credentials_to_aws_creds`). The import contract between `eks_client` and `eks_k8s_client` is now explicit public API rather than two services reaching into each other for a private helper. 3. **eks_k8s_client.py** — promote the `import botocore.session` that lived in the ambient-fallback branch to the top-level imports. No circular-import concern (`botocore` is already imported at module level). 4. **eks_k8s_client.py** — normalize `frozen.token or None` at the source instead of relying on the downstream `or None` guard. This matches the behaviour of `stored_credentials_to_aws_creds` and removes the hidden dependency. 5. **All 11 EKS tool files** — tighten `credentials` parameter type annotation from `dict | None` to `dict[str, Any] | None`, matching the service-layer style and keeping mypy strict mode happy. The `tests/services/test_eks_client.py` import also switches to the new public name and module path. Quality gate ------------ make lint → All checks passed! make format-check → 906 files already formatted make typecheck → Success: no issues found in 392 source files make test-cov → 3328 passed, 2 skipped, 1 xfailed in 43.36s No behaviour changes — pure refactor of public surface + style. --------- Co-authored-by: weichao <[email protected]> Co-authored-by: chaosreload <[email protected]>
Fixes #969
Describe the changes you have made in this PR -
Follow-up to #724. PR #724 made
detect_sourcesactivate the EKS branch when the AWS integration is configured with stored IAM user credentials (access_key_id+secret_access_key, norole_arn), but the credentials themselves were never forwarded intobuild_k8s_clients, so the function unconditionally called STSAssumeRoleand failed for IAM-user-only integrations.This PR forwards the stored credentials end-to-end via the existing
eks_paramspipeline (Approach #2 from the issue):app/services/eks/eks_k8s_client.py—build_k8s_clientsaccepts a new optionalcredentialskwarg. A small_stored_credentials_to_aws_credshelper converts the stored shape (access_key_id/secret_access_key/session_token) into the AssumeRole-shaped dict the rest of the module already consumes. Emptysession_tokenis coerced toNonebecause botocore rejects""but accepts a missing token for IAM user keys.app/tools/EKSListClustersTool/__init__.py+ the seven k8s-API EKS tools — the shared_eks_credshelper surfaces acredentialsfield, and each oflist_eks_pods,get_eks_pod_logs,get_eks_deployment_status,get_eks_node_health,get_eks_events,list_eks_deployments, andlist_eks_namespacesacceptscredentialsand forwards it tobuild_k8s_clients.app/nodes/plan_actions/detect_sources.py— setseks_params["credentials"] = _eks_int.get("credentials") or Nonealongside the other EKS integration fields.New resolution priority inside
build_k8s_clients:credentialskwarg (stored-integration, new)role_arnAssumeRole (existing production path)Strict superset of current behaviour — the new branch only activates when
credentialsis truthy and carries both required keys, so existing role-based deployments see no change.PR is split into three atomic commits matching the three layers above. Each commit alone keeps the tree green (
make lint format-check typecheck test-cov).Demo/Screenshot for feature changes and bug fixes -
Before this PR, with an IAM-user AWS integration (
access_key_id+secret_access_key, norole_arn):After this PR, with the same integration:
Local quality gate:
Code Understanding and AI Usage
Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?
If you used AI assistance:
Explain your implementation approach:
The problem is a missing piece of plumbing: PR #724 made
detect_sourcesrecognize the IAM-user integration shape, but the credentials never reachedbuild_k8s_clients. I traced the call graph fromdetect_sources→eks_params→_eks_creds(inEKSListClustersTool) → each EKS tool function →build_k8s_clients, and added a single optionalcredentialskwarg at every link in that chain.I considered an alternative — having
build_k8s_clientsreach back into the integration store to look up credentials by integration ID — but rejected it because every other integration (grafana,bitbucket,github,datadog) already follows the explicit-forwarding pattern fromresolve_integrationsoutwards, and the issue's ownWhy Approach #2section calls out the ambient-only alternative as silently fragile.The key new piece is
_stored_credentials_to_aws_creds: it normalises the stored dict into the sameAccessKeyId/SecretAccessKey/SessionTokenshape that_assume_rolealready returns, so the downstream code inbuild_k8s_clientsis unchanged. Coercing an emptysession_tokentoNoneis the subtle bit — botocore raises on emptySessionTokenbut accepts a missing one for IAM user credentials, and the integration store persists""rather than omitting the field.Edge cases covered by tests:
_stored_credentials_to_aws_creds.build_k8s_clientsskipping_assume_rolewhen credentials are valid, falling back to_assume_rolewhen credentials are absent, and falling back when credentials are partially populated (only one of access key / secret)._eks_credsexposing thecredentialsfield and defaulting it toNonewhen the integration has no stored keys.list_eks_pods) actually threading the kwarg intobuild_k8s_clients.detect_sourcespopulatingeks_params["credentials"]from a stored-credentials integration and leaving it asNoneforrole_arn-only integrations.Checklist before requesting a review