Skip to content

Revert "Codebase refactoring and cleanup "#2

Merged
davincios merged 1 commit intomainfrom
revert-1-feature/jan-21-experimentation
Jan 23, 2026
Merged

Revert "Codebase refactoring and cleanup "#2
davincios merged 1 commit intomainfrom
revert-1-feature/jan-21-experimentation

Conversation

@davincios
Copy link
Copy Markdown
Contributor

Reverts #1

@davincios davincios merged commit 3d119a1 into main Jan 23, 2026
@davincios davincios deleted the revert-1-feature/jan-21-experimentation branch January 24, 2026 14:14
VaibhavUpreti pushed a commit that referenced this pull request Mar 13, 2026
…imentation

Revert "Codebase refactoring and cleanup "
devankitjuneja added a commit to devankitjuneja/opensre that referenced this pull request Apr 13, 2026
chaosreload pushed a commit to chaosreload/opensre that referenced this pull request Apr 27, 2026
…ilder

Fixes Tracer-Cloud#969. Follow-up to Tracer-Cloud#724.

Problem
-------
PR Tracer-Cloud#724 let EKS be detected when the AWS integration is configured with
stored IAM user credentials (access_key_id + secret_access_key, no
role_arn), but build_k8s_clients still had no way to use those stored
credentials. After Tracer-Cloud#724 it supports:

  1. role_arn -> STS AssumeRole
  2. ambient botocore chain (env / shared config / instance profile / IRSA)

The stored integration credentials (persisted by resolve_integrations
and surfaced in _eks_int['credentials']) were silently dropped. On hosts
that happen to have ambient credentials pointing elsewhere, this picks
up the wrong principal; on hosts without ambient credentials the call
fails entirely, even though the integration has perfectly usable keys.

Fix (approach Tracer-Cloud#2, pre-approved by Greptile in PR Tracer-Cloud#724 review)
-------------------------------------------------------------
Forward the stored credentials end-to-end through the EKS pipeline:

1. detect_sources.py: set eks_params['credentials'] = _eks_int.get(
   'credentials') or None alongside the other EKS integration fields.

2. EKSListClustersTool._eks_creds (shared helper used by every EKS tool
   for its extract_params): include 'credentials' in the returned dict
   so it flows into each tool's kwargs.

3. Seven EKS tool functions accept a new `credentials` kwarg and forward
   it to build_k8s_clients (and surface it in their input_schema):
     EKSPodLogsTool, EKSListPodsTool, EKSEventsTool, EKSNodeHealthTool,
     EKSListDeploymentsTool, EKSDeploymentStatusTool,
     EKSListNamespacesTool.

4. eks_k8s_client.build_k8s_clients accepts an optional credentials
   kwarg. Resolution priority:
     1. explicit credentials kwarg (stored-integration, new)
     2. role_arn AssumeRole         (existing production path)
     3. ambient botocore chain      (Tracer-Cloud#724 fallback)

The two secondary fixes from Tracer-Cloud#724 are preserved (empty SessionToken
coerced to None, regional STS endpoint).

Addresses Greptile P1
---------------------
The initial commit only wired the two ends of the pipeline (detect_sources
and build_k8s_clients); Greptile's review of Tracer-Cloud#976 correctly pointed out
that the shared _eks_creds helper and the seven tool call sites in between
were never updated, leaving the new credentials branch unreachable. This
revision wires credentials through every intermediate layer so each of
the seven EKS tools passes credentials=credentials to build_k8s_clients.

Test plan
---------
- py_compile passes on all 10 changed files.
- ruff check + ruff format clean on all 10 changed files.
- Regression safety: the role_arn (AssumeRole) and ambient-botocore
  branches are unchanged; the new explicit branch only runs when the
  credentials kwarg is truthy and has access_key_id + secret_access_key.

Related
-------
- Fixes Tracer-Cloud#969.
- Follow-up to Tracer-Cloud#724.
- Greptile pre-approved this approach in
  Tracer-Cloud#724 (comment)
- Greptile P1 on Tracer-Cloud#976 that drove the helper+tools fixups:
  Tracer-Cloud#976 (comment)
0xDevNinja added a commit to 0xDevNinja/opensre that referenced this pull request Apr 27, 2026
…lients

Problem
-------
PR Tracer-Cloud#724 fixed the EKS source-detection gate so an AWS integration
configured with stored IAM user credentials (access_key_id +
secret_access_key, no role_arn) now activates the EKS branch in
detect_sources. The follow-up path build_k8s_clients still does not
know about those stored credentials.

eks_params already carries role_arn, external_id, cluster_names, etc.
but not credentials, and build_k8s_clients had no way to receive
them. As a result the function unconditionally called STS AssumeRole,
which fails for IAM-user-only integrations even though the credentials
are persisted on the integration record.

Fix (Approach Tracer-Cloud#2 from issue)
----------------------------
Forward the stored credentials through the existing eks_params
pipeline:

1. detect_sources sets eks_params["credentials"] = _eks_int.get(
   "credentials") or None alongside the other EKS integration fields.
2. _eks_creds (the shared extract_params helper used by every k8s-API
   EKS tool) surfaces the credentials field.
3. The seven k8s-API EKS tools accept an optional credentials kwarg
   and thread it into build_k8s_clients.
4. build_k8s_clients accepts an optional credentials kwarg. New
   resolution priority:

     1. explicit credentials kwarg (stored-integration, new)
     2. role_arn AssumeRole (existing production path)

   Empty SessionToken is coerced to None — botocore rejects "" but
   accepts a missing token for IAM user credentials.

Risk / Backward Compatibility
-----------------------------
Strict superset of current behaviour. The new explicit branch only
runs when credentials is truthy and carries both access_key_id and
secret_access_key. Tools that do not receive credentials behave
exactly as before (AssumeRole on the supplied role_arn).

Tests
-----
- tests/services/test_eks_k8s_client.py: direct coverage for
  _stored_credentials_to_aws_creds (full dict, empty session token,
  missing keys) and build_k8s_clients credential resolution priority
  (stored credentials skip AssumeRole; missing/incomplete fall back).
- tests/tools/test_eks_list_clusters_tool.py: _eks_creds exposes the
  credentials field (and defaults it to None when absent).
- tests/tools/test_eks_list_pods_tool.py: list_eks_pods threads the
  credentials kwarg into build_k8s_clients.
- tests/nodes/plan_actions/test_detect_sources.py: detect_sources
  forwards stored credentials onto eks_params and defaults to None
  for role_arn-only integrations.

Fixes Tracer-Cloud#969
muddlebee pushed a commit that referenced this pull request Apr 28, 2026
…ilder (#976)

* fix(eks): forward stored AWS integration credentials to k8s client builder

Fixes #969. Follow-up to #724.

Problem
-------
PR #724 let EKS be detected when the AWS integration is configured with
stored IAM user credentials (access_key_id + secret_access_key, no
role_arn), but build_k8s_clients still had no way to use those stored
credentials. After #724 it supports:

  1. role_arn -> STS AssumeRole
  2. ambient botocore chain (env / shared config / instance profile / IRSA)

The stored integration credentials (persisted by resolve_integrations
and surfaced in _eks_int['credentials']) were silently dropped. On hosts
that happen to have ambient credentials pointing elsewhere, this picks
up the wrong principal; on hosts without ambient credentials the call
fails entirely, even though the integration has perfectly usable keys.

Fix (approach #2, pre-approved by Greptile in PR #724 review)
-------------------------------------------------------------
Forward the stored credentials end-to-end through the EKS pipeline:

1. detect_sources.py: set eks_params['credentials'] = _eks_int.get(
   'credentials') or None alongside the other EKS integration fields.

2. EKSListClustersTool._eks_creds (shared helper used by every EKS tool
   for its extract_params): include 'credentials' in the returned dict
   so it flows into each tool's kwargs.

3. Seven EKS tool functions accept a new `credentials` kwarg and forward
   it to build_k8s_clients (and surface it in their input_schema):
     EKSPodLogsTool, EKSListPodsTool, EKSEventsTool, EKSNodeHealthTool,
     EKSListDeploymentsTool, EKSDeploymentStatusTool,
     EKSListNamespacesTool.

4. eks_k8s_client.build_k8s_clients accepts an optional credentials
   kwarg. Resolution priority:
     1. explicit credentials kwarg (stored-integration, new)
     2. role_arn AssumeRole         (existing production path)
     3. ambient botocore chain      (#724 fallback)

The two secondary fixes from #724 are preserved (empty SessionToken
coerced to None, regional STS endpoint).

Addresses Greptile P1
---------------------
The initial commit only wired the two ends of the pipeline (detect_sources
and build_k8s_clients); Greptile's review of #976 correctly pointed out
that the shared _eks_creds helper and the seven tool call sites in between
were never updated, leaving the new credentials branch unreachable. This
revision wires credentials through every intermediate layer so each of
the seven EKS tools passes credentials=credentials to build_k8s_clients.

Test plan
---------
- py_compile passes on all 10 changed files.
- ruff check + ruff format clean on all 10 changed files.
- Regression safety: the role_arn (AssumeRole) and ambient-botocore
  branches are unchanged; the new explicit branch only runs when the
  credentials kwarg is truthy and has access_key_id + secret_access_key.

Related
-------
- Fixes #969.
- Follow-up to #724.
- Greptile pre-approved this approach in
  #724 (comment)
- Greptile P1 on #976 that drove the helper+tools fixups:
  #976 (comment)

* chore: trigger Greptile re-review after end-to-end credentials wiring

The amend (ac2d841) wired credentials through _eks_creds helper + 7 EKS
tool call sites per Greptile P1 on the initial commit b03928d. Pushing
an empty commit to ensure Greptile re-reviews the new diff (10 files,
+114/-11 vs the original 2 files, +50/-4).

* chore: second re-trigger Greptile re-review

First re-trigger commit (61f15ed) did not produce a Greptile re-review after ~4h. Pushing another empty commit to kick Greptile again so the end-to-end credentials wiring (helper + 7 EKS tools in ac2d841) gets a fresh review.

* fix(eks): thread stored credentials through EKSClient-backed tools

PR #976 wired stored AWS-integration credentials end-to-end for the
seven k8s-API EKS tools that route through `build_k8s_clients`, but
missed the four EKS tools that still go through `EKSClient`:
`list_eks_clusters`, `describe_eks_cluster`, `describe_eks_addon`,
`get_eks_nodegroup_health`. Their `_eks_creds` helper already exposes
`credentials`, but each tool absorbed it via `**_kwargs` and never
forwarded it, so `EKSClient._build` unconditionally called
`sts.assume_role(RoleArn="", ...)` and raised `ParamValidationError`
for IAM-user-only integrations at the cluster-discovery step.

Greptile flagged this exact gap on the (now-closed) parallel PR #990.
This commit closes it on #976.

Changes
-------
- `EKSClient.__init__` takes an optional `credentials` kwarg. `_build`
  prefers stored IAM user keys over STS AssumeRole; if neither is
  provided it raises `ValueError` (no silent fallthrough).
- New shared helper `_stored_credentials_to_aws_creds` normalises the
  stored dict into the AssumeRole-shaped triplet (empty `session_token`
  → None, both required keys present). `build_k8s_clients` now uses
  the same helper so both code paths stay in sync.
- The four EKSClient-backed tools accept `credentials` in the signature,
  expose it in `input_schema`, and pass it into the constructor.

Tests
-----
- New `tests/services/test_eks_client.py` covers the helper (full
  triplet / empty session token / missing required keys / None / empty
  dict) and EKSClient credential resolution (stored skips AssumeRole,
  empty session token coerced, partial dict falls back to AssumeRole,
  no creds + no role_arn raises ValueError, AssumeRole external_id
  passthrough stays intact).
- `test_eks_list_clusters_tool.py` grows two forwarding tests: the
  kwarg threads into EKSClient, and stays None by default for existing
  role-based callers.

Quality gate
------------
make lint / format-check / typecheck / test-cov all green
(3328 passed, 2 skipped, 1 xfailed).

Related
-------
- Extends #976
- Fixes #969
- Closes Greptile P1 on #990
  (#990 (comment))

* refactor(eks): address maintainer review feedback

Addresses all five inline comments from @muddlebee on #976:

1. **eks_client.py** — add `from __future__ import annotations`
   (project style convention).

2. **utils module, public helper** — move
   `_stored_credentials_to_aws_creds` out of `eks_client.py` into a
   new `app/services/eks/utils.py` and drop the leading underscore
   (`stored_credentials_to_aws_creds`). The import contract between
   `eks_client` and `eks_k8s_client` is now explicit public API
   rather than two services reaching into each other for a private
   helper.

3. **eks_k8s_client.py** — promote the `import botocore.session`
   that lived in the ambient-fallback branch to the top-level imports.
   No circular-import concern (`botocore` is already imported at
   module level).

4. **eks_k8s_client.py** — normalize `frozen.token or None` at the
   source instead of relying on the downstream `or None` guard. This
   matches the behaviour of `stored_credentials_to_aws_creds` and
   removes the hidden dependency.

5. **All 11 EKS tool files** — tighten `credentials` parameter type
   annotation from `dict | None` to `dict[str, Any] | None`, matching
   the service-layer style and keeping mypy strict mode happy.

The `tests/services/test_eks_client.py` import also switches to the
new public name and module path.

Quality gate
------------
make lint          → All checks passed!
make format-check  → 906 files already formatted
make typecheck     → Success: no issues found in 392 source files
make test-cov      → 3328 passed, 2 skipped, 1 xfailed in 43.36s

No behaviour changes — pure refactor of public surface + style.

---------

Co-authored-by: weichao <[email protected]>
Co-authored-by: chaosreload <[email protected]>
mazenessam77 pushed a commit to mazenessam77/opensre that referenced this pull request May 5, 2026
Adds six targeted tests that isolate each of the gaps the reviewer
flagged so any regression in one stage fails its own test rather than
bleeding into the end-to-end pipeline test:

Gap Tracer-Cloud#1 — load_env_integrations:
  test_load_env_integrations_skips_rds_when_db_id_missing

Gap Tracer-Cloud#2 — _classify_service_instance:
  test_classify_service_instance_rds_remote_store_returns_flat_shape
  test_classify_service_instance_rds_skips_when_db_id_missing

Gap Tracer-Cloud#3 — detect_sources:
  test_detect_sources_propagates_rds_into_sources
  test_detect_sources_skips_rds_when_not_in_resolved_integrations
  test_detect_sources_skips_rds_when_db_id_blank

The remote-store classifier test asserts the absence of a "credentials"
key in the returned dict, which is the exact failure mode the reviewer
called out (generic fallback nests fields and silently breaks
rds_is_available).

25 RDS tests pass; mypy, lint, format clean.
muddlebee pushed a commit that referenced this pull request May 5, 2026
…1287)

* feat(rds): add RDS integration with describe instance and events tools

- Add app/integrations/rds.py for source detection and param extraction
- Add RDSDescribeInstanceTool for instance status and configuration
- Add RDSEventsTool for failover, maintenance, and parameter events
- Use shared aws_sdk_client read-only allowlist for safety
- Add unit tests for integration helpers and both tools
- Add "rds" to EvidenceSource literal in app/types/evidence.py

Closes #125

* fix(rds): honor RDS_REGION env var in both config and param extraction

Greptile review caught two related issues in app/integrations/rds.py:

- rds_config_from_env: env_str("AWS_REGION", DEFAULT_RDS_REGION) always
  returned a non-empty string, making the RDS_REGION fallback unreachable
  (P1 dead code).
- rds_extract_params: only consulted AWS_REGION, never RDS_REGION,
  diverging from the config path (P2 inconsistency).

Both functions now resolve region as: source-supplied > AWS_REGION >
RDS_REGION > DEFAULT_RDS_REGION. Drops the now-unused os import.

Adds three tests covering the previously-uncovered fallback paths:
RDS_REGION-only in extract, RDS_REGION-only in config-from-env, and
default when neither env var is set.

* chore(rds): apply ruff lint and format fixes

- ruff check --fix reorganized imports in tests/tools/test_rds_tools.py
- ruff format reformatted app/integrations/rds.py

* docs(rds): add RDS integration documentation

- New docs/rds.mdx covering setup, env vars, IAM permissions, the two
  tools, use cases, and troubleshooting. References docs/aws.mdx for
  shared AWS credential setup rather than duplicating it.
- Register the page under "Data and workflow systems" in docs/docs.json.

Documents the RDS_REGION fallback semantics so users understand the
region resolution order: source dict > AWS_REGION > RDS_REGION > default.

* fix(rds): sanitize AWS SDK errors and warn on multiple instances

- Replace raw AWS error forwarding with generic messages in both
  RDSDescribeInstanceTool and RDSEventsTool; full errors are logged
  server-side only to prevent leaking account IDs, ARNs, or regions.
- Add warning log when DescribeDBInstances returns more than one
  instance, since only the first is used.
- Update test assertions to match the new sanitized error messages.

* test(rds): add coverage for multiple-instances warning path

Verifies that when DescribeDBInstances returns >1 record, the tool
uses the first instance and emits a warning log.

* fix(rds): wire RDS into env discovery, classification, and source detection

Reviewer found three real wiring gaps that meant the documented setup
path never reached the planner. The integration helpers were correct but
nothing else in the pipeline knew about RDS:

- app/integrations/_catalog_impl.py::load_env_integrations now
  registers an "rds" record when rds_config_from_env returns a
  configured instance. Previously env-set RDS_DB_INSTANCE_IDENTIFIER
  was a no-op.
- app/integrations/_catalog_impl.py::_classify_service_instance gains
  an "rds" branch that returns a flat shape ({db_instance_identifier,
  region, integration_id}) instead of falling through to the generic
  handler that nests fields under "credentials" — which made
  rds_is_available always return False for store-backed configs.
- app/nodes/plan_actions/detect_sources.py now copies the resolved
  rds integration into sources["rds"], mirroring the rabbitmq/mariadb
  blocks. Without this, even a correctly-resolved integration would
  not be visible to rds_is_available.

Adds tests/integrations/test_rds.py::test_rds_env_discovery_to_sources_pipeline
which walks the full env -> load_env_integrations -> _classify_service_instance
-> detect_sources path end to end and asserts each stage produces the
expected shape, so this regression cannot reappear silently.

* test(rds): add focused regression coverage for the three discovery gaps

Adds six targeted tests that isolate each of the gaps the reviewer
flagged so any regression in one stage fails its own test rather than
bleeding into the end-to-end pipeline test:

Gap #1 — load_env_integrations:
  test_load_env_integrations_skips_rds_when_db_id_missing

Gap #2 — _classify_service_instance:
  test_classify_service_instance_rds_remote_store_returns_flat_shape
  test_classify_service_instance_rds_skips_when_db_id_missing

Gap #3 — detect_sources:
  test_detect_sources_propagates_rds_into_sources
  test_detect_sources_skips_rds_when_not_in_resolved_integrations
  test_detect_sources_skips_rds_when_db_id_blank

The remote-store classifier test asserts the absence of a "credentials"
key in the returned dict, which is the exact failure mode the reviewer
called out (generic fallback nests fields and silently breaks
rds_is_available).

25 RDS tests pass; mypy, lint, format clean.

---------

Co-authored-by: code wizard <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant