Skip to content

add synthetic RDS postgres RCA suite#194

Merged
davincios merged 17 commits intomainfrom
feature/rds-synthetic-testing
Mar 31, 2026
Merged

add synthetic RDS postgres RCA suite#194
davincios merged 17 commits intomainfrom
feature/rds-synthetic-testing

Conversation

@davincios
Copy link
Copy Markdown
Contributor

Summary

  • add a bundled synthetic RDS PostgreSQL scenario suite with fixtures, loader utilities, and suite runner coverage
  • extend RCA keyword extraction and evidence rendering to understand RDS metrics, RDS events, and Performance Insights inputs
  • add a dedicated Makefile entrypoint for running the synthetic benchmark suite

Test plan

  • pytest tests/synthetic_testing -q
  • make lint
  • make typecheck

Made with Cursor

Add bundled synthetic RDS PostgreSQL scenarios and extend RCA evidence handling so the benchmark suite can exercise database-specific failure modes end to end.

Made-with: Cursor
Keep starter alert payload templates scoped to the CLI surface so the top-level app package stays focused on shared runtime logic.

Made-with: Cursor
Made-with: Cursor
Comment thread tests/synthetic_testing/schemas.py Fixed
Comment thread tests/synthetic_testing/schemas.py Fixed
@davincios davincios marked this pull request as draft March 30, 2026 23:51
Comment thread tests/synthetic_testing/mock_grafana/formatters.py Fixed
Resolved conflicts:
- Makefile: kept test-rds-synthetic target alongside main's test-rca-grafana
- app/cli/alert_templates_test.py: accepted main's deletion

Made-with: Cursor
Comment thread tests/synthetic_testing/mock_grafana/formatters.py Fixed
Comment thread app/cli/tests/discover.py Fixed
Comment thread tests/synthetic_testing/mock_grafana/backend.py Fixed
Comment thread tests/synthetic_testing/mock_grafana/backend.py Fixed
Comment thread tests/synthetic_testing/mock_grafana/backend.py Fixed
@davincios davincios marked this pull request as ready for review March 31, 2026 22:43
Comment thread app/cli/tests/discover.py Fixed
Comment thread tests/synthetic_testing/mock_grafana/backend.py Fixed
Comment thread tests/synthetic_testing/mock_grafana/backend.py Fixed
Comment thread tests/synthetic_testing/mock_grafana/backend.py Fixed
davincios and others added 4 commits April 1, 2026 00:19
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
@davincios davincios merged commit 9a1b5c3 into main Mar 31, 2026
8 checks passed
@davincios davincios deleted the feature/rds-synthetic-testing branch March 31, 2026 23:51
muddlebee pushed a commit that referenced this pull request Apr 30, 2026
…inary (#1078) (#1090)

* fix(cli): graceful degradation when bundled binary lacks repo data files (#1078)

`opensre tests` and `opensre tests list` crashed with `FileNotFoundError`
when run from the PyInstaller-bundled binary, because
`packaging/opensre.spec` only collects `app/` data files and excludes
the `tests/` tree, the Makefile, and `tests/e2e/rca/`. The discover
helpers in `app/cli/tests/discover.py` then ran `iterdir()` /
`read_text()` on paths that don't exist at runtime.

The reported crash was in `_discover_rds_synthetic_scenarios`
(introduced by 9a1b5c3 "add synthetic RDS postgres RCA suite (#194)"),
but the same class of bug is present in `discover_make_targets` —
`MAKEFILE_PATH.read_text()` raises identically in the bundled binary.
A third spot, `run_synthetic_suite` in `app/cli/commands/tests.py`,
imports `from tests.synthetic.rds_postgres.run_suite` and would surface
a raw `ModuleNotFoundError` traceback instead of an actionable message.

Changes:
- `_discover_rds_synthetic_scenarios`: early `is_dir()` guard, returns []
- `discover_make_targets`: early `is_file()` guard on the Makefile, returns []
- `discover_rca_files`: explicit `is_dir()` guard added for documentation
  + future-proofing (current `Path.glob` already returns empty on a
  missing parent on CPython, but the contract was implicit)
- `run_synthetic_suite`: catches `ModuleNotFoundError` from the
  `tests.synthetic` import and raises `OpenSREError` with a "run from a
  source checkout" suggestion instead of a raw traceback

Tests:
- `tests/cli/test_discover.py`: 5 new tests covering each missing-source
  case + a fully-degraded `load_test_catalog` regression + a
  source-checkout sanity test that proves the existence guards don't
  break the happy path
- `tests/cli/test_cli_inventory.py`: 1 new test simulating a bundled
  binary by patching `__import__` to deny `tests.synthetic.*` — confirms
  `opensre tests synthetic` exits non-zero with a clean message and no
  `Traceback` / `ModuleNotFoundError` reaches the user

Live verification: in-process Click invocation with REPO_ROOT,
MAKEFILE_PATH, and RCA_DIR all redirected to an empty tmp dir
(matches the PyInstaller bundle layout). 17/17 e2e checks pass:
`opensre tests list` exits 0 with empty output, `--json tests list`
produces a valid empty array, `tests synthetic` surfaces the structured
error. Full unit suite: 4151 pass / 2 skipped / 1 xfailed.

Closes #1078

* fix(cli): pre-check synthetic data dir + narrow ModuleNotFoundError + EC2 guards (#1078)

Live PyInstaller bundle verification revealed the previous fix was
incomplete: the ``tests.synthetic.rds_postgres`` Python package is
bundled transitively (the static analyzer follows the literal
``from tests.synthetic… import …`` statement in
``app/cli/commands/tests.py`` regardless of the spec's
``collect_submodules`` filter), but the per-scenario data files under
``tests/synthetic/rds_postgres/<scenario>/`` are NOT bundled. So
``opensre tests synthetic`` was still crashing in the actual binary —
now with a ``FileNotFoundError`` from inside the scenario loader,
which my ``except ModuleNotFoundError`` did not catch.

Plus an audit of the rest of ``app/`` turned up the same class of bug
in ``app/cli/commands/deploy.py:323,329`` — ``opensre deploy ec2``
and ``opensre deploy ec2 --down`` directly import from
``tests.deployment.ec2.infrastructure_sdk`` without a guard.

Changes:

- ``run_synthetic_suite`` now pre-checks that
  ``REPO_ROOT / tests / synthetic / rds_postgres`` exists *before*
  the import, so the structured error fires regardless of whether the
  Python package or the data dir is the missing piece.
- Both ``ModuleNotFoundError`` catches narrow on ``exc.name`` so an
  unrelated transitive missing dep (e.g. ``psycopg``, ``boto3``) bubbles
  up as the real cause instead of being mis-reported as "not bundled".
- ``deploy_ec2`` wraps both the destroy and deploy imports in the same
  ``try/except`` pattern, raising a structured ``OpenSREError`` with a
  ``pip install -e .`` suggestion. New helper
  ``_ec2_deploy_not_bundled_error`` keeps the message in one place.
- Suggestion text on both errors updated from "clone the repo" to
  ``pip install -e .`` so pipx/Homebrew users get an actionable path.
- Test helper ``_patch_discover_paths`` extracted to dedupe
  ``monkeypatch.setattr("app.cli.tests.discover.X", ...)`` repetition.

Test additions:

- ``test_rds_synthetic_returns_empty_when_dir_present_but_empty``
- ``test_rds_synthetic_skips_underscore_and_pycache_entries``
- ``test_rds_synthetic_enriches_display_name_from_scenario_yml``
- ``test_rds_synthetic_tolerates_malformed_scenario_yml``
- ``test_tests_synthetic_clean_error_when_data_dir_missing`` — covers
  the actual bundled-binary failure mode that live verification found
- ``test_tests_synthetic_unrelated_module_not_found_propagates`` —
  pins the narrow-catch contract: ``psycopg`` missing must not show
  the "not bundled" message
- ``test_deploy_ec2_clean_error_when_not_bundled`` (parametrized for
  both ``deploy ec2`` and ``deploy ec2 --down``)
- ``test_deploy_ec2_unrelated_module_not_found_propagates``
- Existing ``test_tests_synthetic_clean_error_when_module_not_bundled``
  tightened: ``exit_code == 1``, asserts contractual error strings.

Live PyInstaller bundle verification (rebuild + run from ``dist/opensre``):

- ``./dist/opensre tests list`` → exit 0, empty stdout
- ``./dist/opensre --json tests list`` → exit 0, ``[]``
- ``./dist/opensre tests synthetic --scenario 001-replication-lag``
  → exit 1, structured error: "The synthetic RDS PostgreSQL suite is
  not available in this build" + ``pip install -e .`` suggestion. No
  ``Traceback`` / ``FileNotFoundError`` / ``ModuleNotFoundError``
  reaches the user. ← this is the failure mode the previous fix missed.

Full unit suite: 4160 pass / 2 skipped / 1 xfailed (no regressions).
``ruff check``, ``ruff format --check`` clean. ``mypy`` clean on touched
modules (the 3 errors mypy reports are pre-existing missing-stub
warnings for ``pymysql`` in ``app/integrations/`` — unrelated).

* test(cli): tighten ``tmp_path`` annotations to ``Path`` (#1078 review nit)

Per @Ghraven's PR #1090 review — three of the new bundled-binary tests
had ``tmp_path: object`` (one) or no annotation (two), which loses
the actual ``pathlib.Path`` type and stops type-checkers from inferring
``.mkdir()`` / ``/`` operator on the parameter. Added the
``from pathlib import Path`` import and tightened all three
signatures to ``tmp_path: Path``.

* refactor(cli): address muddlebee's PR #1090 review nits

Three nits from @muddlebee's 2026-04-30 review:

1. Path duplication on ``app/cli/commands/tests.py:125`` — the
   ``REPO_ROOT / "tests" / "synthetic" / "rds_postgres"`` literal was
   computed in both ``run_synthetic_suite`` and
   ``_discover_rds_synthetic_scenarios``. Extracted a
   ``SYNTHETIC_SCENARIOS_DIR`` constant in ``app/cli/tests/discover.py``
   alongside ``MAKEFILE_PATH`` and ``RCA_DIR``; both call sites now
   import it. Single source of truth if the layout ever changes.

2. ``_synthetic_suite_not_bundled_error`` was defined *after* its two
   call sites in ``run_synthetic_suite``. Moved above for forward-
   declaration consistency with how ``_ec2_deploy_not_bundled_error``
   sits above ``deploy_ec2`` in ``deploy.py``.

3. ``test_deploy_ec2_unrelated_module_not_found_propagates`` only
   exercised the ``--down`` (destroy_remote) path. The non-``--down``
   (deploy_remote) path has its own try/except in ``deploy_ec2`` but
   no parallel test. Parametrized the test over both ``argv`` and the
   patched module name so a regression in either branch surfaces
   independently.

Test plumbing: ``_patch_discover_paths`` helper extended with a
``synthetic_dir`` kwarg so tests can patch ``SYNTHETIC_SCENARIOS_DIR``
directly (the new constant is bound at module load, so patching
``REPO_ROOT`` after import doesn't propagate). Existing tests updated
to use the new kwarg; semantics unchanged.

Verification:

- ``pytest tests/cli/test_discover.py tests/cli/test_cli_inventory.py``:
  35 / 35 pass (was 34, +1 from the new ``deploy_remote`` parametrize).
- Full unit suite: 4161 pass / 2 skipped / 1 xfailed (was 4160).
- ``ruff check`` and ``ruff format --check`` clean.
gitsofaryan pushed a commit to gitsofaryan/opensre that referenced this pull request May 3, 2026
…inary (Tracer-Cloud#1078) (Tracer-Cloud#1090)

* fix(cli): graceful degradation when bundled binary lacks repo data files (Tracer-Cloud#1078)

`opensre tests` and `opensre tests list` crashed with `FileNotFoundError`
when run from the PyInstaller-bundled binary, because
`packaging/opensre.spec` only collects `app/` data files and excludes
the `tests/` tree, the Makefile, and `tests/e2e/rca/`. The discover
helpers in `app/cli/tests/discover.py` then ran `iterdir()` /
`read_text()` on paths that don't exist at runtime.

The reported crash was in `_discover_rds_synthetic_scenarios`
(introduced by 9a1b5c3 "add synthetic RDS postgres RCA suite (Tracer-Cloud#194)"),
but the same class of bug is present in `discover_make_targets` —
`MAKEFILE_PATH.read_text()` raises identically in the bundled binary.
A third spot, `run_synthetic_suite` in `app/cli/commands/tests.py`,
imports `from tests.synthetic.rds_postgres.run_suite` and would surface
a raw `ModuleNotFoundError` traceback instead of an actionable message.

Changes:
- `_discover_rds_synthetic_scenarios`: early `is_dir()` guard, returns []
- `discover_make_targets`: early `is_file()` guard on the Makefile, returns []
- `discover_rca_files`: explicit `is_dir()` guard added for documentation
  + future-proofing (current `Path.glob` already returns empty on a
  missing parent on CPython, but the contract was implicit)
- `run_synthetic_suite`: catches `ModuleNotFoundError` from the
  `tests.synthetic` import and raises `OpenSREError` with a "run from a
  source checkout" suggestion instead of a raw traceback

Tests:
- `tests/cli/test_discover.py`: 5 new tests covering each missing-source
  case + a fully-degraded `load_test_catalog` regression + a
  source-checkout sanity test that proves the existence guards don't
  break the happy path
- `tests/cli/test_cli_inventory.py`: 1 new test simulating a bundled
  binary by patching `__import__` to deny `tests.synthetic.*` — confirms
  `opensre tests synthetic` exits non-zero with a clean message and no
  `Traceback` / `ModuleNotFoundError` reaches the user

Live verification: in-process Click invocation with REPO_ROOT,
MAKEFILE_PATH, and RCA_DIR all redirected to an empty tmp dir
(matches the PyInstaller bundle layout). 17/17 e2e checks pass:
`opensre tests list` exits 0 with empty output, `--json tests list`
produces a valid empty array, `tests synthetic` surfaces the structured
error. Full unit suite: 4151 pass / 2 skipped / 1 xfailed.

Closes Tracer-Cloud#1078

* fix(cli): pre-check synthetic data dir + narrow ModuleNotFoundError + EC2 guards (Tracer-Cloud#1078)

Live PyInstaller bundle verification revealed the previous fix was
incomplete: the ``tests.synthetic.rds_postgres`` Python package is
bundled transitively (the static analyzer follows the literal
``from tests.synthetic… import …`` statement in
``app/cli/commands/tests.py`` regardless of the spec's
``collect_submodules`` filter), but the per-scenario data files under
``tests/synthetic/rds_postgres/<scenario>/`` are NOT bundled. So
``opensre tests synthetic`` was still crashing in the actual binary —
now with a ``FileNotFoundError`` from inside the scenario loader,
which my ``except ModuleNotFoundError`` did not catch.

Plus an audit of the rest of ``app/`` turned up the same class of bug
in ``app/cli/commands/deploy.py:323,329`` — ``opensre deploy ec2``
and ``opensre deploy ec2 --down`` directly import from
``tests.deployment.ec2.infrastructure_sdk`` without a guard.

Changes:

- ``run_synthetic_suite`` now pre-checks that
  ``REPO_ROOT / tests / synthetic / rds_postgres`` exists *before*
  the import, so the structured error fires regardless of whether the
  Python package or the data dir is the missing piece.
- Both ``ModuleNotFoundError`` catches narrow on ``exc.name`` so an
  unrelated transitive missing dep (e.g. ``psycopg``, ``boto3``) bubbles
  up as the real cause instead of being mis-reported as "not bundled".
- ``deploy_ec2`` wraps both the destroy and deploy imports in the same
  ``try/except`` pattern, raising a structured ``OpenSREError`` with a
  ``pip install -e .`` suggestion. New helper
  ``_ec2_deploy_not_bundled_error`` keeps the message in one place.
- Suggestion text on both errors updated from "clone the repo" to
  ``pip install -e .`` so pipx/Homebrew users get an actionable path.
- Test helper ``_patch_discover_paths`` extracted to dedupe
  ``monkeypatch.setattr("app.cli.tests.discover.X", ...)`` repetition.

Test additions:

- ``test_rds_synthetic_returns_empty_when_dir_present_but_empty``
- ``test_rds_synthetic_skips_underscore_and_pycache_entries``
- ``test_rds_synthetic_enriches_display_name_from_scenario_yml``
- ``test_rds_synthetic_tolerates_malformed_scenario_yml``
- ``test_tests_synthetic_clean_error_when_data_dir_missing`` — covers
  the actual bundled-binary failure mode that live verification found
- ``test_tests_synthetic_unrelated_module_not_found_propagates`` —
  pins the narrow-catch contract: ``psycopg`` missing must not show
  the "not bundled" message
- ``test_deploy_ec2_clean_error_when_not_bundled`` (parametrized for
  both ``deploy ec2`` and ``deploy ec2 --down``)
- ``test_deploy_ec2_unrelated_module_not_found_propagates``
- Existing ``test_tests_synthetic_clean_error_when_module_not_bundled``
  tightened: ``exit_code == 1``, asserts contractual error strings.

Live PyInstaller bundle verification (rebuild + run from ``dist/opensre``):

- ``./dist/opensre tests list`` → exit 0, empty stdout
- ``./dist/opensre --json tests list`` → exit 0, ``[]``
- ``./dist/opensre tests synthetic --scenario 001-replication-lag``
  → exit 1, structured error: "The synthetic RDS PostgreSQL suite is
  not available in this build" + ``pip install -e .`` suggestion. No
  ``Traceback`` / ``FileNotFoundError`` / ``ModuleNotFoundError``
  reaches the user. ← this is the failure mode the previous fix missed.

Full unit suite: 4160 pass / 2 skipped / 1 xfailed (no regressions).
``ruff check``, ``ruff format --check`` clean. ``mypy`` clean on touched
modules (the 3 errors mypy reports are pre-existing missing-stub
warnings for ``pymysql`` in ``app/integrations/`` — unrelated).

* test(cli): tighten ``tmp_path`` annotations to ``Path`` (Tracer-Cloud#1078 review nit)

Per @Ghraven's PR Tracer-Cloud#1090 review — three of the new bundled-binary tests
had ``tmp_path: object`` (one) or no annotation (two), which loses
the actual ``pathlib.Path`` type and stops type-checkers from inferring
``.mkdir()`` / ``/`` operator on the parameter. Added the
``from pathlib import Path`` import and tightened all three
signatures to ``tmp_path: Path``.

* refactor(cli): address muddlebee's PR Tracer-Cloud#1090 review nits

Three nits from @muddlebee's 2026-04-30 review:

1. Path duplication on ``app/cli/commands/tests.py:125`` — the
   ``REPO_ROOT / "tests" / "synthetic" / "rds_postgres"`` literal was
   computed in both ``run_synthetic_suite`` and
   ``_discover_rds_synthetic_scenarios``. Extracted a
   ``SYNTHETIC_SCENARIOS_DIR`` constant in ``app/cli/tests/discover.py``
   alongside ``MAKEFILE_PATH`` and ``RCA_DIR``; both call sites now
   import it. Single source of truth if the layout ever changes.

2. ``_synthetic_suite_not_bundled_error`` was defined *after* its two
   call sites in ``run_synthetic_suite``. Moved above for forward-
   declaration consistency with how ``_ec2_deploy_not_bundled_error``
   sits above ``deploy_ec2`` in ``deploy.py``.

3. ``test_deploy_ec2_unrelated_module_not_found_propagates`` only
   exercised the ``--down`` (destroy_remote) path. The non-``--down``
   (deploy_remote) path has its own try/except in ``deploy_ec2`` but
   no parallel test. Parametrized the test over both ``argv`` and the
   patched module name so a regression in either branch surfaces
   independently.

Test plumbing: ``_patch_discover_paths`` helper extended with a
``synthetic_dir`` kwarg so tests can patch ``SYNTHETIC_SCENARIOS_DIR``
directly (the new constant is bound at module load, so patching
``REPO_ROOT`` after import doesn't propagate). Existing tests updated
to use the new kwarg; semantics unchanged.

Verification:

- ``pytest tests/cli/test_discover.py tests/cli/test_cli_inventory.py``:
  35 / 35 pass (was 34, +1 from the new ``deploy_remote`` parametrize).
- Full unit suite: 4161 pass / 2 skipped / 1 xfailed (was 4160).
- ``ruff check`` and ``ruff format --check`` clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants