fix: overlapping keyword redaction and hardcoded investigation loop limit#520
Conversation
Greptile SummaryThis PR fixes two independent bugs: (1) the guardrails redaction engine could leave partial keywords unredacted when two keywords shared the same start position (e.g. Confidence Score: 5/5Safe to merge — both fixes are correct, well-tested, and narrowly scoped. Both bugs are clearly diagnosed and properly fixed. The sort key change in the redaction engine is minimal and provably correct. The constants extraction cleanly solves the circular-import problem. All remaining findings are P2 style/clarity notes that don't affect correctness or reliability. No files require special attention; the two P2 comments on app/pipeline/routing.py and app/nodes/root_cause_diagnosis/node.py are purely advisory. Important Files Changed
|
…imit Two bugs fixed: 1. Guardrails engine: overlapping keywords at the same start position (e.g. "secret" and "secret_key") could leave partial text unredacted. The sort key now includes match end position so the longest match is always processed first. 2. Root-cause diagnosis node: hardcoded `loop_count < 3` prevented recommendations from being generated on loops 3-4, even though MAX_INVESTIGATION_LOOPS=4 allows up to 4 iterations. Replaced the magic number with the shared constant.
Avoids circular import: app.pipeline.__init__ → graph → app.nodes → diagnosis node → app.pipeline.constants would fail because app.nodes is still partially initialized when graph.py tries to import it. Placing the constant in app/investigation_constants.py (outside both packages) breaks the cycle. Also fixes import ordering (ruff I001).
a4abcd5 to
fe8eec8
Compare
…fixes - 5 guardrails tests: overlapping keywords across rules, within rules, multiple occurrences, pattern+keyword overlap, full scan-apply cycle - 6 investigation loop tests: constant sharing, routing at all valid loop counts, diagnosis node at loop_count=3 (the previously broken case), full loop lifecycle from 0 to termination
The old max_investigation_loops: 3 was calibrated against the buggy hardcoded `loop_count < 3`. Now that the diagnosis node uses MAX_INVESTIGATION_LOOPS (4), investigations can legitimately run 4 loops. Updated all 14 answer.yml files accordingly.
|
Approved workflow to run :) |
Bring the overlapping-redaction fix branch forward onto main and keep the review-approved test cleanup so the PR is mergeable again. Made-with: Cursor
|
Thank you very much for your valuable contribution. |
…ive bookends ``GuardrailEngine.apply`` walked the sorted redact matches right-to-left with a single ``seen_end`` cursor and skipped any match whose end exceeded the cursor. When a wider match overlapped or contained an already-redacted narrower match, the wider match was silently dropped — leaving its prefix and suffix in the output. Concretely, with rules matching ``super_secret_token_value`` (wide) and ``secret_token`` (contained): input: "data super_secret_token_value end" before: "data super_[REDACTED:short]_value end" ← super_ and _value leak after: "data [REDACTED:long] end" The prior fix in Tracer-Cloud#520 only handled the same-start subcase by changing the secondary sort key; matches that started at different offsets but fully contained a narrower match still slipped through. Replace the single-cursor walk with a proper merge: 1. Sort redact matches by ``(start ASC, -end)`` so ties at the same offset keep the longest-keyword-wins behavior Tracer-Cloud#520 restored. 2. Sweep left-to-right, merging any match whose ``start`` falls before the current interval's ``end``. The representative rule for the merged interval is whichever contributing match had the largest original width, so the output reflects the most-specific rule that matched anywhere in the merged span. 3. Apply replacements right-to-left over the merged intervals so string indices stay valid as each redaction resizes the output. Adjacent matches (``A.end == B.start``) are intentionally not merged — they don't overlap. Disjoint matches produce independent redactions with intervening text preserved verbatim. Tests added in ``tests/test_guardrails/test_engine.py``: - contained-span union redaction (the regression from the bug report) - longest-rule-name semantics on a contained span - partial (non-containing) overlap produces one union redaction - disjoint matches stay separate, middle text preserved - adjacent matches stay separate (touching at one offset, not overlapping) - three-way transitive overlap chain collapses to one span - regex pattern + keyword on the same region: wider-wins applies across match kinds Existing coverage preserved: - ``test_overlapping_keyword_redaction_prefers_longest`` — same-start case - ``test_overlapping_keyword_same_rule_redaction`` — same-rule overlap - ``test_multiple_rules_on_same_span`` — audit + redact on same span Verification: - ``pytest``: 2774 pass, 1 skipped, 0 failures (was 2767 on main; +7 new) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module - Full ``tests/test_guardrails/test_engine.py`` suite: 43 pass, 0 fail
…ive bookends ``GuardrailEngine.apply`` walked the sorted redact matches right-to-left with a single ``seen_end`` cursor and skipped any match whose end exceeded the cursor. When a wider match overlapped or contained an already-redacted narrower match, the wider match was silently dropped — leaving its prefix and suffix in the output. Concretely, with rules matching ``super_secret_token_value`` (wide) and ``secret_token`` (contained): input: "data super_secret_token_value end" before: "data super_[REDACTED:short]_value end" ← super_ and _value leak after: "data [REDACTED:long] end" The prior fix in Tracer-Cloud#520 only handled the same-start subcase by changing the secondary sort key; matches that started at different offsets but fully contained a narrower match still slipped through. Replace the single-cursor walk with a proper merge: 1. Sort redact matches by ``(start ASC, -end)`` so ties at the same offset keep the longest-keyword-wins behavior Tracer-Cloud#520 restored. 2. Sweep left-to-right, merging any match whose ``start`` falls before the current interval's ``end``. The representative rule for the merged interval is whichever contributing match had the largest original width, so the output reflects the most-specific rule that matched anywhere in the merged span. 3. Apply replacements right-to-left over the merged intervals so string indices stay valid as each redaction resizes the output. Adjacent matches (``A.end == B.start``) are intentionally not merged — they don't overlap. Disjoint matches produce independent redactions with intervening text preserved verbatim. Tests added in ``tests/test_guardrails/test_engine.py``: - contained-span union redaction (the regression from the bug report) - longest-rule-name semantics on a contained span - partial (non-containing) overlap produces one union redaction - disjoint matches stay separate, middle text preserved - adjacent matches stay separate (touching at one offset, not overlapping) - three-way transitive overlap chain collapses to one span - regex pattern + keyword on the same region: wider-wins applies across match kinds Existing coverage preserved: - ``test_overlapping_keyword_redaction_prefers_longest`` — same-start case - ``test_overlapping_keyword_same_rule_redaction`` — same-rule overlap - ``test_multiple_rules_on_same_span`` — audit + redact on same span Verification: - ``pytest``: 2774 pass, 1 skipped, 0 failures (was 2767 on main; +7 new) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module - Full ``tests/test_guardrails/test_engine.py`` suite: 43 pass, 0 fail
…ive bookends (#780) * fix(guardrails): redact union of overlapping spans, never leak sensitive bookends ``GuardrailEngine.apply`` walked the sorted redact matches right-to-left with a single ``seen_end`` cursor and skipped any match whose end exceeded the cursor. When a wider match overlapped or contained an already-redacted narrower match, the wider match was silently dropped — leaving its prefix and suffix in the output. Concretely, with rules matching ``super_secret_token_value`` (wide) and ``secret_token`` (contained): input: "data super_secret_token_value end" before: "data super_[REDACTED:short]_value end" ← super_ and _value leak after: "data [REDACTED:long] end" The prior fix in #520 only handled the same-start subcase by changing the secondary sort key; matches that started at different offsets but fully contained a narrower match still slipped through. Replace the single-cursor walk with a proper merge: 1. Sort redact matches by ``(start ASC, -end)`` so ties at the same offset keep the longest-keyword-wins behavior #520 restored. 2. Sweep left-to-right, merging any match whose ``start`` falls before the current interval's ``end``. The representative rule for the merged interval is whichever contributing match had the largest original width, so the output reflects the most-specific rule that matched anywhere in the merged span. 3. Apply replacements right-to-left over the merged intervals so string indices stay valid as each redaction resizes the output. Adjacent matches (``A.end == B.start``) are intentionally not merged — they don't overlap. Disjoint matches produce independent redactions with intervening text preserved verbatim. Tests added in ``tests/test_guardrails/test_engine.py``: - contained-span union redaction (the regression from the bug report) - longest-rule-name semantics on a contained span - partial (non-containing) overlap produces one union redaction - disjoint matches stay separate, middle text preserved - adjacent matches stay separate (touching at one offset, not overlapping) - three-way transitive overlap chain collapses to one span - regex pattern + keyword on the same region: wider-wins applies across match kinds Existing coverage preserved: - ``test_overlapping_keyword_redaction_prefers_longest`` — same-start case - ``test_overlapping_keyword_same_rule_redaction`` — same-rule overlap - ``test_multiple_rules_on_same_span`` — audit + redact on same span Verification: - ``pytest``: 2774 pass, 1 skipped, 0 failures (was 2767 on main; +7 new) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module - Full ``tests/test_guardrails/test_engine.py`` suite: 43 pass, 0 fail * test(guardrails): cover production-rule overlaps and audit-under-merge Extend the overlap regression tests with two cases grounded in the exact patterns shipped in _STARTER_CONFIG (app/guardrails/cli.py), and with an audit-behavior assertion that demonstrates the match-level audit record remains complete even when the output-level redactions are merged: - test_real_world_api_key_and_aws_access_key_overlap: config: api_key=AKIAIOSFODNN7EXAMPLE aws_access_key pattern matches the [16:36] span; generic_api_token pattern matches the [8:36] span (contains aws_access_key). Pre-fix output leaked "api_key=". Merged output: "config: [REDACTED:generic_api_token]". - test_real_world_aws_secret_key_contains_aws_access_key: export aws_secret_access_key=AKIA<16>abcdefghijklmnopqrst aws_access_key matches [29:49]; aws_secret_key matches [7:69] and contains it. Pre-fix output leaked the label and the 40-char tail. Merged output: "export [REDACTED:aws_secret_key]". - test_audit_logger_records_every_match_even_when_merged: Confirms the AuditLogger still emits one entry per ScanMatch even when multiple matches merge into a single output redaction — a reviewer tracing audit -> output can still account for every rule that fired. Uses a real AuditLogger with a tmp_path JSONL. Verification: - pytest tests/test_guardrails/test_engine.py: 46 pass (was 43; +3) - pytest tests/: 2777 pass, 1 skipped, 0 failures - ruff check + format: clean * test(guardrails): assert merged redaction reaches every LLM call site Closes the gap between unit-level ``GuardrailEngine.apply`` tests and the four call sites that invoke it in the pipeline: - ``app/services/llm_client.py::LLMClient.invoke`` (Anthropic messages + system) - ``app/services/llm_client.py::BedrockLLMClient.invoke`` (same shape as Anthropic) - ``app/services/llm_client.py::OpenAILLMClient.invoke`` (OpenAI chat.completions) - ``app/nodes/chat.py::_apply_guardrails_to_messages`` (LangGraph chat node) Five new parametrized integration tests load a ruleset modeled on the shipped ``_STARTER_CONFIG`` — ``aws_access_key`` is a strict substring of ``generic_api_token`` when the token value is itself an AWS key — and assert that the downstream payload (captured by a stub that replaces the real API client) contains no fragment of either the label prefix or the key value: - ``test_anthropic_client_sends_merged_redaction``: ``LLMClient.invoke("Debug dump: api_key=AKIA... from config.yml")`` is captured at ``Anthropic.messages.create`` and asserted to carry a single ``[REDACTED:generic_api_token]`` span with no ``api_key=`` or ``AKIA`` fragment. - ``test_anthropic_system_prompt_also_redacted``: Same shape, but asserts the ``system`` kwarg (distinct code path from per-message ``content``) gets the merged treatment. - ``test_openai_client_sends_merged_redaction``: Same shape for ``OpenAILLMClient.invoke`` -> ``chat.completions.create``. - ``test_chat_node_emits_merged_redaction``: ``_apply_guardrails_to_messages`` on a LangGraph-style message list, asserts original is untouched (defensive copy) and the returned message has merged redaction. - ``test_contained_real_secret_fully_redacted_in_pipeline``: End-to-end regression guard. Asserts none of ``api_key=``, ``AKIA``, ``IOSFODNN``, ``7EXAMPLE`` appear anywhere in the downstream payload. Pre-fix main leaks ``api_key=`` and the surrounding characters. Verification: - ``pytest tests/``: 2782 pass (was 2777; +5 integration) - ``pytest tests/test_guardrails/``: 59 pass (46 engine + 13 integration) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module * refactor(guardrails): use NamedTuple for merged-span records Address Greptile P2 readability suggestion on PR #780. The interval-merge loop in ``GuardrailEngine._redact`` previously read and wrote the merged list as raw 4-tuples accessed by positional index (``merged[-1][1]``, ``merged[-1][3]``, etc.), which obscured what each slot meant. Replace the raw tuple with a typed ``_MergedSpan`` NamedTuple carrying named fields (``start``, ``end``, ``rule_name``, ``width``) so the loop reads naturally: merged[-1].end # was merged[-1][1] prev.width # was prev_width _MergedSpan(start, ...) # was (start, ..., width) No behavior change — same algorithm, same outputs, same 91 tests pass. Verification: - ``pytest tests/test_guardrails/``: 91 pass, 0 fail (unchanged) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module * refactor(guardrails): address PR #780 review nits — width naming, fixture dedupe, system-prompt assertion Three points from @muddlebee's 2026-04-28 review on PR #780: 1. ``_MergedSpan.width`` was ambiguous — it tracks the *original single ScanMatch* width used for representative-rule selection, not the merged span's ``end - start``. Renamed to ``source_width`` and expanded both the NamedTuple docstring and the ``_redact`` algorithm docstring to spell out the distinction (``A∩B then C`` chained- overlap example) so a future reader can't conflate the two. 2. ``_FakeMessages`` / ``_FakeClient`` and the OpenAI equivalents were duplicated verbatim across four tests in ``TestOverlappingRedactionReachesDownstream``. Extracted them into two module-level pytest fixtures, ``anthropic_capture`` and ``openai_capture``, each yielding ``(client, captured)``. Each test now reads as one line of setup + the actual assertion. The two shared fake-response factories ``_anthropic_fake_response`` and ``_openai_fake_response`` keep the response-shape literals in one place so a future Anthropic / OpenAI SDK change touches one location instead of four. 3. ``test_anthropic_system_prompt_also_redacted`` used ``captured.get("system") is not None`` which was both unclear in intent and silent on the more interesting failure mode (system role smuggled into the messages list rather than hoisted to the top-level ``system`` kwarg). Replaced with subscript (KeyError on absent key), a non-empty assertion, and a structural check that no entry in ``captured["messages"]`` carries ``role: system``. Fail mode is now obvious in pytest output. Verification: - ``pytest tests/test_guardrails/`` — 91 / 91 pass - Full unit suite — 3159 pass / 2 skipped / 1 xfailed - ``ruff check`` and ``ruff format --check`` clean on both files - ``mypy app/guardrails/`` clean (the 5 errors are pre-existing missing-stub warnings in ``app/integrations/`` unrelated to this PR)
Summary
"secret"and"secret_key") could leave partial text unredacted (e.g."[REDACTED:r1]_key=xyz"instead of"[REDACTED:r2]=xyz"). Fixed by adding match end position to the sort key so the longest match is always processed first.loop_count < 3prevented investigation recommendations from being generated on loops 3–4, even thoughMAX_INVESTIGATION_LOOPS = 4allows up to 4 iterations. Extracted the constant toapp/pipeline/constants.py(avoiding circular imports) and replaced both hardcoded checks.Test plan
test_overlapping_keyword_redaction_prefers_longest— verifies longest keyword wins across rulestest_overlapping_keyword_same_rule_redaction— verifies longest keyword wins within a single rulepytest tests/test_guardrails/test_engine.py)pytest tests/test_guardrails/ tests/tools/test_compaction.py)make test-cov,make lint,make typecheck) on environment with all deps installed