fix(rocky-ai): credential redaction + testgen path canonicalization + token budget#287
Merged
hugocorreia90 merged 1 commit intomainfrom Apr 29, 2026
Merged
Conversation
Three issues from the rocky-ai maintenance sweep.
6a (HIGH/SEC): AiConfig stored api_key as a raw String while deriving
Debug + Serialize. Any future tracing call that included `?config` would
exfiltrate the raw key into log files. Wrap api_key in
rocky_core::redacted::RedactedString so `{:?}` prints `***`; expose the
raw value only at the outbound HTTP header. Add a regression test that
asserts Debug output never contains the raw secret.
6b (HIGH/SEC): testgen::save_tests built file paths from LLM-controlled
assertion.name without sanitizing `..`, `/`, `\\`, leading dots, or empty
strings — direct path traversal from model output to the local
filesystem. Validate both the model name and each assertion name against
a strict allow-list (ASCII alphanumeric + `_` + `-`, non-empty,
non-dotted) before constructing the output path. Reject with
io::ErrorKind::InvalidInput on any unsafe input. Add the requested
`assertion.name = "../../../etc/passwd"` regression test plus four
sibling tests for absolute paths, dotted names, unsafe model names, and
the happy path.
6c (MED/PERF + cost): max_tokens was hard-coded to 4096 and the
compile-verify retry loop had no global cost ceiling — three retries
could each return up to max_tokens, billing N x max_tokens before
failing. Introduce a new `[ai] max_tokens = N` knob in rocky.toml
(default 4096, preserving existing behaviour). Use it both as the
per-request cap on the Anthropic Messages API and as a cumulative
output-token budget tracked across retries; when the running total
exceeds the budget, fail-stop with a new TokenBudgetExceeded error
instead of issuing another retry. Document the knob in the AI section
of the configuration reference and the AI features guide.
Test plan:
- cargo test -p rocky-ai (26 passed, including the new path-traversal
and Debug-leak regression tests)
- cargo test -p rocky-core (1122 + sibling test suites passed; covers
the new AiSection field on RockyConfig)
- cargo test -p rocky-cli (296 passed)
- cargo clippy --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- just codegen (regenerated rocky_project schema, vscode TS types,
dagster Pydantic model)
- cd integrations/dagster && uv run pytest (458 passed)
5 tasks
hugocorreia90
added a commit
that referenced
this pull request
Apr 29, 2026
Engine 1.18.0 ships the rocky preview workflow end-to-end (#279, #280, #281, #282), the [budget].max_bytes_scanned threshold (#288), the audit-sweep closeout (#283, #285–#287, #290–#293), and the rocky-server auth + CORS gate (#291). Dagster 1.15.0 picks up the regenerated Pydantic models for the rocky preview surface and ships the P1 cluster (#289) + FR-014 follow-on (#284). VS Code 1.10.0 regenerates TypeScript bindings for rocky preview and RunCostSummary.total_bytes_scanned. See per-artifact CHANGELOG entries for the full breakdown.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Three issues from the rocky-ai maintenance sweep, landed together since they all touch the same crate and the third one cascades into
rocky-coreand the schema codegen.Issue 6a — credential redaction (HIGH/SEC)
AiConfigstoredapi_key: Stringwhile derivingDebug + Serialize. Any futuretracing::error!(?config)(ortracing::debug!(?self.config)insideLlmClient) would write the raw API key straight into log files.Fix: wrap
api_keyinrocky_core::redacted::RedactedString, which masksDebugandDisplayto***by design. The raw value is exposed only at the outbound HTTP boundary incall_anthropicviaRedactedString::expose(). Added adebug_does_not_leak_api_keyregression test that assertsformat!("{config:?}")never contains the secret and always contains***.Issue 6b — testgen path traversal (HIGH/SEC)
testgen::save_testsbuilt file paths by joiningtests_dirwithformat!("{model}_{name}.sql", name = assertion.name.replace(' ', "_").to_lowercase()). The LLM controlsassertion.nameend-to-end, and..///\slipped straight throughreplace(' ', "_")— direct path traversal from model output to the filesystem.Fix: introduced a strict allow-list (
is_safe_path_component) — ASCII alphanumeric,_,-, non-empty, no leading dot — and validated both the model name and every assertion name before anyPath::join. Unsafe input rejects withio::ErrorKind::InvalidInput. Added the requestedassertion.name = "../../../etc/passwd"regression test plus four siblings covering absolute paths, dotted names, unsafe model names, and the happy path.Issue 6c — max_tokens hardcoded + unbounded retry budget (MED/PERF + cost)
call_anthropichard-codedmax_tokens: 4096andgenerate_modelretried 3× with no aggregate token / cost / time ceiling. A pathological run could pay for3 × 4096output tokens before fail-stopping.Fix:
[ai] max_tokens = Nconfig block inrocky.toml(default4096, preserving prior behaviour) — backed by a newAiSectionstruct wired intoRockyConfig.AiConfig::max_tokensis sourced from that config block inmake_clientand used both as the per-request cap on the Anthropic Messages API and as the cumulative output-token budget across retries.generate_modelnow sumsresponse.output_tokensacross attempts; when the running total exceeds the budget, it fail-stops with a newAiError::TokenBudgetExceededvariant instead of issuing another retry.docs/src/content/docs/reference/configuration.md(new## [ai]section) and thedocs/src/content/docs/guides/ai-features.mdsetup guide.The Anthropic API caps each individual response at the per-request
max_tokens, so attempt 1 alone can never trip the budget. The fail-stop kicks in only on attempt ≥ 2 with runaway accumulation — preserving typical-case behaviour while bounding the worst case.Codegen cascade
Adding
AiSectiontoRockyConfigupdates the autogeneratedrocky_project.schema.json, the VS Code project schema and TypeScript types, and the Dagster Pydantic model. All regenerated viajust codegenand committed in this PR socodegen-drift.ymlstays green.Test plan
cd engine && cargo test -p rocky-ai— 26 passed (includes new Debug-leak + path-traversal regression tests)cd engine && cargo test -p rocky-core— 1122 passed (covers newAiSectionfield onRockyConfig)cd engine && cargo test -p rocky-cli— 296 passedcd engine && cargo clippy -p rocky-ai --all-targets -- -D warnings— cleancd engine && cargo clippy --all-targets -- -D warnings— cleancd engine && cargo fmt --check— cleanjust codegen— regeneratedrocky_project.schema.json,editors/vscode/schemas/rocky-project.schema.json,editors/vscode/src/types/generated/rocky_project.ts,integrations/dagster/src/dagster_rocky/types_generated/rocky_project_schema.pycd integrations/dagster && uv run pytest— 458 passed