Create a new IDE specific secret scanning route#900
Conversation
Differences between the current route and new one: * New route passes the JSON:API based format for the rule so the IDE is not in the job of translation. This allows the K9 team to evolve the rule inputs without involving the IDE team * New route disables the secret validation inside of the binary instead of having the IDE drop the fields on the floor before passing the rules to the binary * Some logic was removed like # of rules and file name validation. Both seemed out of the realm for the IDE route to enforce Secret scaner cache changed to work with both routes by taking the JSON->SecretReule logic as a lambda.
|
🎯 Code Coverage (details) 🔗 Commit SHA: 3b75bf1 | Docs | Datadog PR Page | Give us feedback! |
There was a problem hiding this comment.
Pull request overview
Adds a new IDE-specific secrets scanning endpoint that accepts JSON:API rule payloads, disables online secret validation for IDE usage, and adapts the scanner cache to support multiple rule-parsing formats.
Changes:
- Introduces
disable_validationinAnalysisOptionsand gates SDS match validation behind it. - Refactors
SecretScannerCacheto accept a caller-provided rule parser on cache miss; updates existing secret-scan path to use it. - Adds
/ide/v1/secrets/scanendpoint (mounted under/ide) with end-to-end tests and updates IDE routing + request samples.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/static-analysis-server/src/request.rs | Updates AnalysisOptions construction to use ..Default::default() for new fields. |
| crates/static-analysis-kernel/src/analysis/analyze.rs | Updates test AnalysisOptions literals for new fields via ..Default::default(). |
| crates/secrets/src/scanner.rs | Adds disable_validation gate around scanner.validate_matches. |
| crates/common/src/analysis_options.rs | Adds disable_validation: bool with default false. |
| crates/bins/src/bin/datadog_static_analyzer_server/secret_scanner_cache.rs | Refactors cache API to accept a parsing lambda on miss; updates tests. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/secrets/models.rs | Defines IDE request/response models for secret scanning. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/secrets/mod.rs | Adds IDE secrets module + end-to-end endpoint tests. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/secrets/endpoints.rs | Implements IDE secrets scan route using JSON:API rule parsing + disables validation. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/requests.http | Updates sample requests (host + config parse examples). |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/mod.rs | Registers the new IDE secrets route. |
| crates/bins/src/bin/datadog_static_analyzer_server/endpoints.rs | Updates legacy secret scanning route to use the new cache API with a parser closure. |
| crates/bins/src/bin/datadog-static-analyzer.rs | Updates AnalysisOptions literal to include new field via default update. |
| crates/bins/src/bin/datadog-static-analyzer-test-ruleset.rs | Updates AnalysisOptions literal to include new field via default update. |
| crates/bins/src/bin/datadog-static-analyzer-git-hook.rs | Updates AnalysisOptions literal to include new field via default update. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Callers supply the parser used on cache miss. The hash key is the raw JSON bytes, | ||
| /// so callers that feed the same raw rules share the same cache entry regardless of | ||
| /// which parser they use. |
There was a problem hiding this comment.
The cache key is derived only from the raw JSON bytes, but the docstring claims callers “share the same cache entry regardless of which parser they use”. That isn’t guaranteed: two different parsers could produce different Vec<SecretRule> from identical raw JSON, and a later call would silently reuse the first parsed rules/scanner. Consider either (a) including a parser/version identifier in the cache key, or (b) tightening the contract/doc to state that the parser must be deterministic and semantically equivalent for a given raw_rules payload (and ideally enforce this by only exposing route-specific helpers).
| SecretRule::try_from(api_rule) | ||
| .map_err(|e| format!("Failed to convert secret rule: {e}")) |
There was a problem hiding this comment.
SecretRule::try_from(api_rule) can panic for certain malformed JSON: the TryFrom<SecretRuleApiMatchValidation> implementation uses expect(...) for required fields when match_validation.type == "CustomHttp" (e.g., missing endpoint/http_method). Because this endpoint accepts rule JSON from the IDE, a crafted request could trigger a panic in the blocking task and turn a client error into an internal error/DoS. Add explicit validation before calling try_from, or wrap the conversion in std::panic::catch_unwind and return a structured “Failed to convert secret rule” error instead of panicking.
| SecretRule::try_from(api_rule) | |
| .map_err(|e| format!("Failed to convert secret rule: {e}")) | |
| match std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | |
| SecretRule::try_from(api_rule) | |
| })) { | |
| Ok(rule) => rule.map_err(|e| format!("Failed to convert secret rule: {e}")), | |
| Err(_) => Err( | |
| "Failed to convert secret rule: malformed secret rule payload" | |
| .to_string(), | |
| ), | |
| } |
There was a problem hiding this comment.
How does k9 want to handle this? Should try_from not panic, this affects the CLI so unsure on what route to take
There was a problem hiding this comment.
Yeah, try_from should never panic. The only problematic thing I see is:
(Unless I am missing something). However, it's not clear to me if that data is controlled by our backend or potentially the user. If it's our backend, I'm less worried about an unwrap here, but we would want to convert it to an expect specifically so we can explain that assumption.
Otherwise, if it's user-provided data, we should probably use unwrap_or to set a reasonable default.
There was a problem hiding this comment.
This should be coming from the backend since its triggered by the JSON:API format -> SecretRule conversion
That code already will panic on Err:
So having this return Err seems best to me? Also the panics at
There was a problem hiding this comment.
492a026 should remove all the panics from the code path
| SecretRule::try_from(api_rule) | ||
| .map_err(|e| format!("Failed to convert secret rule: {e}")) |
There was a problem hiding this comment.
Yeah, try_from should never panic. The only problematic thing I see is:
(Unless I am missing something). However, it's not clear to me if that data is controlled by our backend or potentially the user. If it's our backend, I'm less worried about an unwrap here, but we would want to convert it to an expect specifically so we can explain that assumption.
Otherwise, if it's user-provided data, we should probably use unwrap_or to set a reasonable default.
What problem are you trying to solve?
IDE-5981
Differences between the current route and new one:
Secret scanner cache changed to work with both routes by taking the JSON->SecretReule logic as a lambda.