Skip to content

feat(evidence): minimize sensitive content in evidence bundles#1414

Merged
mchmarny merged 4 commits into
NVIDIA:mainfrom
njhensley:feat/minimal-evidence
Jun 23, 2026
Merged

feat(evidence): minimize sensitive content in evidence bundles#1414
mchmarny merged 4 commits into
NVIDIA:mainfrom
njhensley:feat/minimal-evidence

Conversation

@njhensley

Copy link
Copy Markdown
Member

Summary

Evidence bundles are now minimized by default: snapshot.yaml is reduced to a fail-closed allowlist and per-test CTRF stdout/message are omitted, keeping sensitive operational detail out of the published artifact. A new --full flag restores the raw payloads.

Motivation / Context

An evidence bundle shipped the full validate-time snapshot and per-phase CTRF stdout, publishing detail that is sensitive to expose: node names, cloud provider instance IDs, the full node label/taint set, kernel/sysctl tuning, loaded modules, systemd service config, and raw container logs.

The verifiable trust signal is digest-based — the signed in-toto predicate commits to artifacts by hash and carries the derived fingerprint, criteria-match, and per-phase counts. The snapshot and CTRF logs are the backing content those digests point at, not the signal itself, so they can be minimized without weakening the cryptographic binding. This extends the same reasoning ADR-007 already applied to logs.

Fixes: #1345
Related: N/A

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • Build/CI/tooling

Component(s) Affected

  • CLI (cmd/aicr, pkg/cli)
  • Docs/examples (docs/, examples/)
  • Other: pkg/evidence/{redact,attestation,verifier}

Implementation Notes

  • New package pkg/evidence/redact — pure, deterministic, non-mutating Snapshot() and CTRF(). The snapshot allowlist is fail-closed: only enumerated measurement subtypes/keys survive, so a field a future collector adds is dropped until explicitly allowlisted. v1 keeps K8s.server, an allowlisted K8s.node subset (drops source-node, provider-id, container-runtime-id), GPU.hardware, OS.release, NodeTopology.summary; drops OS.{grub,sysctl,kmod}, all of SystemD, NodeTopology.{label,taint}, subtype Context, and header metadata.source-node.
  • Fingerprint is computed from the raw snapshot, not the shipped one. fingerprint.FromMeasurements reads fields the allowlist drops (e.g. NodeTopology.label), so deriving the predicate's fingerprint/criteriaMatch from the full snapshot keeps the conformance signal identical whether or not the shipped bytes were minimized. The verifier trusts the fingerprint via the signature; it never recomputes it from snapshot.yaml.
  • Self-verifying. The predicate digests (recipe, CTRF, BOM, manifest) cover the bytes that actually ship, so a minimal bundle passes aicr evidence verify. The applied policy is recorded in a new predicate.redaction block (omitempty), so a --full predicate is byte-identical to prior output.
  • Default behavior change: aicr validate --emit-attestation now writes a redacted bundle. No documented downstream consumer (recipe-health, CNCF submission) reads the bundle's snapshot.yaml back. --full is the opt-out.
  • Not a schema bump: redaction is an additive omitempty predicate field, consistent with ADR-007's additive-field policy; PredicateSchemaVersion stays 1.0.0.

Testing

go test ./pkg/evidence/... ./pkg/cli/...          # pass
golangci-lint run -c .golangci.yaml ./...          # 0 issues (pinned v2.12.2)
  • pkg/evidence/redact coverage 98.4%; new round-trip test confirms a minimal bundle self-verifies (TestVerify_MinimalBundleSelfVerifies); --full byte-identity and fingerprint-from-raw decoupling are guarded by tests.
  • Not runnable in my local sandbox (no C compiler for -race; sigstore.dev TUF/Fulcio network blocked) — the -race run and the pkg/bundler/attestation/pkg/trust sigstore tests rely on CI. Those packages do not import the changed code.

Risk Assessment

  • Low — Isolated change, well-tested, easy to revert
  • Medium — Changes the default contents of the published evidence bundle
  • High — Breaking change, affects critical paths, or complex rollout

Rollout notes: Default --emit-attestation output is now redacted; operators who want the prior full payloads pass --full. Verification is unaffected (minimal bundles self-verify; --full predicates are byte-identical to prior output). No migration required.

Checklist

  • Tests pass locally (make test with -race) — non-race suite passes locally; -race requires CI (no local C compiler)
  • Linter passes (make lint) — pinned golangci-lint v2.12.2, 0 issues
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed (ADR-007, cli-reference, validation, artifact-verification)
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

…ault

Evidence bundles shipped the full snapshot and per-phase CTRF stdout,
publishing operational detail that is sensitive to expose: node names,
provider instance IDs, the node label/taint set, kernel/sysctl tuning,
loaded modules, systemd config, and raw container logs.

The verifiable signal is digest-based — the signed predicate commits to
artifacts by hash and carries the derived fingerprint, criteria-match,
and per-phase counts. The snapshot and CTRF logs are backing content
those digests point at, not the signal itself, so they can be minimized
without weakening the binding.

Bundles are now minimized by default; --full restores the raw payloads:
- snapshot.yaml is reduced to a fail-closed allowlist (unlisted
  measurement types/subtypes/keys drop by default)
- per-test CTRF stdout/message are omitted
- the predicate records the applied policy in a new `redaction` block

The predicate's fingerprint/criteriaMatch are still computed from the
raw snapshot, so the conformance signal is unchanged; only the shipped
bytes shrink. Digests cover whatever ships, so a minimal bundle
self-verifies with `aicr evidence verify`. A --full predicate is
byte-identical to prior output (redaction omitted).

Adds pkg/evidence/redact, the --full flag, verifier output for the
policy, and updates ADR-007 and CLI/verification docs.

Closes NVIDIA#1345
@njhensley njhensley requested a review from a team as a code owner June 23, 2026 06:19
@njhensley njhensley added the theme/supply-chain SLSA, SBOM, Sigstore, and provenance verification label Jun 23, 2026
@github-actions

Copy link
Copy Markdown
Contributor

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4c498a8d-8429-4d23-bf06-3ff7dcde51cc

📥 Commits

Reviewing files that changed from the base of the PR and between 1535075 and 7c59e8c.

📒 Files selected for processing (1)
  • docs/user/cli-reference.md

📝 Walkthrough

Walkthrough

This PR implements minimal-by-default evidence bundle redaction. A new pkg/evidence/redact package provides two pure, fail-closed transforms: Snapshot applies an allowlist that drops sensitive header fields (source-node) and measurement subtypes/keys, and CTRF strips per-test Stdout and Message while preserving pass/fail signal. A new RedactionInfo type is added to the attestation predicate schema to record applied policy. emit.go gains an applyRedaction helper that orchestrates both transforms and threads provenance into BuildOptions.Redaction; fingerprint and criteria are computed from the raw snapshot regardless of redaction. A new --full boolean CLI flag on aicr validate --emit-attestation bypasses redaction and omits the redaction predicate field. The verifier's markdown report gains a "Redaction" section when the predicate contains redaction info. ADR-007 and user-facing docs are updated to reflect the new default behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main feature: minimizing sensitive content in evidence bundles by default with a new --full flag.
Description check ✅ Passed The description comprehensively explains the motivation (sensitive operational details in bundles), the implementation approach (fail-closed allowlist, CTRF redaction, --full flag, new redact package), and confirms verification signal preservation through digest-based binding.
Linked Issues check ✅ Passed All acceptance criteria from issue #1345 are addressed: minimal bundle definition documented in ADR-007, sensitive snapshot/CTRF fields redacted by default, minimal bundles self-verify (TestVerify_MinimalBundleSelfVerifies confirms), deterministic redaction with policy recorded in predicate.redaction, and docs updated (007-recipe-evidence.md, cli-reference, validation, artifact-verification).
Out of Scope Changes check ✅ Passed All changes align with the stated objectives: redact package for snapshot/CTRF minimization, CLI --full flag integration, predicate schema extension for redaction provenance, and comprehensive documentation updates. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/evidence/redact/redact.go`:
- Around line 211-223: The copyHeaderDroppingSourceNode function currently uses
a blacklist approach that only drops the source-node key and passes through all
other metadata keys, which is a fail-open posture that risks exposing future
metadata keys without explicit approval. Switch to an allowlist approach where
you define a set of allowed metadata keys that are safe to include, and only
copy those specific keys from the input header's Metadata map to the output
Metadata map. This ensures the minimal-policy fail-closed posture where any new
metadata keys are excluded by default until explicitly approved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 75f87e60-74e9-4c27-bc4d-79f784f2d2c8

📥 Commits

Reviewing files that changed from the base of the PR and between a1b6e20 and 4b85481.

📒 Files selected for processing (19)
  • docs/design/007-recipe-evidence.md
  • docs/user/artifact-verification.md
  • docs/user/cli-reference.md
  • docs/user/validation.md
  • pkg/cli/consts.go
  • pkg/cli/validate.go
  • pkg/cli/validate_evidence.go
  • pkg/cli/validate_evidence_test.go
  • pkg/evidence/attestation/builder.go
  • pkg/evidence/attestation/emit.go
  • pkg/evidence/attestation/emit_test.go
  • pkg/evidence/attestation/predicate.go
  • pkg/evidence/attestation/predicate_test.go
  • pkg/evidence/attestation/types.go
  • pkg/evidence/redact/redact.go
  • pkg/evidence/redact/redact_test.go
  • pkg/evidence/verifier/report.go
  • pkg/evidence/verifier/report_redaction_test.go
  • pkg/evidence/verifier/verify_minimal_test.go

Comment thread pkg/evidence/redact/redact.go Outdated
Header metadata redaction dropped only source-node and forwarded every
other key, a fail-open posture inconsistent with the allowlist policy
used elsewhere. Restrict to a {timestamp, version} allowlist so future
metadata keys are excluded by default. Addresses CodeRabbit review.
@mchmarny mchmarny enabled auto-merge (squash) June 23, 2026 12:29
@mchmarny mchmarny disabled auto-merge June 23, 2026 12:36

@mchmarny mchmarny left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The core insight — that the trust signal is digest-based, so the snapshot/CTRF backing content can be minimized without weakening the binding, and the fingerprint must therefore be derived from the raw snapshot rather than the shipped bytes — is exactly right, and the implementation matches it cleanly. --full byte-identity, the omitempty redaction block, and self-verification of minimal bundles are all handled well.

One medium, non-blocking note inline: the "fail-closed" guarantee is only partial — the unconstrained keep()-all subtypes are fail-open at the key level, which the package doc overclaims. Worth reconciling doc and implementation, but nothing leaks today and it doesn't gate this merge.

Nice work — the ADR-007 extension and the docs are thorough.

// when the allowlist changes.
var snapshotAllowlist = map[measurement.Type]map[string]subtypePolicy{
measurement.TypeK8s: {
"server": keep(), // version etc. — not sensitive

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium, non-blocking — fail-closed claim is only partial. The package doc (lines 27-31) says "anything a future collector adds is dropped until explicitly allowlisted," but that holds only at type and subtype granularity, plus key granularity for key-constrained subtypes like node. The keep()-all subtypes here (server, hardware, release, summary) are fail-open at the key level: a sensitive key a future collector attaches to one of these ships by default.

Nothing leaks today — verified these four currently carry only version/model/count/distro-identity keys. But the asymmetry with node (which you correctly key-constrained because it has sensitive keys) is exactly the trap. Two options: key-constrain these four as well, or tighten the doc and add an invariant note that server/hardware/release/summary must never accrue node-identifying keys (and bump PolicyVersion if they do). Either way the stated guarantee and the implementation should match.

@mchmarny mchmarny enabled auto-merge (squash) June 23, 2026 12:54
@mchmarny mchmarny merged commit 03b14d9 into NVIDIA:main Jun 23, 2026
33 checks passed
@njhensley njhensley deleted the feat/minimal-evidence branch June 23, 2026 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cli area/docs size/XL theme/supply-chain SLSA, SBOM, Sigstore, and provenance verification

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Minimize sensitive content in evidence bundles while preserving the verifiable signal

2 participants