fix(evidence): fix a regression in cncf ai conformance evidence collection#889
Conversation
📝 WalkthroughWalkthroughThis PR changes Collector.resolveFeatures to expand empty inputs and any feature that resolves to featureAll into a cloned ValidFeatures slice using slices.Clone. The change replaces returning a literal featureAll token with returning the concrete per-feature list. Two unit tests were added to verify expansion behavior and that the returned slice is an independent clone. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
b711059 to
752f33c
Compare
`aicr validate --cncf-submission` (with or without explicit `--feature all`) was reliably SIGKILL'd after 5 minutes on real clusters. Cause: resolveFeatures() collapsed empty input and the `all` wildcard into a single ["all"] section, so the Go loop fired one shell-out (`bash collect-evidence.sh all`) bounded by EvidenceSectionTimeout (5 min) — far too short for the bash script's internal nine-feature loop, which has historically needed ~10–15 min. The collapse predated EvidenceSectionTimeout; before PR NVIDIA#721 introduced the per-section deadline, the only ceiling was the 20-minute CNCFSubmissionTimeout, which `all` happily fit under. The hardening shortened the budget without recognizing one of the "sections" was a nine-feature aggregate. Fix: expand `[]` and the `all` token into the full ValidFeatures list inside resolveFeatures(), so each feature runs as its own runSection invocation with its own 5-minute budget. Total wall time stays bounded by the existing 20-minute CNCFSubmissionTimeout. The bash script's `case all)` block is preserved for direct invocation (e.g., `bash collect-evidence.sh all`) — only the Go path stops using it. Regression coverage: - TestResolveFeaturesExpandsAll asserts that nil, [], ["all"], and ["dra","all","gang"] all expand to ValidFeatures; explicit subsets pass through unchanged; alias resolution is preserved. - TestResolveFeaturesReturnsIndependentSlice asserts the result is a defensive copy so callers can't mutate ValidFeatures by accident.
752f33c to
b5e3897
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/evidence/cncf/collector_test.go`:
- Around line 484-488: The test mutates the returned slice and currently risks
leaving ValidFeatures modified; before mutating (around the test that assigns
got[0] = "tampered" and compares ValidFeatures[0] to original) capture the
original value(s) of ValidFeatures (e.g., original := ValidFeatures[0]) and
register a t.Cleanup closure to restore them (reset ValidFeatures[0] = original)
so global state is restored after the test; ensure the cleanup runs even if the
assertion fails and reference the test helpers/variables resolveFeatures(),
ValidFeatures, got, and t.Cleanup when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: c601920a-e2c2-4eff-8b20-82fdc588beb8
📒 Files selected for processing (2)
pkg/evidence/cncf/collector.gopkg/evidence/cncf/collector_test.go
| original := ValidFeatures[0] | ||
| got[0] = "tampered" | ||
| if ValidFeatures[0] != original { | ||
| t.Errorf("ValidFeatures[0] = %q after mutating resolveFeatures() output; want %q — slice must be cloned", ValidFeatures[0], original) | ||
| } |
There was a problem hiding this comment.
Add cleanup to restore global state in mutation test.
Line 485 intentionally mutates the returned slice; if cloning regresses, this can leak into ValidFeatures and cascade into unrelated test failures. Restore the original value in t.Cleanup.
Proposed fix
func TestResolveFeaturesReturnsIndependentSlice(t *testing.T) {
c := NewCollector("/tmp")
got := c.resolveFeatures()
if len(got) == 0 {
t.Fatal("resolveFeatures() returned empty slice for default input")
}
original := ValidFeatures[0]
+ t.Cleanup(func() {
+ ValidFeatures[0] = original
+ })
got[0] = "tampered"
if ValidFeatures[0] != original {
t.Errorf("ValidFeatures[0] = %q after mutating resolveFeatures() output; want %q — slice must be cloned", ValidFeatures[0], original)
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| original := ValidFeatures[0] | |
| got[0] = "tampered" | |
| if ValidFeatures[0] != original { | |
| t.Errorf("ValidFeatures[0] = %q after mutating resolveFeatures() output; want %q — slice must be cloned", ValidFeatures[0], original) | |
| } | |
| original := ValidFeatures[0] | |
| t.Cleanup(func() { | |
| ValidFeatures[0] = original | |
| }) | |
| got[0] = "tampered" | |
| if ValidFeatures[0] != original { | |
| t.Errorf("ValidFeatures[0] = %q after mutating resolveFeatures() output; want %q — slice must be cloned", ValidFeatures[0], original) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/evidence/cncf/collector_test.go` around lines 484 - 488, The test mutates
the returned slice and currently risks leaving ValidFeatures modified; before
mutating (around the test that assigns got[0] = "tampered" and compares
ValidFeatures[0] to original) capture the original value(s) of ValidFeatures
(e.g., original := ValidFeatures[0]) and register a t.Cleanup closure to restore
them (reset ValidFeatures[0] = original) so global state is restored after the
test; ensure the cleanup runs even if the assertion fails and reference the test
helpers/variables resolveFeatures(), ValidFeatures, got, and t.Cleanup when
making the change.
Summary
Fix a regression introduced by PR #721.
aicr validate --cncf-submission(with empty--featureor explicit--feature all) was SIGKILL'd after 5 minutes. Expand theallwildcard Go-side into the fullValidFeatureslist so each feature gets its ownEvidenceSectionTimeoutbudget.Motivation / Context
Reproduced on a live EKS cluster:
--cncf-submissionwith no--featureflag (documented as "default: all") hangs and is SIGKILL'd after ~5 minutes withsignal: killed exitCode=5. Five of the nine markdown evidence files are written before the kill; the remaining four are lost. The same happens with explicit--feature all.Why now: PR #721 ("fix: address top-7 code-review findings across packages", merged 2026-04-30) added
EvidenceSectionTimeout = 5 * time.Minuteand wrapped eachrunSectioncall withcontext.WithTimeout(ctx, ...). The intent — bound runaway shell processes — is sound. Butpkg/evidence/cncf/collector.go::resolveFeatures()predates that change and collapses both empty input and theallwildcard to a single["all"]section, so the Go loop fires exactly one shell-out (bash collect-evidence.sh all) bound by that 5-minute ceiling. The bash script's internalcase all)block then has to run all nine features inside that window — far too short on real clusters (the per-feature budget at the script level used to be the 20-minuteCNCFSubmissionTimeout).Fixes: N/A
Related: regression introduced by #721
Type of Change
Component(s) Affected
pkg/validator) — specificallypkg/evidence/cncfImplementation Notes
resolveFeatures()now expands[]and thealltoken into a clone ofValidFeatures. The Go loop inCollector.Run()then iterates nine times, callingrunSection(<feature>)once per feature, with each invocation under its ownEvidenceSectionTimeout(5 min). Total wall time remains bounded by the outerCNCFSubmissionTimeout(20 min).case all)block is preserved for direct invocation (bash collect-evidence.sh all). Only the Go path stops routing through it.slices.Clone(ValidFeatures)so callers can't mutate the package-level slice via the returned value.featureAllconstant stays — still needed as the input-recognition token for--feature all.Testing
make qualify— green.Manual reproduction (before): on cluster
aicr2,aicr validate --phase conformance --cncf-submission --evidence-dir ...exits non-zero withevidence collection command failed: signal: killed exitCode=5at ~5 min, 5/9 evidence files written. With explicit per-feature flags as a workaround: completes in ~10 min, 9/9 files written.New regression coverage:
TestResolveFeaturesExpandsAll— nil,[],["all"], and["dra","all","gang"]all expand toValidFeatures; explicit subsets pass through unchanged; aliases resolve.TestResolveFeaturesReturnsIndependentSlice— guards the defensiveslices.Cloneso mutating the returned slice can't corrupt the package-levelValidFeatures.Existing
pkg/evidence/cncftests still pass; no per-package coverage decrease.Risk Assessment
--cncf-submissionis a separate code path that returns beforerunValidation.Rollout notes: No CLI flags change. Users who were working around this bug by enumerating
--featureflags by hand can drop those flags after this lands.Checklist
make testwith-race)make qualifypassesCo-Authored-Bylines-S)