feat(recipe): context-aware DataProvider interface#1121
Merged
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
Threads context.Context into DataProvider.ReadFile / WalkDir so callers
operating against a slow or hung backing store (e.g., a stalled
NFS/sshfs mount on --data) can cancel the operation at I/O boundaries.
Cancellation is honored before each file open and between WalkDir
entries — not mid-syscall on an in-flight read; LayeredDataProvider
reads external files through os.Open + io.ReadAll, which are not
cancelable once started. A hung mount blocked mid-readv sees the
cancellation honored on the next file the walk touches, not the one
currently blocked. Embedded reads cannot hang but honor cancellation
between entries for symmetry.
Cascades ctx through the catalog and validator entry points called out
in the issue acceptance criteria:
- DataProvider.ReadFile(ctx, path) / WalkDir(ctx, root, fn)
- EmbeddedDataProvider, LayeredDataProvider, in-memory test providers
- catalog.LoadWithDataProvider(ctx, dp, version, commit)
- validator.ValidatePhases / ValidatePhase propagate their ctx
- recipe.HydrateResultWithContext, GetValuesForComponentWithContext,
GetManifestContentWithContext (existing methods kept as bounded
back-compat wrappers via defaults.FileReadTimeout)
Pre-existing public APIs that don't yet take ctx gain a *WithContext
variant; the no-ctx form derives a bounded background ctx so a hung
mount still returns instead of hanging the goroutine.
Adds TestDataProvider_HonorsContextCancellation acceptance test
verifying ReadFile/WalkDir return context.Canceled when the caller
cancels mid-read on both embedded and layered providers.
Review feedback addressed:
- validator.ValidatePhases / ValidatePhase now use
errors.PropagateOrWrap (preserves catalog.LoadWithDataProvider's
structured error code) instead of errors.Wrap.
- DataProvider interface godoc and pkg/recipe/components.go
loadComponentRegistryFor godoc clarified to state the real
cancellation scope (I/O boundaries, not mid-syscall).
- GetComponentRegistryFor now evicts transient-error cache entries
(context.Canceled / DeadlineExceeded) so a one-time first-load
cancellation isn't cached for the provider's lifetime — mirrors
LoadMetadataStoreFor's existing behavior.
- getMergedRegistry / getMergedCatalog sync.Once caching invariant
documented (latent issue: a long-lived LayeredDataProvider reused
across requests with cancelable contexts could poison the cache).
- docs/contributor/data.md DataProvider interface code block updated
to the new ctx-aware signature.
Closes #1109.
252aacd to
8961d0d
Compare
Contributor
Contributor
Coverage Report ✅
Coverage BadgeMerging this branch will decrease overall coverage
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. |
12 tasks
25 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
context.ContextthroughDataProvider.ReadFile/WalkDirso callers operating against slow/hung backing stores (stalled NFS/sshfs--datamounts) can cancel mid-read.catalog.LoadWithDataProviderand thevalidator.ValidatePhases/ValidatePhaseentry points called out in the issue acceptance criteria.catalog.Load→catalog.LoadWithDataProvider).Motivation / Context
Fixes #1109. The pre-existing exposure: a hung NFS / sshfs read on a
--datadirectory would block the validator goroutine until process termination, since the surroundingValidatePhases(ctx)cancellation couldn't interrupt the in-flightReadFile. Embedded reads cannot hang; the exposure is narrow but real for network-backed--datamounts.Type of Change
Components Affected
pkg/recipe)pkg/validator)pkg/cli)pkg/api)pkg/bundler)Implementation Notes
Interface change.
DataProvider.ReadFile(ctx, path)andWalkDir(ctx, root, fn)now takecontext.Context. Embedded checksctx.Err()before reading and between entries; Layered forwards ctx to embedded reads and insidemergeEmbeddedAndExternal. The boundedWalkDir(ctx)guard inLayeredDataProvidershort-circuits on a stalled network mount.Catalog entry point.
catalog.LoadWithDataProvider(ctx, dp, version, commit)is the new canonical entry;catalog.Load(...)becomes a back-compat wrapper that derives adefaults.FileReadTimeout-boundedcontext.Background().Validator propagation.
validator.ValidatePhasesandValidatePhaseforward their incoming ctx directly tocatalog.LoadWithDataProviderso the surrounding--timeoutreaches the catalog read.Public-API back-compat. Existing helpers without ctx in their signature (
GetManifestContent,RecipeResult.GetValuesForComponent,HydrateResult) gain*WithContextvariants; the no-ctx form derives a boundedcontext.Background()fromdefaults.FileReadTimeoutinternally so a hung mount still returns instead of parking the goroutine. Internal cascades that pass throughGetComponentRegistryFor(sync.Once cached, first-load bounded internally) carry//nolint:contextcheckwith a rationale comment.Cancellation test.
TestDataProvider_HonorsContextCancellationasserts both implementations returncontext.CanceledfromReadFileandWalkDirwhen the caller cancels mid-read.Drift cleanup (folded in). The user flagged three doc/code references left over from #1118:
docs/contributor/data.mdstill described the criteria registry as process-global (it's per-provider now);pkg/recipe/doc.gostill said the CLI/API server usesSetDataProvider(they useaicr.ClientwithWithRecipeSource);docs/contributor/validator.mdreferenced the oldcatalog.Loadentry point.Testing
go test -race ./...— all packages pass (only pre-existingpkg/trustTUF/Sigstore network test fails in this sandbox; confirmed identical failure onmainviagit stash).golangci-lint run -c .golangci.yaml ./...— 0 issues.make qualify— passes except the same pre-existingpkg/trustnetwork test.Coverage (changed packages)
pkg/recipepkg/validator/catalogpkg/validatorpkg/clipkg/apipkg/bundlerpkg/mirrorpkg/client/v1New test (
TestDataProvider_HonorsContextCancellation) covers the four new code paths added in this PR (Embedded.ReadFile + WalkDir ctx checks, Layered.ReadFile + WalkDir ctx checks).Risk Assessment
Low.
GetManifestContent,Load,HydrateResult, etc.) keep the no-ctx public shape working unchanged for external consumers.defaults.FileReadTimeout) ensures the no-ctx call sites still get a timeout rather than the unbounded blocking that existed before.Checklist
make test)make lint)make qualify)Co-Authored-Bylines-S)Closes #1109.