Part of #983.
Summary
recipe.SetDataProvider mutates a process-global, and both loadMetadataStore (pkg/recipe/metadata_store.go) and GetComponentRegistry (pkg/recipe/components.go) cache their loaded state via sync.Once against whatever provider was set when the cache first populated.
Two Builders in the same process — say, an in-process operator handling multiple recipe sources, or a test harness exercising two layered providers — silently share whichever cache populated first, regardless of which provider was last SetDataProvider'd. The store and registry returned to the second Builder are not the ones from its own source.
Add a WithDataProvider(dp) builder option and per-DataProvider caches so each Builder resolves against its own source without interference.
Motivation
The current shape works for AICR's shipping consumers (CLI, API server) because each is a single-tenant process with one Builder. It does not work — and cannot be made to work safely — for any in-process consumer that needs to maintain more than one recipe source over its lifetime. The hazard is dormant only because no in-tree caller triggers it.
The underlying pattern is documented as an anti-pattern in .claude/CLAUDE.md:
sync.Once caching state that depends on a settable global (e.g., a registry tied to a DataProvider): Key the cache by a generation counter the setter increments; recompute on miss so late-bound configuration takes effect.
The approach below does one better: key the cache by DataProvider identity, so concurrent in-process consumers with different providers each get the correct cached state with no eviction or recomputation needed.
Proposed change
1. WithDataProvider builder option
In pkg/recipe/builder.go:
// WithDataProvider binds the Builder to a specific DataProvider,
// isolating its metadata store and component registry from the
// process-global ones at GetDataProvider().
//
// Use this from any caller that constructs more than one Builder
// per process. When unset, the Builder falls back to the
// package-global DataProvider — preserving the CLI and API
// server behavior.
func WithDataProvider(dp DataProvider) Option
The Builder gains a private dp DataProvider field. When set, all internal calls that previously short-circuited through GetDataProvider() route through b.dp instead.
2. Per-provider metadata store cache
Replace the sync.Once-backed package globals in pkg/recipe/metadata_store.go with a sync.Map[DataProvider]*entry keyed by provider:
var storeCache sync.Map // map[DataProvider]*storeCacheEntry
type storeCacheEntry struct {
once sync.Once
store *MetadataStore
err error
}
loadMetadataStore takes a dp DataProvider parameter. When nil, falls back to GetDataProvider() so the existing call sites (Builder without WithDataProvider) continue to hit a single cache entry.
3. Per-provider component registry cache
Same treatment for pkg/recipe/components.go. The existing GetComponentRegistry() becomes a thin shim that calls a new getComponentRegistryFor(dp) against the global provider. A new public entry point lets callers ask for the registry of a specific provider.
4. Eviction helpers
Long-running in-process consumers that retire providers over time (e.g., when a configuration source is deleted) need a way to drop the corresponding cache entries — otherwise memory grows monotonically with the number of unique providers ever observed.
// EvictCachedStore drops the cached MetadataStore for the supplied
// DataProvider. The next loadMetadataStore call for that provider
// will rebuild from scratch. No-op if no entry exists.
func EvictCachedStore(dp DataProvider)
// EvictCachedRegistry drops the cached ComponentRegistry for the
// supplied DataProvider. No-op if no entry exists.
func EvictCachedRegistry(dp DataProvider)
Both must be safe on nil providers (return without panicking).
5. Store carries its provider
MetadataStore should carry the DataProvider it was loaded from so any store-internal lookup (e.g., resolving a component registry default) uses the same provider, not the package-global. This closes a subtle drift window where the store and registry could disagree if SetDataProvider is called between their loads.
Tests
- Add tests in
pkg/recipe/builder_test.go that construct two Builders via NewBuilder(WithDataProvider(dpA)) and NewBuilder(WithDataProvider(dpB)) against fixtures with materially different component sets, then resolve the same criteria on both concurrently (with -race). Assert each result reflects its own provider's components.
- Eviction tests: load a store, evict, load again, assert a fresh read (not a cache hit). Confirm nil-provider eviction is a no-op.
- Regression tests for the existing single-provider path: every existing recipe test must pass unchanged, demonstrating back-compat for CLI / API server.
Acceptance
recipe.WithDataProvider, recipe.EvictCachedStore, recipe.EvictCachedRegistry, and (if added) recipe.GetComponentRegistryFor are public, documented, and tested.
make test -race ./pkg/recipe/... clean, including the new multi-provider concurrency tests.
- CLI / API server behavior unchanged (verify via existing integration tests in
tests/chainsaw/ and the KWOK matrix).
- Doc update in
docs/contributor/data.md covering the new option and when to use it.
Out of scope
- Deprecating
SetDataProvider / GetDataProvider. Covered by a separate child issue, gated on this one.
- Migrating CLI / API server call sites to use
WithDataProvider. Optional follow-up.
- Pure performance work — the cache shape is per-provider but no benchmarking required as part of this change.
Dependencies
- Soft dependency: the deep-clone fix (sibling child issue) should ideally land first. Per-provider caches widen the blast radius of the existing aliasing bug — any test that hits two providers concurrently and mutates a returned
Validation field could expose it. Order matters more for clean review than for correctness; both can be in flight simultaneously.
- Hard dependency for the SetDataProvider deprecation (sibling child issue, depends on this one).
Part of #983.
Summary
recipe.SetDataProvidermutates a process-global, and bothloadMetadataStore(pkg/recipe/metadata_store.go) andGetComponentRegistry(pkg/recipe/components.go) cache their loaded state viasync.Onceagainst whatever provider was set when the cache first populated.Two Builders in the same process — say, an in-process operator handling multiple recipe sources, or a test harness exercising two layered providers — silently share whichever cache populated first, regardless of which provider was last
SetDataProvider'd. The store and registry returned to the second Builder are not the ones from its own source.Add a
WithDataProvider(dp)builder option and per-DataProvidercaches so each Builder resolves against its own source without interference.Motivation
The current shape works for AICR's shipping consumers (CLI, API server) because each is a single-tenant process with one Builder. It does not work — and cannot be made to work safely — for any in-process consumer that needs to maintain more than one recipe source over its lifetime. The hazard is dormant only because no in-tree caller triggers it.
The underlying pattern is documented as an anti-pattern in
.claude/CLAUDE.md:The approach below does one better: key the cache by
DataProvideridentity, so concurrent in-process consumers with different providers each get the correct cached state with no eviction or recomputation needed.Proposed change
1.
WithDataProviderbuilder optionIn
pkg/recipe/builder.go:The Builder gains a private
dp DataProviderfield. When set, all internal calls that previously short-circuited throughGetDataProvider()route throughb.dpinstead.2. Per-provider metadata store cache
Replace the
sync.Once-backed package globals inpkg/recipe/metadata_store.gowith async.Map[DataProvider]*entrykeyed by provider:loadMetadataStoretakes adp DataProviderparameter. When nil, falls back toGetDataProvider()so the existing call sites (Builder withoutWithDataProvider) continue to hit a single cache entry.3. Per-provider component registry cache
Same treatment for
pkg/recipe/components.go. The existingGetComponentRegistry()becomes a thin shim that calls a newgetComponentRegistryFor(dp)against the global provider. A new public entry point lets callers ask for the registry of a specific provider.4. Eviction helpers
Long-running in-process consumers that retire providers over time (e.g., when a configuration source is deleted) need a way to drop the corresponding cache entries — otherwise memory grows monotonically with the number of unique providers ever observed.
Both must be safe on nil providers (return without panicking).
5. Store carries its provider
MetadataStoreshould carry theDataProviderit was loaded from so any store-internal lookup (e.g., resolving a component registry default) uses the same provider, not the package-global. This closes a subtle drift window where the store and registry could disagree ifSetDataProvideris called between their loads.Tests
pkg/recipe/builder_test.gothat construct two Builders viaNewBuilder(WithDataProvider(dpA))andNewBuilder(WithDataProvider(dpB))against fixtures with materially different component sets, then resolve the same criteria on both concurrently (with-race). Assert each result reflects its own provider's components.Acceptance
recipe.WithDataProvider,recipe.EvictCachedStore,recipe.EvictCachedRegistry, and (if added)recipe.GetComponentRegistryForare public, documented, and tested.make test -race ./pkg/recipe/...clean, including the new multi-provider concurrency tests.tests/chainsaw/and the KWOK matrix).docs/contributor/data.mdcovering the new option and when to use it.Out of scope
SetDataProvider/GetDataProvider. Covered by a separate child issue, gated on this one.WithDataProvider. Optional follow-up.Dependencies
Validationfield could expose it. Order matters more for clean review than for correctness; both can be in flight simultaneously.