feat(recipe): shared coordinate mapping + taxonomy spec (TG6)#1409
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Enterprise Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughA new ADR ( Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/design/012-recipe-coordinate-mapping.md`:
- Around line 125-127: The fenced code block containing the path template
`<group>/<dashboard>/<tab>` is missing a language identifier on the opening
fence marks, which causes markdownlint MD040 to trigger. Add a language
identifier such as `text` to the opening backticks of the code block to resolve
the linting error and keep the documentation lint-clean.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: fc6dd4aa-9a89-4702-9ff1-9f3aac9dcf58
📒 Files selected for processing (3)
docs/design/012-recipe-coordinate-mapping.mdpkg/recipe/coordinate.gopkg/recipe/coordinate_test.go
Coverage Report ✅
Coverage BadgeMerging this branch will increase overall coverage
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. |
fc95522 to
54c4447
Compare
njhensley
left a comment
There was a problem hiding this comment.
Multi-persona review + senior-engineer meta-review
Reviewed by four independent personas (correctness, API-contract, docs, test), then adjudicated by a senior-engineer meta-review that calibrated severity to this PR's additive, contract-publishing scope (no consumers wired yet).
TL;DR — no blockers. Clean, pure, well-tested function. The personas converged on one real defect: the coordinate string is provably not round-trippable (rtx-pro-6000 and --data values collide with the -// delimiters). But since nothing parses Path() yet, that is a High-priority design issue to resolve with the first parsing consumer (RQ1/TG4a) — not a merge blocker. Deep-links and the presence endpoint only need the string stable and equal, not decomposable.
Recommendation: merge-with-changes (minimal) + tracked follow-ups.
- Fold into this PR (cheap, contract-strengthening): the
/charset guard, the dead-filter test cleanup, the## Statusline, and rewording the 7-key schema as "specified here, enforced by TG2." - Block on the first consumer issue (cross-linked from ADR-012): the round-trippability fix /
ParseCoordinate/ delimiter invariant — so no string-splitting consumer can land before parseability is settled.
Inline comments below, tiered High → Nitpick. One persona finding was dismissed: "the Coordinate struct omits the metadata.name join key" — the ADR deliberately keeps name (identity) off Coordinate (placement) since the caller holds both; that's a correct design choice.
| // host/navigation scheme built around this path is owned by the | ||
| // consumer (GP5/TG4a); this package only emits the canonical segments. | ||
| func (co Coordinate) Path() string { | ||
| return co.Group + "/" + co.Dashboard + "/" + co.Tab |
There was a problem hiding this comment.
[High] Path() is not round-trippable, but ADR-012 markets it as the parseable "stable URL form."
Dashboard = accelerator + "-" + os joins two free-form fields with -, and CriteriaAcceleratorRTXPro6000 = "rtx-pro-6000" (criteria.go:121) already contains hyphens. So eks/rtx-pro-6000-ubuntu/training has no positional split point — a consumer cannot recover (rtx-pro-6000, ubuntu) vs (rtx, pro-6000-ubuntu). ADR §"Stable URL form" says RQ1 deep-links to and TG4a's presence endpoint checks this string.
Not a blocker for this PR — nothing parses Path() yet; deep-links/presence checks only need it stable+equal, not decomposable. Recommend: track on the first parsing consumer (RQ1 #1283 / TG4a #1284) — ship an explicit ParseCoordinate, or keep consumers on the struct fields rather than string-splitting. Add an ADR note that Path() is a stable opaque identity, not a decomposable key, until then.
There was a problem hiding this comment.
Documented (820eef3). Added an ADR note under "Stable URL form": `Path()` is a stable opaque identity, not a decomposable key — `rtx-pro-6000` already carries the join char, so consumers compare/deep-link the string but address dimensions via the `Coordinate` struct fields, never by splitting. An explicit `ParseCoordinate` (with a "values must not contain the active delimiter" invariant) is deferred to the first parsing consumer (RQ1 #1283 / TG4a #1284).
| // non-wildcard) dimension, or an ErrCodeInvalidRequest naming dim when it | ||
| // is empty or the "any" wildcard. | ||
| func requireConcrete(dim, value string) (string, error) { | ||
| if value == "" || value == CriteriaAnyValue { |
There was a problem hiding this comment.
[High] No charset guard → a / in a value silently breaks the segment count.
requireConcrete rejects only empty/"any". Registry values admitted via --data are merely lowercased/trimmed (no charset validation), so a service value like acme/ncp yields acme/ncp/h100-ubuntu/training — a 4-segment path that any positional / split mis-parses. That is exactly the "ambiguous placement / silent fusion" the ADR claims to prevent.
This one is cheap and self-contained — reasonable to fold into this PR: reject values containing / (and arguably - in the joined fields), failing closed with ErrCodeInvalidRequest naming the dimension, plus a negative test.
[Low, same spot] value == CriteriaAnyValue compares to the literal "any" without trim/lowercase, so "ANY"/" any " from a hand-built (unnormalized) Criteria would pass as concrete. Only reachable if a caller bypasses the resolve path — either normalize here or document that CoordinateFor assumes resolved input.
There was a problem hiding this comment.
Folded in (820eef3). `requireConcrete` now routes through a new `rejectPathSeparator`, so service/accelerator/os/intent and platform fail closed with `ErrCodeInvalidRequest` naming the dimension when a value contains `/`. Added negative cases (`service contains slash`, `platform contains slash`). On the [Low]: left as-is — the function documents its input as already-resolved `Criteria`, and normalizing `"any"`/case here would duplicate the resolve path; the resolved-input assumption is stated in the doc comment.
| func TestCoordinateForRegistryUnionConcrete(t *testing.T) { | ||
| r := NewCriteriaRegistry() | ||
|
|
||
| services := concreteValues(r.AllServiceTypes()) |
There was a problem hiding this comment.
[High] The registry-union test never actually exercises the registry-driven fail-closed path.
concreteValues() strips "any", but NewCriteriaRegistry() is empty and the static Get*Types() lists never contain "any" — so the filter removes a token that is never present, and no wildcard ever reaches CoordinateFor through this test. If concreteValues were removed or a --data registry ever surfaced "any", this test would still pass.
Fail-closed is covered by the literal cases in TestCoordinateFor (nil + each dimension any), so this is dead-code/clarity rather than a true coverage hole. Recommend: either feed an explicit "any"/empty into the loop and assert it fails closed, or drop the dead filter with a comment pointing at the literal cases that own that coverage.
There was a problem hiding this comment.
Fixed (820eef3). Added an explicit assertion that a wildcard service fails closed through the registry-driven path, plus a comment noting `concreteValues` is defensive against a `--data`-seeded registry surfacing `"any"` and that the literal per-dimension cases in `TestCoordinateFor` own the rest of that coverage.
| } | ||
|
|
||
| tab := intent | ||
| if platform := string(c.Platform); platform != "" && platform != CriteriaAnyValue { |
There was a problem hiding this comment.
[Medium] tab = intent-platform round-trips only by the unenforced accident that intents are hyphen-free.
Splitting training-kubeflow back to (training, kubeflow) works only because today's intents (training, inference) contain no -. A future/--data intent like fine-tuning would make fine-tuning-kubeflow un-splittable. Same delimiter family as the Path() issue above — the structural-parsing fix covers it. Recommend: add an explicit ADR invariant: "dimension values must not contain the active delimiter."
There was a problem hiding this comment.
Covered by the same opaque-identity note (820eef3): the ADR now states values may contain `-` (`rtx-pro-6000`, a future `fine-tuning` intent), so the tab is not `-`-decomposable, and the delimiter invariant is deferred to `ParseCoordinate`. Only `/` is enforced now, since it is the one char that changes the segment count.
| the recipe. | ||
| - `signer_identity` + `signer_issuer` together key the **latest-per-signer** | ||
| default scope referenced in the k8s-in-column countermeasure. | ||
| - The key strings and their count (7) are the contract: TG2 emits exactly these; |
There was a problem hiding this comment.
[Medium] The 7-key started.json schema is declared as a binding contract but is prose-only — nothing enforces it.
"TG2 emits exactly these; TG3 reads exactly these" reads as a frozen ABI, yet no Go constant/struct/test pins the key names, order, or count. A typo or reorder on either side drifts silently — the same anti-drift failure the ADR's Problem section warns against, now in the half of the contract that isn't code. Deferring the emitter to TG2 (#1267) is fine, but reword to: "specified here, enforced by TG2," rather than implying it is pinned in code.
| @@ -0,0 +1,232 @@ | |||
| # ADR-012: Recipe → Board-Coordinate Mapping | |||
There was a problem hiding this comment.
[Low] Missing ## Status section. 7 of 8 prior ADRs (002–005, 007–009) open with a ## Status. Add ## Status: Accepted (or Proposed) so consumers importing this contract know whether the mapping is ratified.
| name: "eks h100 ubuntu training kubeflow", | ||
| c: &Criteria{Service: CriteriaServiceEKS, Accelerator: CriteriaAcceleratorH100, | ||
| OS: CriteriaOSUbuntu, Intent: CriteriaIntentTraining, Platform: CriteriaPlatformKubeflow}, | ||
| want: "eks/h100-ubuntu/training-kubeflow", |
There was a problem hiding this comment.
[Low] No format-lock golden test independent of the fixtures. The canonical string is only asserted via the table's want fields; a refactor that changes the delimiter in both Path() and these want strings stays green while breaking the five downstream consumers that hardcode <group>/<dashboard>/<tab>. Consider one standalone Path() == "eks/h100-ubuntu/training-kubeflow" assertion explicitly framed as "this exact string must never change."
There was a problem hiding this comment.
Added `TestCoordinatePathFormatLock` (820eef3) — a standalone assertion that `Path()` equals `eks/h100-ubuntu/training-kubeflow`, framed as a frozen contract string that must fail even if the table want-strings change in lockstep.
| t.Fatalf("err = %v, wantErr %v", err, tt.wantErr) | ||
| } | ||
| if tt.wantErr { | ||
| if !stderrors.Is(err, errors.New(errors.ErrCodeInvalidRequest, "")) { |
There was a problem hiding this comment.
[Low] Negative cases assert the error code but not that the message names the offending dimension — which both the code comment and ADR explicitly promise ("naming the offending dimension"). A refactor dropping the dim name from the message (e.g. a generic "dimension must be concrete") still passes. Add a substring assertion on the dimension name in one negative case.
There was a problem hiding this comment.
Added `TestCoordinateForErrorNamesDimension` (820eef3) — asserts the failure message contains `accelerator` for an `any`-accelerator input.
| - The API and UI that render the board (GP5). | ||
| - The live board / hosting / navigation host. | ||
| - The user-facing `docs/user/` TestGrid page. It is **deferred**: it has a soft | ||
| dependency on TG4a's served URL form, so it is not authored here. **#1272 stays |
There was a problem hiding this comment.
[Nitpick] Durable ADR embeds transient issue-lifecycle state ("#1272 stays open until that user-facing page lands"). Once #1272 closes, the ADR still asserts it's open. Keep the design record timeless and let the issue tracker own lifecycle — consistent with the repo's "don't reference transient state in durable artifacts" guidance.
Add pkg/recipe.CoordinateFor — the canonical, pure recipe→board-coordinate mapping (group=service, dashboard=accelerator-os, tab=intent[-platform]) that GP4/GP5/RQ1 and the TestGrid track import instead of re-deriving. Required dimensions fail closed on empty, "any", or a "/" path separator; platform is the only optional segment. Includes golden, format-lock, and CriteriaRegistry-union tests, plus ADR-012 documenting the taxonomy, opaque-identity URL form, pinned column-metadata schema, and ADR-009/#1224 reconciliation.
54c4447 to
820eef3
Compare
Re-review — feedback addressed ✅Re-reviewed at
One optional item intentionally left (Low, from the High-2 comment): LGTM — the cheap contract-strengthening fixes landed in this PR and the round-trippability work is correctly deferred and cross-linked to the first parsing consumer. No remaining blockers. |
PR #1409 (feat(recipe): shared coordinate mapping) is now merged. Replace the local coordinate.go implementation with a thin wrapper that delegates to recipe.CoordinateFor(*recipe.Criteria). RecipeCriteria is retained for the testgrid-specific K8sVersion and K8sConstraint fields which have no equivalent in recipe.Criteria. toRecipeCriteria() converts between the two for the coordinate call. The /"-in-dimension rejection guard added by pkg/recipe is now applied automatically; existing tests are unchanged since all test values are slash-free.
PR #1409 (feat(recipe): shared coordinate mapping) is now merged. Replace the local coordinate.go implementation with a thin wrapper that delegates to recipe.CoordinateFor(*recipe.Criteria). RecipeCriteria is retained for the testgrid-specific K8sVersion and K8sConstraint fields which have no equivalent in recipe.Criteria. toRecipeCriteria() converts between the two for the coordinate call. The /"-in-dimension rejection guard added by pkg/recipe is now applied automatically; existing tests are unchanged since all test values are slash-free.
PR #1409 (feat(recipe): shared coordinate mapping) is now merged. Replace the local coordinate.go implementation with a thin wrapper that delegates to recipe.CoordinateFor(*recipe.Criteria). RecipeCriteria is retained for the testgrid-specific K8sVersion and K8sConstraint fields which have no equivalent in recipe.Criteria. toRecipeCriteria() converts between the two for the coordinate call. The /"-in-dimension rejection guard added by pkg/recipe is now applied automatically; existing tests are unchanged since all test values are slash-free.
Summary
Adds
pkg/recipe.CoordinateFor— the canonical, pure recipe→board-coordinate mapping (group=service,dashboard=accelerator-os,tab=intent[-platform]) — plus its golden test and thedocs/design/012taxonomy spec that pins the mapping, stable URL form, column-metadata key schema, and the ADR-009 / #1224 reconciliation.Motivation / Context
This is TG6 — the Wave 0 linchpin of the evidence-dashboard tracks. Per the epic plan, one shared mapping function gates four workstreams: GP4 (#1404), GP5 (#1405), RQ1 (#1283), and the entire TestGrid track (#1266–#1273) all import
CoordinateForrather than re-deriving the coordinate. Landing it first lets those consumers build against a fixed contract instead of a stub.Scope is deliberately the function + spec. The user-facing
docs/user/TestGrid page is deferred (soft dependency on TG4a's served URL form, which doesn't exist yet), so #1272 stays open after this merges.Fixes: N/A
Related: #1272 (partial — docs/user page deferred), epic #1263, unblocks #1404 / #1405 / #1283
Type of Change
Component(s) Affected
pkg/recipe)docs/,examples/)Implementation Notes
CoordinateForis pure: no clock, no maps, no registry, no I/O — sameCriteriaalways yields the sameCoordinate. It consumes already-resolvedCriteriaand never parsesmetadata.name(the recipe name is accelerator-first; the coordinate is service-first — they stay independently correct).service/accelerator/os/intentare required; a nilCriteria, empty value, or the"any"wildcard returnsErrCodeInvalidRequestnaming the offending dimension. Platform is the only optional sub-segment — dropped when unset, neverunknown-substituted.CriteriaRegistryunion (not a closed enum), so a new const or--data-contributed value is covered without editing the table.pkg/health+docs/user/recipe-health.mdhave shipped (structural matrix), with the Evidence column a literalpendingplaceholder today — the coordinate deep-link is what RQ1 will wire.Testing
Coverage:
pkg/recipe/coordinate.go—CoordinateFor/Path/String/requireConcreteall 100%;pkg/recipepackage total ~86.8% (well above the 75% floor).Risk Assessment
Rollout notes: N/A — no consumers wired yet; this PR only publishes the contract.
Checklist
make testwith-race)make lint)git commit -S)