feat(measurement): NetworkTopology type + Subtype.Items for cross-repo l8k integration#1417
Conversation
📝 WalkthroughWalkthroughThis PR introduces Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/aicr/v1/server.yaml (1)
1323-1335: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Subtypefield name mismatches the runtime wire contract (namevssubtype).
pkg/measurement/types.goserializes this field assubtype, but the OpenAPI schema requires and definesname. This creates client/server contract drift for external producers and generated SDKs.Suggested OpenAPI fix
Subtype: type: object @@ - required: [name] + required: [subtype] properties: - name: + subtype: type: string description: Subtype identifier (e.g., service name, configuration category) example: containerd.service🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/aicr/v1/server.yaml` around lines 1323 - 1335, The Subtype object schema definition has a field name mismatch with the runtime implementation. The OpenAPI schema currently defines the property as `name`, but the actual Go serialization in pkg/measurement/types.go serializes this field as `subtype`. Update the property name in the Subtype object from `name` to `subtype` and adjust the description and example accordingly to ensure the schema matches the actual wire contract used by the runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/integrator/measurement-api.md`:
- Around line 171-172: The documentation contains contradictory semantics:
"First match wins" contradicts the statement that "Multiple matches return
ErrCodeConflict". Replace these two bullet points with explicit deterministic
single-match semantics that clearly state exactly one match is required for
success, multiple matches result in ErrCodeConflict, and no matches result in
ErrCodeNotFound. Remove the "First match wins" language and ensure the
documentation unambiguously describes the required behavior for all three
outcomes (single match success, multiple matches error, and no matches error).
- Around line 191-193: The "See also" section in the measurement-api.md file
contains a broken link reference to `../contributor/data.md` which does not
exist and is causing the docs CI to fail. Locate the link in the "See also"
section (the second bullet point containing the reference to contributor data
model) and update the file path to point to the correct existing file location
in the docs/contributor directory, ensuring the relative path accurately
references the actual data model documentation file.
- Around line 154-157: The fenced code block in the measurement-api.md file is
missing a language identifier, which triggers markdownlint rule MD040. Add the
language tag "text" to the opening fence of the code block that contains the
pattern examples for Type, Subtype, and Key (the block showing legacy form and
item form). Change the opening fence from three backticks to three backticks
followed by "text".
In `@pkg/constraints/extractor_test.go`:
- Line 282: The struct field definition containing "name string" at
line 282 in the extractor_test.go file is not properly formatted according to
gofmt standards. Run gofmt formatting tool on the file to automatically fix the
whitespace and formatting issues. You can use the command `gofmt -w
pkg/constraints/extractor_test.go` to reformat the file, or run the full linter
check with `golangci-lint run -c .golangci.yaml pkg/constraints/` as recommended
in the coding guidelines, which will ensure all formatting issues in the
affected package are resolved before pushing the PR.
In `@pkg/constraints/extractor.go`:
- Line 57: The code has multiple lint violations that need to be fixed. First,
correct the misspelling on line 57 by changing `behaviour` to `behavior` to
match Go conventions. Second, address the goconst violations by extracting the
repeated string literal `"path"` (appearing at line 77 and elsewhere) into a
named constant at the package level. Third, extract the repeated string literal
`"selector"` (appearing at line 185 and elsewhere) into a separate named
constant. Finally, run gofmt on the file to fix the formatting issue flagged on
line 292. After making these changes, verify by running `golangci-lint run -c
.golangci.yaml` on the pkg/constraints package to ensure all lint violations are
resolved.
In `@pkg/measurement/doc.go`:
- Around line 16-38: The file pkg/measurement/doc.go fails gofmt formatting
validation in CI. Run the gofmt formatting tool on this file to correct the
formatting issues, particularly around the documentation block containing the
Core Types section. After running gofmt to automatically format the file, verify
that all syntax and spacing align with Go's standard formatting conventions.
Additionally, as per the coding guidelines, run golangci-lint with the project
configuration before pushing the PR to catch any other formatting or linting
issues in the affected package.
---
Outside diff comments:
In `@api/aicr/v1/server.yaml`:
- Around line 1323-1335: The Subtype object schema definition has a field name
mismatch with the runtime implementation. The OpenAPI schema currently defines
the property as `name`, but the actual Go serialization in
pkg/measurement/types.go serializes this field as `subtype`. Update the property
name in the Subtype object from `name` to `subtype` and adjust the description
and example accordingly to ensure the schema matches the actual wire contract
used by the runtime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 30c1788d-690c-4134-8fe7-2caaf32adf3c
📒 Files selected for processing (10)
api/aicr/v1/server.yamldocs/integrator/index.mddocs/integrator/measurement-api.mdpkg/constraints/extractor.gopkg/constraints/extractor_test.gopkg/measurement/builder.gopkg/measurement/builder_test.gopkg/measurement/doc.gopkg/measurement/types.gopkg/measurement/types_test.go
6888374 to
7d68e93
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
docs/integrator/measurement-api.md (1)
154-157: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick winAdd a language tag to the fenced code block.
This block is missing a language identifier and is triggering markdownlint MD040.
Suggested fix
-``` +```text {Type}.{Subtype}.{Key} # legacy form, looks in Subtype.Data {Type}.{Subtype}[<selector>].{Key} # item form, looks in ItemEntry</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In
@docs/integrator/measurement-api.mdaround lines 154 - 157, In the
measurement-api.md file, locate the fenced code block containing the text about
{Type}.{Subtype}.{Key} and {Type}.{Subtype}[].{Key} around lines
154-157. Add a language identifier "text" after the opening triple backticks
(changetotext) to satisfy the markdownlint MD040 rule that requires
language tags on fenced code blocks.</details> <!-- cr-comment:v1:7bf615118a480a715ddc5c3f --> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.Duplicate comments:
In@docs/integrator/measurement-api.md:
- Around line 154-157: In the measurement-api.md file, locate the fenced code
block containing the text about {Type}.{Subtype}.{Key} and
{Type}.{Subtype}[].{Key} around lines 154-157. Add a language
identifier "text" after the opening triple backticks (changetotext) to
satisfy the markdownlint MD040 rule that requires language tags on fenced code
blocks.</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Path: .coderabbit.yaml **Review profile**: ASSERTIVE **Plan**: Enterprise **Run ID**: `9e90389b-4553-4fe4-ad0d-5fe9c2c85185` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 6888374107af9c1c3782b52a3764eb21b6973561 and 7d68e930b71045666a12306f4caa24529e64577b. </details> <details> <summary>📒 Files selected for processing (10)</summary> * `api/aicr/v1/server.yaml` * `docs/integrator/index.md` * `docs/integrator/measurement-api.md` * `pkg/constraints/extractor.go` * `pkg/constraints/extractor_test.go` * `pkg/measurement/builder.go` * `pkg/measurement/builder_test.go` * `pkg/measurement/doc.go` * `pkg/measurement/types.go` * `pkg/measurement/types_test.go` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
mchmarny
left a comment
There was a problem hiding this comment.
Clean, well-scoped Phase 1 — schema affordances only, no collector/CLI wiring, exactly the shape described. The Items model, the constraint-path selector grammar, and the contract doc are all coherent and the test coverage is thorough (parser forms + every error path, index/predicate extraction, JSON+YAML round-trip, and a byte-identity check that omitempty keeps existing snapshots unchanged).
A couple of correctness points I checked and am satisfied with: predicate string-matching (Reading.String() == value) is robust across YAML and JSON sources since %v on float64(3.0) (how JSON delivers integers) prints 3, so pfs[rail=3] resolves regardless of source format; and ambiguous predicate → ErrCodeConflict / negative index / unclosed bracket all fail closed, which is the right call.
Only two minor nits inline, neither blocking. Content LGTM.
Note: this isn't approvable as-is on CI — Lint and the gate check (unsigned commit) are both red. Worth a make qualify pass and a git commit -S --amend (+ squash to one signed commit) before merge; both are mechanics, not content.
3ccdafb to
e7ce60a
Compare
…o l8k integration
aicr models cluster state as a Snapshot of `Measurement`s (K8s, GPU, OS,
SystemD, NodeTopology). NodeTopology is cluster-wide and only carries node
taints + labels. The model has no slot for per-hardware-group network
topology -- PFs and their PCI addresses, rails, NUMA, RDMA capability,
link type, kernel module lists, machine/GPU identity -- which makes it
impossible to recipe or validate networking beyond a single all-or-nothing
network-operator toggle.
k8s-launch-kit (l8k) already discovers exactly this data via its
`cluster-config.yaml`. The intended integration is library-mode: l8k
imports `pkg/measurement` and emits `*Measurement` values directly that
aicr Snapshots can consume. The dependency direction is l8k -> aicr; the
Measurement schema is aicr's contract.
This commit is Phase 1 -- the schema affordances l8k needs to implement
that contract. No l8k import is introduced here; the snapshotter and CLI
are untouched. The producer side (Phase 2, in the l8k repo) and the
consumer side (Phase 3, the aicr collector + CLI flag) land separately.
Schema additions to `pkg/measurement`:
- New `TypeNetworkTopology` enum value. Today at most one Measurement
of this Type per snapshot, keeping all find-first-by-Type consumers
(constraints extractor, recipe validation, diff indexing, fingerprint)
correct. Multi-instance support is a deliberate future step.
- New `Subtype.Items []ItemEntry` field. `ItemEntry` mirrors
Subtype's scalar contract: `Data map[string]Reading` for scalar
measurements plus `Context map[string]string` for descriptive
metadata. This lets homogeneous record lists (per-PF entries under a
`pfs` subtype) be represented as real YAML arrays rather than
packed-string-encoded scalars or numbered subtypes (`pf-0`,
`pf-1`, ...). `Reading` itself is NOT extended -- it stays
scalar-only.
- Custom `UnmarshalJSON`/`UnmarshalYAML` on `ItemEntry` so `Reading`
values inside `ItemEntry.Data` round-trip the same way they do in
`Subtype.Data` today.
- `Subtype.Validate` relaxed: a subtype is valid with either non-empty
`Data` or non-empty `Items` (or both). The existing
`len(Data) > 0` invariant predated Items.
- `Items` carries `omitempty` so subtypes that don't use it serialize
byte-identically to today's output. `Data`'s tag also gains
`omitempty` so Items-only subtypes don't emit a noisy `data: {}`;
this is behavior-safe because every existing subtype satisfies
`len(Data) > 0` (Validate guaranteed it).
- New SubtypeBuilder helpers: `WithContext(k, v)`,
`WithContextMap(map)`, `WithItem(entry)`, `WithItems(slice)`.
Constraint path grammar extension in `pkg/constraints`:
- New optional item-selector segment:
`{Type}.{Subtype}[<selector>].{Key}`. Two selector forms:
Index: `NetworkTopology.pfs[0].rail`
Predicate: `NetworkTopology.pfs[rail=3].pciAddress`
Predicates evaluate against `ItemEntry.Data` first, then
`ItemEntry.Context`. First match wins; multiple matches return
`ErrCodeConflict`. Key resolution inside the matched entry tries
`Data` then `Context`. Paths without a selector continue to use the
legacy `Subtype.Data` lookup unchanged.
OpenAPI (`api/aicr/v1/server.yaml`):
- `Measurement.type` enum corrected from the stale
`[systemd, os, k8s, gpu]` to match `measurement.Types`:
`[K8s, GPU, OS, SystemD, NodeTopology, NetworkTopology]`. The
pre-existing miss (`NodeTopology` and `SystemD` were never added) is
bundled into this fix.
- New `ItemEntry` schema mirroring `data` + `context`.
- `Subtype` gains an optional `items` array property; `data` is no
longer required at the schema level (Validate still enforces "at
least one of data/items").
Docs:
- New `docs/integrator/measurement-api.md` publishing the cross-repo
Measurement schema contract: Type cardinality table, Subtype layout
rules, the NetworkTopology shape (subtype names, what lives in
Context vs Data, item structure for `pfs`), and the constraint-path
grammar including selectors. l8k's Phase 2 implementation targets
this spec.
- `pkg/measurement/doc.go` updated to flag the package as part of
aicr's cross-repo public API surface.
- `docs/integrator/index.md` adds the new page to the integrator TOC.
- `pkg/measurement/types_test.go`: Items-only subtype validation,
JSON + YAML round-trip of Items with Reading values inside ItemEntry,
byte-identity check that pre-existing Items-less subtypes don't
suddenly emit an `items:` key.
- `pkg/measurement/builder_test.go`: 100% coverage on each new builder
helper (WithContext / WithContextMap / WithItem / WithItems).
- `pkg/constraints/extractor_test.go`: parser tests for both selector
forms and every error path; ExtractValue tests for index/predicate
hits, out-of-bounds, no-match, no-items, ambiguous-predicate
(ErrCodeConflict).
Coverage on touched packages vs main:
pkg/measurement: 96.0% -> 96.6% (+0.6%)
pkg/constraints: 80.9% -> 87.6% (+6.7%)
`go test ./...` passes (the pre-existing `pkg/trust` sandbox failure is
unrelated -- it writes to `~/.sigstore` which is outside this
environment's write allowlist; passes outside the sandbox). `go vet ./...`
clean. `golangci-lint` not available in this environment per project
notes -- run `make qualify` before pushing.
Phase 2 (in the l8k repo): embed l8k's discovery-only runtime artifacts
(presets, default config; profiles deferred), rename
`LaunchKubernetesConfig` -> `LaunchKitConfig`, add a library-friendly
`Discover(ctx, kubeClient)` entry point, vendor aicr's `pkg/measurement`,
and add `DiscoverMeasurements` / `ParseClusterConfigMeasurements`
wrappers that emit `*aicrmeasurement.Measurement` values per the contract
this commit publishes.
Phase 3 (in aicr): new `pkg/collector/network/` calling l8k's library,
`--cluster-config` / `--discover-network` flags on `aicr snapshot`, RBAC
additions for the agent Job, single-group enforcement.
Multi-group `NetworkTopology` is a future iteration; the consumer rewrites
it requires (constraint @<id> grammar, recipe-validator iteration, diff
(Type, identifier) keying, fingerprint group-aware logic) are deliberately
deferred until the single-group integration has shaken out.
Signed-off-by: Alexander Maslennikov <[email protected]>
e7ce60a to
8220b20
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/aicr/v1/server.yaml (1)
1325-1353: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winEnforce the documented
data/itemsinvariant in schema, not only in prose.Line 1326 states a subtype must include
dataoritems, but the schema currently acceptssubtype-only payloads. That creates a runtime-vs-OpenAPI contract drift for client validation/codegen.Suggested schema fix
Subtype: type: object description: | A subtype within a measurement. A subtype must carry at least one entry in `data` or `items`. `items` is used when the subtype payload is naturally a list of structured records (e.g. per-PF entries under a NetworkTopology `pfs` subtype). required: [subtype] + anyOf: + - required: [data] + - required: [items] properties: subtype: type: string description: Subtype identifier (e.g., service name, configuration category) example: containerd.service data: type: object + minProperties: 1 additionalProperties: $ref: "`#/components/schemas/Reading`" description: Key-value pairs of configuration settings context: type: object additionalProperties: true description: Optional metadata about the source and collection method nullable: true items: type: array + minItems: 1 items: $ref: "`#/components/schemas/ItemEntry`" description: | Optional ordered list of structured records. Independent of `data` — a subtype may carry either or both.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/aicr/v1/server.yaml` around lines 1325 - 1353, The OpenAPI schema for the subtype object only enforces the required `subtype` field but does not validate that at least one of `data` or `items` must be present, contradicting the documented requirement in the description. Add schema-level validation (such as oneOf or anyOf constraints) to enforce that payloads matching this schema must include either `data` or `items` (or both) in addition to the required `subtype` field, ensuring the schema enforcement matches the documented contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/constraints/extractor_test.go`:
- Around line 477-511: The test struct for the ExtractValue test cases only
validates that an error occurs but does not verify the specific error code
returned. Add a new field to the test struct definition (alongside name, path,
want, and expectError) to specify the expected error code for each error case.
Then, in the error assertion block where it checks `if tt.expectError`, add a
validation using the pkg/errors test helper to assert that the returned error
matches the expected error code (e.g., ErrCodeNotFound for cases like "index out
of bounds", "predicate no match", and "key not in item", and ErrCodeConflict or
appropriate code for "selector on subtype with no items"). This ensures that
ExtractValue returns the correct structured error code for each failure
scenario, not just any error.
---
Outside diff comments:
In `@api/aicr/v1/server.yaml`:
- Around line 1325-1353: The OpenAPI schema for the subtype object only enforces
the required `subtype` field but does not validate that at least one of `data`
or `items` must be present, contradicting the documented requirement in the
description. Add schema-level validation (such as oneOf or anyOf constraints) to
enforce that payloads matching this schema must include either `data` or `items`
(or both) in addition to the required `subtype` field, ensuring the schema
enforcement matches the documented contract.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 9a922bcb-6289-4a20-873d-2523b1d253c7
📒 Files selected for processing (11)
api/aicr/v1/server.yamldocs/integrator/index.mddocs/integrator/measurement-api.mdpkg/constraints/extractor.gopkg/constraints/extractor_test.gopkg/fingerprint/from_measurements.gopkg/measurement/builder.gopkg/measurement/builder_test.gopkg/measurement/doc.gopkg/measurement/types.gopkg/measurement/types_test.go
| tests := []struct { | ||
| name string | ||
| path string | ||
| want string | ||
| expectError bool | ||
| }{ | ||
| // Index selectors | ||
| {name: "index 0 data field", path: "NetworkTopology.pfs[0].rail", want: "0"}, | ||
| {name: "index 1 data field", path: "NetworkTopology.pfs[1].rail", want: "1"}, | ||
| {name: "index 0 context field", path: "NetworkTopology.pfs[0].pciAddress", want: "0000:03:00.0"}, | ||
| {name: "index 2 rdmaDevice", path: "NetworkTopology.pfs[2].rdmaDevice", want: "mlx5_2"}, | ||
| // Predicate selectors | ||
| {name: "predicate by data field", path: "NetworkTopology.pfs[rail=1].pciAddress", want: "0000:03:00.1"}, | ||
| {name: "predicate by context field", path: "NetworkTopology.pfs[pciAddress=0000:03:00.2].rail", want: "2"}, | ||
| // Backward compat: non-item path still works | ||
| {name: "non-item path data", path: "NetworkTopology.capabilities.sriov", want: "true"}, | ||
|
|
||
| // Errors | ||
| {name: "index out of bounds", path: "NetworkTopology.pfs[99].rail", expectError: true}, | ||
| {name: "predicate no match", path: "NetworkTopology.pfs[rail=99].pciAddress", expectError: true}, | ||
| {name: "key not in item", path: "NetworkTopology.pfs[0].nonexistent", expectError: true}, | ||
| {name: "selector on subtype with no items", path: "NetworkTopology.capabilities[0].sriov", expectError: true}, | ||
| } | ||
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| path := mustParse(t, tt.path) | ||
| got, err := path.ExtractValue(snap) | ||
| if tt.expectError { | ||
| if err == nil { | ||
| t.Errorf("expected error, got nil; result=%q", got) | ||
| } | ||
| return |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Assert the structured error code, not just that an error exists.
The selector contract distinguishes not-found cases from ambiguous predicate conflicts, but these tests would still pass if ExtractValue returned the wrong structured code. Add expected codes for error cases, especially ErrCodeNotFound vs ErrCodeConflict, and assert them with the existing pkg/errors test helper/pattern.
Also applies to: 543-550
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/constraints/extractor_test.go` around lines 477 - 511, The test struct
for the ExtractValue test cases only validates that an error occurs but does not
verify the specific error code returned. Add a new field to the test struct
definition (alongside name, path, want, and expectError) to specify the expected
error code for each error case. Then, in the error assertion block where it
checks `if tt.expectError`, add a validation using the pkg/errors test helper to
assert that the returned error matches the expected error code (e.g.,
ErrCodeNotFound for cases like "index out of bounds", "predicate no match", and
"key not in item", and ErrCodeConflict or appropriate code for "selector on
subtype with no items"). This ensures that ExtractValue returns the correct
structured error code for each failure scenario, not just any error.
…
Why
aicr models cluster state as a Snapshot of
Measurements (K8s, GPU, OS, SystemD, NodeTopology). NodeTopology is cluster-wide and only carries node taints + labels. The model has no slot for per-hardware-group network topology -- PFs and their PCI addresses, rails, NUMA, RDMA capability, link type, kernel module lists, machine/GPU identity -- which makes it impossible to recipe or validate networking beyond a single all-or-nothing network-operator toggle.k8s-launch-kit (l8k) already discovers exactly this data via its
cluster-config.yaml. The intended integration is library-mode: l8k importspkg/measurementand emits*Measurementvalues directly that aicr Snapshots can consume. The dependency direction is l8k -> aicr; the Measurement schema is aicr's contract.This commit is Phase 1 -- the schema affordances l8k needs to implement that contract. No l8k import is introduced here; the snapshotter and CLI are untouched. The producer side (Phase 2, in the l8k repo) and the consumer side (Phase 3, the aicr collector + CLI flag) land separately.
Part of #828
What changes
Schema additions to
pkg/measurement:New
TypeNetworkTopologyenum value. Today at most one Measurement of this Type per snapshot, keeping all find-first-by-Type consumers (constraints extractor, recipe validation, diff indexing, fingerprint) correct. Multi-instance support is a deliberate future step.New
Subtype.Items []ItemEntryfield.ItemEntrymirrors Subtype's scalar contract:Data map[string]Readingfor scalar measurements plusContext map[string]stringfor descriptive metadata. This lets homogeneous record lists (per-PF entries under apfssubtype) be represented as real YAML arrays rather than packed-string-encoded scalars or numbered subtypes (pf-0,pf-1, ...).Readingitself is NOT extended -- it stays scalar-only.Custom
UnmarshalJSON/UnmarshalYAMLonItemEntrysoReadingvalues insideItemEntry.Dataround-trip the same way they do inSubtype.Datatoday.Subtype.Validaterelaxed: a subtype is valid with either non-emptyDataor non-emptyItems(or both). The existinglen(Data) > 0invariant predated Items.Itemscarriesomitemptyso subtypes that don't use it serialize byte-identically to today's output.Data's tag also gainsomitemptyso Items-only subtypes don't emit a noisydata: {}; this is behavior-safe because every existing subtype satisfieslen(Data) > 0(Validate guaranteed it).New SubtypeBuilder helpers:
WithContext(k, v),WithContextMap(map),WithItem(entry),WithItems(slice).Constraint path grammar extension in
pkg/constraints:{Type}.{Subtype}[<selector>].{Key}. Two selector forms:Index:
NetworkTopology.pfs[0].railPredicate:
NetworkTopology.pfs[rail=3].pciAddressPredicates evaluate against
ItemEntry.Datafirst, thenItemEntry.Context. First match wins; multiple matches returnErrCodeConflict. Key resolution inside the matched entry triesDatathenContext. Paths without a selector continue to use thelegacy
Subtype.Datalookup unchanged.OpenAPI (
api/aicr/v1/server.yaml):Measurement.typeenum corrected from the stale[systemd, os, k8s, gpu]to matchmeasurement.Types:[K8s, GPU, OS, SystemD, NodeTopology, NetworkTopology]. The pre-existing miss (NodeTopologyandSystemDwere never added) is bundled into this fix.ItemEntryschema mirroringdata+context.Subtypegains an optionalitemsarray property;datais no longer required at the schema level (Validate still enforces "at least one of data/items").Docs:
docs/integrator/measurement-api.mdpublishing the cross-repo Measurement schema contract: Type cardinality table, Subtype layout rules, the NetworkTopology shape (subtype names, what lives in Context vs Data, item structure forpfs), and the constraint-path grammar including selectors. l8k's Phase 2 implementation targets this spec.pkg/measurement/doc.goupdated to flag the package as part of aicr's cross-repo public API surface.docs/integrator/index.mdadds the new page to the integrator TOC.Testing
pkg/measurement/types_test.go: Items-only subtype validation, JSON + YAML round-trip of Items with Reading values inside ItemEntry, byte-identity check that pre-existing Items-less subtypes don't suddenly emit anitems:key.pkg/measurement/builder_test.go: 100% coverage on each new builder helper (WithContext / WithContextMap / WithItem / WithItems).pkg/constraints/extractor_test.go: parser tests for both selector forms and every error path; ExtractValue tests for index/predicate hits, out-of-bounds, no-match, no-items, ambiguous-predicate (ErrCodeConflict).Coverage on touched packages vs main:
pkg/measurement: 96.0% -> 96.6% (+0.6%)
pkg/constraints: 80.9% -> 87.6% (+6.7%)
go test ./...passes (the pre-existingpkg/trustsandbox failure is unrelated -- it writes to~/.sigstorewhich is outside this environment's write allowlist; passes outside the sandbox).go vet ./...clean.golangci-lintnot available in this environment per project notes -- runmake qualifybefore pushing.Not in this commit
Phase 2 (in the l8k repo): embed l8k's discovery-only runtime artifacts (presets, default config; profiles deferred), rename
LaunchKubernetesConfig->LaunchKitConfig, add a library-friendlyDiscover(ctx, kubeClient)entry point, vendor aicr'spkg/measurement, and addDiscoverMeasurements/ParseClusterConfigMeasurementswrappers that emit*aicrmeasurement.Measurementvalues per the contract this commit publishes.Phase 3 (in aicr): new
pkg/collector/network/calling l8k's library,--cluster-config/--discover-networkflags onaicr snapshot, RBAC additions for the agent Job, single-group enforcement.Multi-group
NetworkTopologyis a future iteration; the consumer rewrites it requires (constraint @ grammar, recipe-validator iteration, diff (Type, identifier) keying, fingerprint group-aware logic) are deliberately deferred until the single-group integration has shaken out.Summary
Motivation / Context
Fixes:
Related:
Type of Change
Component(s) Affected
cmd/aicr,pkg/cli)cmd/aicrd,pkg/server)pkg/recipe)pkg/bundler,pkg/component/*)pkg/collector,pkg/snapshotter)pkg/validator)pkg/errors,pkg/k8s)docs/,examples/)Implementation Notes
Testing
# Commands run (prefer `make qualify` for non-trivial changes) make qualifyRisk Assessment
Rollout notes:
Checklist
make testwith-race)make lint)git commit -S) — GPG signing info