feat: replace object output with structured step outputs#2052
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughRefactors step outputs into structured entries (literal or sourced), adds persisted per-node Changes
Sequence DiagramsequenceDiagram
actor User
participant DAG as DAG / Step Config
participant Executor as Executor
participant Stdout as Stdout Capture
participant Stderr as Stderr Capture
participant FS as File System
participant Decoder as Decoder (text/json/yaml)
participant Selector as ResolveDataPath
participant Output as StructuredOutput JSON
User->>DAG: Define structured output (literal or from:{stdout,stderr,file}, decode, select)
DAG->>Executor: Start step
Executor->>Stdout: Capture stdout
Executor->>Stderr: Capture stderr
Executor->>FS: Read file (if from:file)
alt Literal value
DAG->>Output: Expand literal and include
else Sourced value
Stdout->>Decoder: Provide captured stdout (when used)
Stderr->>Decoder: Provide captured stderr (when used)
FS->>Decoder: Provide file contents (when used)
Decoder->>Selector: Decode then optionally select path via ResolveDataPath
Selector->>Output: Return fragment or full decoded value
end
Output-->>Executor: JSON-serialize structured map
Executor-->>DAG: Persist JSON as Node.OutputValue and expose via ${step.output.*}
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
internal/runtime/node.go (1)
579-585: Duplicate logic for default max output size.The default value
1024 * 1024is duplicated here and inoutput.go(lines 144, 180). Consider extracting this to a constant or reusingdefaultMaxOutputSizefrominternal/core/dag.go(context snippet shows it exists there).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/node.go` around lines 579 - 585, The literal default 1024*1024 in maxOutputSize is duplicated; change maxOutputSize (in internal/runtime node.go) to use the shared default constant (defaultMaxOutputSize) instead of the hardcoded literal — import or reference the constant from internal/core/dag.go (where defaultMaxOutputSize is defined) so both this function and the uses in output.go reuse the single source of truth.internal/cmn/schema/dag_schema_test.go (1)
220-268: Consider adding negative test cases for structured output validation.The test covers valid configurations but lacks negative cases that verify schema rejection of invalid structured output configurations (e.g., invalid
fromvalues, missing required fields whenfromisfile). Other tests in this file (likeTestDAGSchemaParams) includewantErrcases for comprehensive validation coverage.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cmn/schema/dag_schema_test.go` around lines 220 - 268, TestDAGSchemaStepOutputObject currently only asserts valid specs; add negative test cases that assert resolved.Validate(doc) returns errors for invalid structured output definitions—e.g., a spec with output.version.from: invalid_source, one with from: file but missing required file path or pattern, and one with an invalid decode value—using the same pattern as existing table-driven tests (use mustParseYAMLDocument to build doc and check err via require.Error and require.Contains on err.Error()); keep tests inside TestDAGSchemaStepOutputObject and leverage mustResolveDAGSchema and resolved.Validate to locate the schema validation logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cmn/schema/dag.schema.json`:
- Around line 1307-1313: The schema for the step "output" entry is too
permissive because the object branch sets additionalProperties: true; change it
to validate only the allowed structured keys by replacing additionalProperties:
true with additionalProperties: false and add an explicit "properties" object
listing allowed fields (e.g., "value", "from", "path", "decode", "select") and
their expected types/constraints, while keeping the alternative branches that
accept literal JSON values (string/number/boolean/null) so literal outputs still
validate; update the same "output" schema instance referenced by step.output.*
in dag.schema.json to use a oneOf/anyOf that permits primitives OR the
constrained structured object.
In `@internal/runtime/node.go`:
- Around line 496-511: Remove the TOCTOU-prone size check that uses
os.Stat/info.Size(): delete the branch that compares info.Size() >
maxOutputSize(ctx) (and its associated error return) while keeping the os.Stat
call only if you need to early-check existence/permissions; otherwise remove the
os.Stat call entirely and rely on os.ReadFile followed by the existing post-read
size check against maxOutputSize(ctx). Update error messages to use the existing
fmt.Errorf patterns referencing key and path and ensure the only
size-enforcement happens after reading (using maxOutputSize(ctx), os.ReadFile,
and len(data)).
---
Nitpick comments:
In `@internal/cmn/schema/dag_schema_test.go`:
- Around line 220-268: TestDAGSchemaStepOutputObject currently only asserts
valid specs; add negative test cases that assert resolved.Validate(doc) returns
errors for invalid structured output definitions—e.g., a spec with
output.version.from: invalid_source, one with from: file but missing required
file path or pattern, and one with an invalid decode value—using the same
pattern as existing table-driven tests (use mustParseYAMLDocument to build doc
and check err via require.Error and require.Contains on err.Error()); keep tests
inside TestDAGSchemaStepOutputObject and leverage mustResolveDAGSchema and
resolved.Validate to locate the schema validation logic.
In `@internal/runtime/node.go`:
- Around line 579-585: The literal default 1024*1024 in maxOutputSize is
duplicated; change maxOutputSize (in internal/runtime node.go) to use the shared
default constant (defaultMaxOutputSize) instead of the hardcoded literal —
import or reference the constant from internal/core/dag.go (where
defaultMaxOutputSize is defined) so both this function and the uses in output.go
reuse the single source of truth.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1520d12b-0b5d-4b04-8ec4-61352511b736
📒 Files selected for processing (27)
internal/cmn/eval/resolve_json.gointernal/cmn/eval/resolve_step.gointernal/cmn/eval/resolve_step_test.gointernal/cmn/schema/dag.schema.jsoninternal/cmn/schema/dag_schema_test.gointernal/core/exec/node.gointernal/core/spec/step.gointernal/core/spec/step_output_schema_test.gointernal/core/spec/step_output_test.gointernal/core/spec/step_test.gointernal/core/spec/step_types.gointernal/core/step.gointernal/intg/output_test.gointernal/intg/output_validation_e2e_test.gointernal/intg/stepref_test.gointernal/runtime/agent/agent.gointernal/runtime/agent/agent_test.gointernal/runtime/builtin/builtin.gointernal/runtime/builtin/noop/noop.gointernal/runtime/data.gointernal/runtime/node.gointernal/runtime/node_output_validation_edge_test.gointernal/runtime/node_output_validation_test.gointernal/runtime/node_structured_output_test.gointernal/runtime/output.gointernal/runtime/output_test.gointernal/runtime/transform/node.go
💤 Files with no reviewable changes (5)
- internal/core/spec/step_output_schema_test.go
- internal/runtime/node_output_validation_test.go
- internal/core/spec/step_test.go
- internal/runtime/node_output_validation_edge_test.go
- internal/intg/output_validation_e2e_test.go
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/cmn/schema/dag_schema_test.go (1)
253-284: Tighten the negative assertions for the structured-output cases.All three failures only check for
"did not validate", so an unrelated schema error would still make these tests pass. Assert the relevant field/rule (from,path,decode) or validate the specific output-entry definition directly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cmn/schema/dag_schema_test.go` around lines 253 - 284, The tests named RejectInvalidStructuredSource, RejectFileSourceWithoutPath, and RejectInvalidDecode currently only assert the generic "did not validate" message; tighten each to assert the specific failing field/rule so unrelated schema errors won't make them pass. Update the assertions in dag_schema_test.go for the cases (named above) to check the error string contains the relevant token ("from" for RejectInvalidStructuredSource, "path" for RejectFileSourceWithoutPath, and "decode" for RejectInvalidDecode) or validate the specific output-entry structure directly instead of the generic message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/spec/step.go`:
- Around line 1544-1555: The function shouldInferNoopStep currently treats any
step with structured output as eligible for inferring a noop, which incorrectly
matches stream-backed outputs like `{from: "stdout"}` and `{from: "stderr"}`;
update shouldInferNoopStep(result *core.Step, s *step) to return false when the
structured output is sourced from a stream by inspecting the step's
structured-output entries (e.g., iterate result.StructuredOutputs /
result.Outputs or use the accessor that backs HasStructuredOutput()) and if any
entry has From == "stdout" or From == "stderr" (or a Source/Type indicating a
stream) then bail out — keep the existing checks (ExecutorConfig,
Container/SubDAG/Parallel, nil s, command/exec/script) but add this additional
guard before returning true.
In `@internal/runtime/output.go`:
- Around line 167-179: The code clears oc.stderrOutputData (and resets
oc.stderrOutputCaptured) whenever a new stderr pipe is created, which loses
earlier attempts' stderr; to fix, do not reset oc.stderrOutputData (and avoid
clearing oc.stderrOutputCaptured) when assigning a new pipe in the
needStderrCapture branch—only initialize oc.maxOutputSize and
oc.stderrCapture/new pipe and call oc.stderrCapture.start(ctx,
oc.stderrOutputReader) while preserving any existing oc.stderrOutputData so
capturedStderr() can append across retries (refer to needStderrCapture,
oc.stderrOutputReader, oc.stderrOutputWriter, oc.stderrOutputData,
oc.stderrOutputCaptured, oc.maxOutputSize, and oc.stderrCapture.start).
---
Nitpick comments:
In `@internal/cmn/schema/dag_schema_test.go`:
- Around line 253-284: The tests named RejectInvalidStructuredSource,
RejectFileSourceWithoutPath, and RejectInvalidDecode currently only assert the
generic "did not validate" message; tighten each to assert the specific failing
field/rule so unrelated schema errors won't make them pass. Update the
assertions in dag_schema_test.go for the cases (named above) to check the error
string contains the relevant token ("from" for RejectInvalidStructuredSource,
"path" for RejectFileSourceWithoutPath, and "decode" for RejectInvalidDecode) or
validate the specific output-entry structure directly instead of the generic
message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8160b740-0255-4347-b50b-06b45cd33142
📒 Files selected for processing (13)
internal/cmn/schema/dag.schema.jsoninternal/cmn/schema/dag_schema_test.gointernal/core/spec/step.gointernal/intg/stepref_test.gointernal/runtime/agent/agent.gointernal/runtime/agent/collect_outputs_test.gointernal/runtime/condition.gointernal/runtime/data.gointernal/runtime/data_output_test.gointernal/runtime/node.gointernal/runtime/node_structured_output_test.gointernal/runtime/output.gointernal/runtime/output_limits.go
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/intg/stepref_test.go
- internal/runtime/agent/agent.go
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/runtime/node.go (1)
438-446: Rejectselectwhen decode istextor omitted.In the
""/textbranch,entry.Selectis ignored and the full trimmed string is returned. That makes a misconfigured entry look valid while publishing the wrong value. I'd fail fast here (or validate earlier) unlessselectis only allowed withjson/yaml.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/node.go` around lines 438 - 446, The branch handling text/default decode (when entry.Decode is "" or core.StepOutputDecodeText) must reject any non-empty entry.Select to fail fast; update the switch case that currently returns strings.TrimSpace(raw) so it first checks if entry.Select != "" and returns a clear error (e.g. "%s: select is not allowed with text decode") referencing entry.Decode and key, otherwise continue to return the trimmed string; this change should live alongside the existing cases and keep structured decoding via decodeStructuredOutputValue for JSON/YAML only.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runtime/node.go`:
- Around line 450-463: readStructuredOutputSource currently feeds truncated
stream-backed payloads (from outputs.capturedOutput / outputs.capturedStderr)
into structured decoding, causing inconsistent behavior versus file-backed
sources; change outputs.capturedOutput and outputs.capturedStderr to propagate
an explicit overflow signal (e.g., return a sentinel error ErrOutputTruncated or
a boolean flag) when truncation to the configured max size occurred, and update
Node.readStructuredOutputSource to detect that signal and return the same
explicit "output exceeded maximum size limit" error for stdout/stderr sources
(instead of proceeding to parse or silently publishing truncated text), ensuring
parity with file source behavior; locate symbols
Node.readStructuredOutputSource, outputs.capturedOutput, outputs.capturedStderr
and the file-source error path to mirror its error text/behavior.
---
Nitpick comments:
In `@internal/runtime/node.go`:
- Around line 438-446: The branch handling text/default decode (when
entry.Decode is "" or core.StepOutputDecodeText) must reject any non-empty
entry.Select to fail fast; update the switch case that currently returns
strings.TrimSpace(raw) so it first checks if entry.Select != "" and returns a
clear error (e.g. "%s: select is not allowed with text decode") referencing
entry.Decode and key, otherwise continue to return the trimmed string; this
change should live alongside the existing cases and keep structured decoding via
decodeStructuredOutputValue for JSON/YAML only.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ad752138-41c1-483f-a12b-564429aa9c64
📒 Files selected for processing (2)
internal/runtime/env.gointernal/runtime/node.go
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/spec/step.go`:
- Around line 1555-1558: The noop inference currently checks raw s.Script and
can miss whitespace-only scripts; change the check to use the normalized/trimmed
script (the same normalization used by buildStepScript) so that a script of " "
is treated as empty—i.e., replace the raw comparison to s.Script == "" with a
trimmed/script-normalized value (use strings.TrimSpace(s.Script) or call the
same builder helper used elsewhere) in the noop inference path so that the
condition (s.Command == nil && s.Exec == nil && trimmedScript == "") correctly
detects no-op steps.
- Around line 1023-1033: The parser currently accepts core.StepOutputDecodeText
while the runtime only supports "" (default treated as JSON),
core.StepOutputDecodeJSON, and core.StepOutputDecodeYAML; update the validation
in the switch over entry.Decode to only allow "", core.StepOutputDecodeJSON, and
core.StepOutputDecodeYAML (remove core.StepOutputDecodeText), and change the
select check so entry.Select is permitted when entry.Decode is "" or
core.StepOutputDecodeJSON or core.StepOutputDecodeYAML (i.e., treat the empty
decode as JSON) to keep parsing consistent with internal/runtime/node.go
handling.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5cfc527b-3a10-4f6f-abff-5fae050d071e
📒 Files selected for processing (4)
internal/core/spec/step.gointernal/core/spec/step_output_test.gointernal/runtime/output.gointernal/runtime/output_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/core/spec/step_output_test.go
Summary
output:parsing with structured step-scoped outputs available through${step.output}and${step.output.foo}noopinference for structured outputs while keeping string-formoutput: NAMEfor run outputsWhy
output:metadata did not model the step-scoped data flow users actually need${step.output.*}the direct abstractionoutput: NAMEunchanged preserves finaloutputs.jsoncollection for run summaries and API/UI consumersValidation
go test ./internal/core/spec -count=1go test ./internal/cmn/eval -count=1go test ./internal/cmn/schema -count=1go test ./internal/core/exec -count=1go test ./internal/runtime ./internal/runtime/agent -count=1go test ./internal/runtime/transform -count=1go test ./internal/intg -run 'TestStepScopedOutputAccess|TestStepIDPropertyAccess|TestOutput' -count=1Summary by CodeRabbit
New Features
Removed Features