Skip to content

feat: agent step type#1681

Merged
yottahmd merged 12 commits intomainfrom
agent-type-step
Feb 15, 2026
Merged

feat: agent step type#1681
yottahmd merged 12 commits intomainfrom
agent-type-step

Conversation

@yottahmd
Copy link
Copy Markdown
Collaborator

@yottahmd yottahmd commented Feb 15, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Agent executor type now supported for workflow steps with configurable tools, memory, and bash policies.
    • New agent output tool enables structured task result capture.
    • Agent bash policy controls allow fine-grained command execution rules.
  • Improvements

    • Simplified command approval flow for agents with safety mode controls.
    • Enhanced UI robustness for DAG details rendering with improved optional data handling.
  • Configuration

    • Extended DAG schema with agent step configuration structures and executor type support.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 15, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR refactors the agent approval flow, introduces agent-based step execution as a new executor type, propagates configuration stores through the runtime context, and extends the DAG schema to support agent step configuration. The changes remove legacy policy-check tracking, add SafeMode propagation to tool execution, and implement a comprehensive agent step executor with tool management and policy enforcement.

Changes

Cohort / File(s) Summary
Approval Flow Refactoring
internal/agent/approval.go, internal/agent/bash.go, internal/agent/bash_test.go
Removed dangerous command approval runtime flows; simplified requestCommandApprovalWithOptions to requestCommandApproval; eliminated PolicyChecked tracking and associated SafeMode gating in bash execution; removed 168 lines of approval-related test coverage.
SafeMode & Tool Context
internal/agent/hooks.go, internal/agent/loop.go, internal/agent/types.go
Added SafeMode field to ToolExecInfo for hook-level safety decisions; removed PolicyChecked field from ToolContext; updated loop to propagate SafeMode into tool execution context.
New Output Tool
internal/agent/output.go
Introduced NewOutputTool(w io.Writer) constructor that builds an agent tool for writing final task results with JSON input validation and writer error handling.
Store Propagation & Context
internal/cmd/context.go, internal/cmd/dry.go, internal/cmd/restart.go, internal/cmd/start.go, internal/service/worker/remote_handler.go
Added agentStores() helper method to retrieve agent config and model stores; extended agent.Options with AgentConfigStore and AgentModelStore fields; wired stores through dry-run, restart, and start execution paths.
Execution Context Extensions
internal/core/exec/context.go, internal/runtime/context.go, internal/runtime/agent/agent.go
Added public context fields AgentConfigStore and AgentModelStore with corresponding ContextOption builders; propagated stores through agent initialization and runtime context.
DAG Schema & Agent Types
internal/cmn/schema/dag.schema.json, internal/core/step.go, internal/core/spec/step.go, internal/core/capabilities.go
Extended DAG schema with agent step config definitions (agentStepConfig, agentToolsConfig, agentBashPolicy, agentBashRule); added ExecutorTypeAgent constant; extended Step struct with Agent field; added SupportsAgent capability query.
Agent Step Executor
internal/runtime/builtin/agentstep/executor.go, internal/runtime/builtin/builtin.go
Implemented comprehensive agent step executor handling config/model store resolution, tool filtering by policy, system prompt assembly, LLM provider creation, message evaluation, bash policy enforcement, and loop cancellation with iteration limits.
Policy & Service Updates
internal/service/frontend/agent_policy.go, internal/service/frontend/agent_policy_test.go
Added early return path when SafeMode is disabled and bash denial behavior is AskUser (command allowed without prompting); updated tests to verify SafeMode-enabled and SafeMode-disabled scenarios.
UI Prop Updates
ui/src/features/dags/components/dag-details/DAGDetailsContent.tsx, ui/src/features/dags/components/dag-details/DAGHeader.tsx, ui/src/pages/dags/dag/index.tsx
Made currentDAGRun optional in component props; added optional chaining guards throughout DAGHeader (status, duration, running state, actions visibility); removed isDataReady guard in dag index page; rendering now depends solely on dagData.dag existence.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client/UI
    participant DAGExecutor as DAG Executor
    participant AgentStepExec as Agent Step<br/>Executor
    participant Runtime as Runtime<br/>Context
    participant Store as Agent Config/<br/>Model Store
    participant Agent as Agent<br/>Loop
    participant Tools as Tool<br/>Registry
    participant LLM as LLM<br/>Provider

    Client->>DAGExecutor: Execute DAG with agent step
    DAGExecutor->>Runtime: Create execution context<br/>with stores
    Runtime->>Store: Load agent config and<br/>model stores
    Store-->>Runtime: Return stores
    DAGExecutor->>AgentStepExec: Create agent step executor
    AgentStepExec->>Runtime: Resolve config/model<br/>stores from context
    Runtime-->>AgentStepExec: Provide stores
    AgentStepExec->>Store: Load agent config and<br/>derive tool policy
    Store-->>AgentStepExec: Agent config + policy
    AgentStepExec->>Tools: Build tool set filtered<br/>by policy & step config
    Tools-->>AgentStepExec: Filtered tools +<br/>output tool
    AgentStepExec->>LLM: Create LLM provider<br/>from model selection
    LLM-->>AgentStepExec: LLM provider ready
    AgentStepExec->>Agent: Initialize agent loop<br/>with tools & system prompt
    Agent->>Agent: Run agent loop with<br/>bash policy enforcement
    Agent->>Tools: Invoke tools with<br/>SafeMode in ToolExecInfo
    Tools-->>Agent: Tool results
    Agent->>Agent: Check iteration limit<br/>& completion
    Agent-->>AgentStepExec: Agent loop complete
    AgentStepExec-->>DAGExecutor: Step result
    DAGExecutor-->>Client: DAG execution result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • feat: Agent tool permission #1667 — Reverses agent approval wiring and refactors SafeMode/PolicyChecked flows, directly addressing the legacy approval mechanism changes in this PR.
  • feat: chat assistant #1615 — Modifies the same agent subsystem files (internal/agent/{bash,loop,types,approval}.go) and approval/SafeMode logic structures.
  • feat: default execution mode #1642 — Extends agent runtime initialization (internal/runtime/agent/agent.go Options and Agent wiring) with configuration store propagation, mirroring the store-threading pattern introduced here.
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (35 files):

⚔️ go.mod (content)
⚔️ internal/agent/approval.go (content)
⚔️ internal/agent/bash.go (content)
⚔️ internal/agent/bash_test.go (content)
⚔️ internal/agent/hooks.go (content)
⚔️ internal/agent/loop.go (content)
⚔️ internal/agent/types.go (content)
⚔️ internal/cmd/context.go (content)
⚔️ internal/cmd/dry.go (content)
⚔️ internal/cmd/restart.go (content)
⚔️ internal/cmd/retry.go (content)
⚔️ internal/cmd/start.go (content)
⚔️ internal/cmn/config/cache.go (content)
⚔️ internal/cmn/config/cache_test.go (content)
⚔️ internal/cmn/fileutil/cache.go (content)
⚔️ internal/cmn/fileutil/cache_test.go (content)
⚔️ internal/cmn/schema/dag.schema.json (content)
⚔️ internal/core/capabilities.go (content)
⚔️ internal/core/exec/context.go (content)
⚔️ internal/core/spec/builder.go (content)
⚔️ internal/core/spec/dag.go (content)
⚔️ internal/core/spec/loader_test.go (content)
⚔️ internal/core/spec/step.go (content)
⚔️ internal/core/spec/types/env.go (content)
⚔️ internal/core/spec/types/env_test.go (content)
⚔️ internal/core/step.go (content)
⚔️ internal/runtime/agent/agent.go (content)
⚔️ internal/runtime/builtin/builtin.go (content)
⚔️ internal/runtime/context.go (content)
⚔️ internal/service/frontend/agent_policy.go (content)
⚔️ internal/service/frontend/agent_policy_test.go (content)
⚔️ internal/service/worker/remote_handler.go (content)
⚔️ ui/src/features/dags/components/dag-details/DAGDetailsContent.tsx (content)
⚔️ ui/src/features/dags/components/dag-details/DAGHeader.tsx (content)
⚔️ ui/src/pages/dags/dag/index.tsx (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: agent step type' accurately summarizes the main change: introducing a new agent step executor type to the workflow system, which is reflected across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 82.76% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch agent-type-step

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@internal/core/spec/step.go`:
- Around line 1567-1618: If a step includes an Agent config but the executor
type doesn't support agents, return a validation error instead of silently
ignoring it, and validate MaxIterations is > 0 when provided; update
buildStepAgent to check core.SupportsAgent(result.ExecutorConfig.Type) and if
false and s.Agent != nil return an error referencing the step (e.g., using step
/ result identifiers), and when s.Agent.MaxIterations != nil ensure
*s.Agent.MaxIterations > 0 otherwise return a clear validation error; keep
existing defaults in core.AgentStepConfig and only assign cfg.MaxIterations
after the positive-check passes.

In `@internal/runtime/builtin/agentstep/executor.go`:
- Around line 250-275: buildSystemPrompt currently always passes an empty
agent.MemoryContent to agent.GenerateSystemPrompt, so stepCfg.Memory is ignored;
update buildSystemPrompt to check stepCfg != nil && stepCfg.Memory != nil &&
stepCfg.Memory.Enabled, load/construct the MemoryContent from stepCfg.Memory
(using the project’s existing memory-loading helper or API for step memory), and
pass that populated agent.MemoryContent into agent.GenerateSystemPrompt (falling
back to an empty agent.MemoryContent when memory is disabled or loading fails);
keep references to buildSystemPrompt, stepCfg.Memory, the Enabled flag, and
agent.GenerateSystemPrompt to locate the change.
- Around line 156-180: The OnWorking handler currently increments iteration and
always calls cancelLoop() on the first idle, which causes maxIterations to be
ignored; update the OnWorking function in the loop config (agent.NewLoop ->
OnWorking) so that cancelLoop() is only called when iteration >= maxIterations
(i.e., move the cancelLoop() call inside the threshold branch) and remove the
unconditional cancelLoop() that runs after a single idle transition; keep the
existing logf(stderr, "Max iterations reached (%d)", maxIterations) and ensure
iteration is still incremented as shown.
- Around line 88-144: The step-level bash policy in stepCfg.Tools.BashPolicy is
never used—buildPolicyHook(globalPolicy) only enforces the global policy—so
update the hook registration to incorporate the step policy: either merge
stepCfg.Tools.BashPolicy into globalPolicy (when stepCfg != nil and
stepCfg.Tools != nil) before calling buildPolicyHook, or call buildPolicyHook
with a merged/combined policy object that respects step-level overrides; change
the call at hooks.OnBeforeToolExec(buildPolicyHook(globalPolicy)) to pass the
merged policy (referencing stepCfg, stepCfg.Tools.BashPolicy, globalPolicy,
buildPolicyHook, and hooks.OnBeforeToolExec).
🧹 Nitpick comments (7)
ui/src/features/dags/components/dag-details/DAGHeader.tsx (2)

70-70: Missing calculateDuration in useEffect dependency array.

calculateDuration is called inside this effect but not listed as a dependency. While the individual fields it closes over are partially covered, this is fragile — if calculateDuration's dependencies change in the future, this effect won't re-fire.

Proposed fix
-  }, [isRunning, dagRunToDisplay?.startedAt, dagRunToDisplay?.finishedAt]);
+  }, [isRunning, dagRunToDisplay?.startedAt, dagRunToDisplay?.finishedAt, calculateDuration]);

88-130: handleRefresh is not memoized, causing the keydown listener to be re-attached every render.

handleRefresh is a plain function (recreated each render) and is listed in the useEffect dependency array at Line 130. This means the keydown event listener is removed and re-added on every render cycle. Wrap it in useCallback.

Proposed fix
-  const handleRefresh = () => {
+  const handleRefresh = useCallback(() => {
     setIsRefreshing(true);
     refreshFn();
     setTimeout(() => setIsRefreshing(false), 600);
-  };
+  }, [refreshFn]);
internal/service/frontend/agent_policy.go (1)

76-80: Confirm that silently allowing denied commands when SafeMode is off is the intended security posture.

When SafeMode is false and the deny behavior is ask_user (the default), all policy-denied bash commands are silently allowed without any prompt or audit log entry. This means that agent steps running without SafeMode effectively bypass ask_user deny rules entirely.

Consider whether an audit log entry should still be emitted here for observability, even when the command is allowed:

🔍 Suggested audit logging
 	if decision.DenyBehavior == agent.BashDenyBehaviorAskUser && !info.SafeMode {
+		details["approval_result"] = "safe_mode_off_bypass"
+		logPolicyEvent(ctx, auditSvc, info, auditActionToolPolicyOverride, details)
 		return nil
 	}
internal/core/exec/context.go (1)

30-31: any-typed fields lose compile-time safety — consider defining minimal interfaces in this package.

The any typing is a valid workaround for circular imports, and the comments explain it well. However, every consumer must perform a runtime type assertion, which can silently produce nil values if the wrong type is stored. If more fields follow this pattern, consider defining narrow interfaces (e.g., AgentConfigProvider, AgentModelProvider) in the exec package to restore compile-time guarantees without importing the agent package.

Not blocking — the current approach is pragmatic for the scope of this PR.

internal/cmd/context.go (1)

412-432: Extract shared agentStores helper to avoid duplication.

The agentStores() implementation here is nearly identical to remoteTaskHandler.agentStores() in internal/service/worker/remote_handler.go—both use the same fileagentconfig.New and fileagentmodel.New calls with identical error handling, and both hardcode the "agent/models" sub-path. Additionally, internal/service/frontend/server.go uses the same pattern inline. Extract a shared helper (in the persistence layer or a small internal package) to centralize the path constant "agent/models" and keep store-creation logic consistent across callers.

internal/runtime/builtin/agentstep/executor.go (2)

209-247: Deterministic tool ordering would improve reproducibility.

Map iteration order is nondeterministic; using a fixed ordering avoids prompt/test flakiness.

🔧 Suggested ordering
-	// Default: all globally-enabled tools.
-	var tools []*agent.AgentTool
-	for _, tool := range allTools {
-		tools = append(tools, tool)
-	}
-	return tools
+	// Default: all globally-enabled tools in deterministic order.
+	order := []string{"bash", "read", "patch", "think", "read_schema", "web_search", "output"}
+	var tools []*agent.AgentTool
+	for _, name := range order {
+		if tool, ok := allTools[name]; ok {
+			tools = append(tools, tool)
+		}
+	}
+	return tools

295-335: Consider using structured logger helpers for agent logs.

logf/logMessage emit raw strings; routing through internal/common logging would keep consistent structure and fields.

As per coding guidelines, "Prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common".

Comment thread internal/core/spec/step.go
Comment thread internal/runtime/builtin/agentstep/executor.go
Comment thread internal/runtime/builtin/agentstep/executor.go Outdated
Comment thread internal/runtime/builtin/agentstep/executor.go
@yottahmd yottahmd merged commit 4de003c into main Feb 15, 2026
6 checks passed
@yottahmd yottahmd deleted the agent-type-step branch February 15, 2026 12:38
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 15, 2026

Codecov Report

❌ Patch coverage is 41.46341% with 312 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.91%. Comparing base (c5c1faf) to head (3e1496b).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
internal/runtime/builtin/agentstep/executor.go 3.28% 206 Missing ⚠️
internal/core/spec/step.go 14.81% 43 Missing and 3 partials ⚠️
internal/agent/output.go 0.00% 31 Missing ⚠️
internal/cmd/context.go 31.25% 7 Missing and 4 partials ⚠️
internal/service/worker/remote_handler.go 60.00% 5 Missing and 3 partials ⚠️
internal/agent/policy.go 80.95% 1 Missing and 3 partials ⚠️
internal/agent/iface/model.go 96.22% 1 Missing and 1 partial ⚠️
internal/persis/fileagentmodel/store.go 90.00% 2 Missing ⚠️
internal/agent/loop.go 50.00% 1 Missing ⚠️
internal/agent/session.go 66.66% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1681      +/-   ##
==========================================
- Coverage   70.35%   69.91%   -0.45%     
==========================================
  Files         351      354       +3     
  Lines       39209    39566     +357     
==========================================
+ Hits        27584    27661      +77     
- Misses       9422     9708     +286     
+ Partials     2203     2197       -6     
Files with missing lines Coverage Δ
internal/agent/api.go 68.69% <100.00%> (ø)
internal/agent/approval.go 0.00% <ø> (-58.63%) ⬇️
internal/agent/bash.go 100.00% <ø> (+1.80%) ⬆️
internal/agent/hooks.go 89.47% <ø> (-10.53%) ⬇️
internal/agent/presets.go 100.00% <100.00%> (ø)
internal/agent/provider_cache.go 85.00% <100.00%> (ø)
internal/agent/system_prompt.go 84.61% <100.00%> (ø)
internal/agent/types.go 100.00% <ø> (ø)
internal/cmd/dry.go 80.24% <100.00%> (+1.02%) ⬆️
internal/cmd/restart.go 59.84% <100.00%> (+1.25%) ⬆️
... and 22 more

... and 13 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c5febca...3e1496b. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant