Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR refactors the agent approval flow, introduces agent-based step execution as a new executor type, propagates configuration stores through the runtime context, and extends the DAG schema to support agent step configuration. The changes remove legacy policy-check tracking, add SafeMode propagation to tool execution, and implement a comprehensive agent step executor with tool management and policy enforcement. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client/UI
participant DAGExecutor as DAG Executor
participant AgentStepExec as Agent Step<br/>Executor
participant Runtime as Runtime<br/>Context
participant Store as Agent Config/<br/>Model Store
participant Agent as Agent<br/>Loop
participant Tools as Tool<br/>Registry
participant LLM as LLM<br/>Provider
Client->>DAGExecutor: Execute DAG with agent step
DAGExecutor->>Runtime: Create execution context<br/>with stores
Runtime->>Store: Load agent config and<br/>model stores
Store-->>Runtime: Return stores
DAGExecutor->>AgentStepExec: Create agent step executor
AgentStepExec->>Runtime: Resolve config/model<br/>stores from context
Runtime-->>AgentStepExec: Provide stores
AgentStepExec->>Store: Load agent config and<br/>derive tool policy
Store-->>AgentStepExec: Agent config + policy
AgentStepExec->>Tools: Build tool set filtered<br/>by policy & step config
Tools-->>AgentStepExec: Filtered tools +<br/>output tool
AgentStepExec->>LLM: Create LLM provider<br/>from model selection
LLM-->>AgentStepExec: LLM provider ready
AgentStepExec->>Agent: Initialize agent loop<br/>with tools & system prompt
Agent->>Agent: Run agent loop with<br/>bash policy enforcement
Agent->>Tools: Invoke tools with<br/>SafeMode in ToolExecInfo
Tools-->>Agent: Tool results
Agent->>Agent: Check iteration limit<br/>& completion
Agent-->>AgentStepExec: Agent loop complete
AgentStepExec-->>DAGExecutor: Step result
DAGExecutor-->>Client: DAG execution result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@internal/core/spec/step.go`:
- Around line 1567-1618: If a step includes an Agent config but the executor
type doesn't support agents, return a validation error instead of silently
ignoring it, and validate MaxIterations is > 0 when provided; update
buildStepAgent to check core.SupportsAgent(result.ExecutorConfig.Type) and if
false and s.Agent != nil return an error referencing the step (e.g., using step
/ result identifiers), and when s.Agent.MaxIterations != nil ensure
*s.Agent.MaxIterations > 0 otherwise return a clear validation error; keep
existing defaults in core.AgentStepConfig and only assign cfg.MaxIterations
after the positive-check passes.
In `@internal/runtime/builtin/agentstep/executor.go`:
- Around line 250-275: buildSystemPrompt currently always passes an empty
agent.MemoryContent to agent.GenerateSystemPrompt, so stepCfg.Memory is ignored;
update buildSystemPrompt to check stepCfg != nil && stepCfg.Memory != nil &&
stepCfg.Memory.Enabled, load/construct the MemoryContent from stepCfg.Memory
(using the project’s existing memory-loading helper or API for step memory), and
pass that populated agent.MemoryContent into agent.GenerateSystemPrompt (falling
back to an empty agent.MemoryContent when memory is disabled or loading fails);
keep references to buildSystemPrompt, stepCfg.Memory, the Enabled flag, and
agent.GenerateSystemPrompt to locate the change.
- Around line 156-180: The OnWorking handler currently increments iteration and
always calls cancelLoop() on the first idle, which causes maxIterations to be
ignored; update the OnWorking function in the loop config (agent.NewLoop ->
OnWorking) so that cancelLoop() is only called when iteration >= maxIterations
(i.e., move the cancelLoop() call inside the threshold branch) and remove the
unconditional cancelLoop() that runs after a single idle transition; keep the
existing logf(stderr, "Max iterations reached (%d)", maxIterations) and ensure
iteration is still incremented as shown.
- Around line 88-144: The step-level bash policy in stepCfg.Tools.BashPolicy is
never used—buildPolicyHook(globalPolicy) only enforces the global policy—so
update the hook registration to incorporate the step policy: either merge
stepCfg.Tools.BashPolicy into globalPolicy (when stepCfg != nil and
stepCfg.Tools != nil) before calling buildPolicyHook, or call buildPolicyHook
with a merged/combined policy object that respects step-level overrides; change
the call at hooks.OnBeforeToolExec(buildPolicyHook(globalPolicy)) to pass the
merged policy (referencing stepCfg, stepCfg.Tools.BashPolicy, globalPolicy,
buildPolicyHook, and hooks.OnBeforeToolExec).
🧹 Nitpick comments (7)
ui/src/features/dags/components/dag-details/DAGHeader.tsx (2)
70-70: MissingcalculateDurationin useEffect dependency array.
calculateDurationis called inside this effect but not listed as a dependency. While the individual fields it closes over are partially covered, this is fragile — ifcalculateDuration's dependencies change in the future, this effect won't re-fire.Proposed fix
- }, [isRunning, dagRunToDisplay?.startedAt, dagRunToDisplay?.finishedAt]); + }, [isRunning, dagRunToDisplay?.startedAt, dagRunToDisplay?.finishedAt, calculateDuration]);
88-130:handleRefreshis not memoized, causing the keydown listener to be re-attached every render.
handleRefreshis a plain function (recreated each render) and is listed in theuseEffectdependency array at Line 130. This means thekeydownevent listener is removed and re-added on every render cycle. Wrap it inuseCallback.Proposed fix
- const handleRefresh = () => { + const handleRefresh = useCallback(() => { setIsRefreshing(true); refreshFn(); setTimeout(() => setIsRefreshing(false), 600); - }; + }, [refreshFn]);internal/service/frontend/agent_policy.go (1)
76-80: Confirm that silently allowing denied commands when SafeMode is off is the intended security posture.When
SafeModeisfalseand the deny behavior isask_user(the default), all policy-denied bash commands are silently allowed without any prompt or audit log entry. This means that agent steps running without SafeMode effectively bypassask_userdeny rules entirely.Consider whether an audit log entry should still be emitted here for observability, even when the command is allowed:
🔍 Suggested audit logging
if decision.DenyBehavior == agent.BashDenyBehaviorAskUser && !info.SafeMode { + details["approval_result"] = "safe_mode_off_bypass" + logPolicyEvent(ctx, auditSvc, info, auditActionToolPolicyOverride, details) return nil }internal/core/exec/context.go (1)
30-31:any-typed fields lose compile-time safety — consider defining minimal interfaces in this package.The
anytyping is a valid workaround for circular imports, and the comments explain it well. However, every consumer must perform a runtime type assertion, which can silently produce nil values if the wrong type is stored. If more fields follow this pattern, consider defining narrow interfaces (e.g.,AgentConfigProvider,AgentModelProvider) in theexecpackage to restore compile-time guarantees without importing the agent package.Not blocking — the current approach is pragmatic for the scope of this PR.
internal/cmd/context.go (1)
412-432: Extract sharedagentStoreshelper to avoid duplication.The
agentStores()implementation here is nearly identical toremoteTaskHandler.agentStores()ininternal/service/worker/remote_handler.go—both use the samefileagentconfig.Newandfileagentmodel.Newcalls with identical error handling, and both hardcode the"agent/models"sub-path. Additionally,internal/service/frontend/server.gouses the same pattern inline. Extract a shared helper (in the persistence layer or a small internal package) to centralize the path constant"agent/models"and keep store-creation logic consistent across callers.internal/runtime/builtin/agentstep/executor.go (2)
209-247: Deterministic tool ordering would improve reproducibility.Map iteration order is nondeterministic; using a fixed ordering avoids prompt/test flakiness.
🔧 Suggested ordering
- // Default: all globally-enabled tools. - var tools []*agent.AgentTool - for _, tool := range allTools { - tools = append(tools, tool) - } - return tools + // Default: all globally-enabled tools in deterministic order. + order := []string{"bash", "read", "patch", "think", "read_schema", "web_search", "output"} + var tools []*agent.AgentTool + for _, name := range order { + if tool, ok := allTools[name]; ok { + tools = append(tools, tool) + } + } + return tools
295-335: Consider using structured logger helpers for agent logs.
logf/logMessageemit raw strings; routing through internal/common logging would keep consistent structure and fields.As per coding guidelines, "Prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common".
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1681 +/- ##
==========================================
- Coverage 70.35% 69.91% -0.45%
==========================================
Files 351 354 +3
Lines 39209 39566 +357
==========================================
+ Hits 27584 27661 +77
- Misses 9422 9708 +286
+ Partials 2203 2197 -6
... and 13 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Summary by CodeRabbit
Release Notes
New Features
Improvements
Configuration