refactor: Avoid synthetic prompt for approval continuations #2598

Merged
daryllimyt merged 17 commits into main from feat/cleanup-approval-continuation
May 3, 2026

Conversation

@daryllimyt
Contributor

@daryllimyt daryllimyt commented May 1, 2026

Summary

  • replace visible synthetic approval continuation prompts with a hidden Claude Code isMeta continuation tick
  • preseed/reconcile Claude-format user/tool_result history rows so the model sees the real approved or denied tool result
  • preserve legitimate thinking from the resumed assistant turn while hiding only SDK continuation metadata and synthetic artifacts
  • replace legacy approval interrupt placeholders before duplicate detection so existing pending approvals can reconcile correctly
  • update workflow, runtime, session activity, socket, adapter, unit, Temporal, and integration coverage for approval continuation ordering
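
The filtering described in the third bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the row shape (plain dicts with `type`, `isMeta`, and `message` keys), the location of the `<synthetic>` marker in `message["model"]`, and the helper names `is_control_plane_row` and `visible_rows` are all assumptions made for the example.

```python
from typing import Any

# Assumed marker for SDK-inserted placeholder rows (taken from the PR description).
SYNTHETIC_MODEL = "<synthetic>"


def is_control_plane_row(row: dict[str, Any]) -> bool:
    """Return True for rows that should not appear in normal chat history."""
    # Hidden continuation tick: a user row flagged as metadata.
    if row.get("type") == "user" and row.get("isMeta") is True:
        return True
    # Synthetic assistant placeholder inserted by the SDK/CLI.
    message = row.get("message") or {}
    if row.get("type") == "assistant" and message.get("model") == SYNTHETIC_MODEL:
        return True
    return False


def visible_rows(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop control-plane rows; keep everything else, including thinking."""
    return [row for row in rows if not is_control_plane_row(row)]


history = [
    {"type": "user", "message": {"content": "Rotate the API key"}},
    {"type": "user", "isMeta": True, "message": {"content": "Continue."}},
    {
        "type": "assistant",
        "message": {"model": SYNTHETIC_MODEL, "content": "No response requested."},
    },
    {
        "type": "assistant",
        "message": {
            "model": "some-model",
            "content": [{"type": "thinking", "thinking": "..."}],
        },
    },
]
shown = visible_rows(history)
```

The filter only inspects control-plane markers, so a real assistant row that carries thinking passes through untouched.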

Existing Pending Approvals

This should be safe for workflows that are currently waiting on approval. Those workflows are paused before post-approval execution at await self.approvals.wait(). After the approval signal arrives, the new code runs the post-approval path: execute approved/denied tools, reconcile the session history by inserting the real user/tool_result row, reload the session history, then resume Claude Code with the hidden meta continuation tick.
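
The ordering of the post-approval path can be illustrated with a small asyncio sketch. It is not Temporal code and not the workflow from this PR: `ApprovalGate` and `run_post_approval` are hypothetical stand-ins, and each step is reduced to a label so that only the sequencing is shown.

```python
import asyncio
from typing import Optional


class ApprovalGate:
    """Hypothetical stand-in for the workflow's approval waiter."""

    def __init__(self) -> None:
        self._event = asyncio.Event()
        self.decision: Optional[str] = None

    def signal(self, decision: str) -> None:
        # Record the decision, then release anyone parked in wait().
        self.decision = decision
        self._event.set()

    async def wait(self) -> str:
        await self._event.wait()
        assert self.decision is not None
        return self.decision


async def run_post_approval(gate: ApprovalGate, steps: list[str]) -> None:
    # The workflow is parked here until the approval signal arrives.
    decision = await gate.wait()
    steps.append(f"execute tools ({decision})")
    steps.append("reconcile history")
    steps.append("reload history")
    steps.append("resume with hidden meta tick")


async def main() -> list[str]:
    gate = ApprovalGate()
    steps: list[str] = []
    task = asyncio.create_task(run_post_approval(gate, steps))
    await asyncio.sleep(0)  # let the task reach the wait point
    assert steps == []  # nothing runs before the approval signal
    gate.signal("approved")
    await task
    return steps


steps = asyncio.run(main())
```

Because every post-approval step sits after the wait, a workflow that was already paused when the new code was deployed would run the whole new sequence once it is signalled.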

Existing pending approvals should already have the assistant tool_use row and Claude Code interrupt artifacts in session history. The reconciliation step is designed for that state: find the matching assistant tool_use, remove interrupt artifacts including legacy "interrupted" placeholders, and insert the real tool result immediately after the tool use.
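
A rough sketch of that reconciliation step is shown below. The row schema, the interrupt markers, and the function name `reconcile` are assumptions for illustration only; the real implementation lives in the session service and is not reproduced here. Returning the history unchanged when no matching tool_use exists is likewise an assumption about the "logs and returns" behavior.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)

Row = dict[str, Any]


def _blocks(row: Row) -> list[Row]:
    """Return the dict content blocks of a row, or [] for plain-text rows."""
    content = row.get("content")
    if not isinstance(content, list):
        return []
    return [block for block in content if isinstance(block, dict)]


def _has_block(row: Row, role: str, block_type: str, key: str, value: str) -> bool:
    if row.get("type") != role:
        return False
    return any(b.get("type") == block_type and b.get(key) == value for b in _blocks(row))


def _is_interrupt_artifact(row: Row) -> bool:
    # Assumed markers: an explicit flag, or the legacy "interrupted" placeholder.
    if row.get("type") != "user":
        return False
    return row.get("interrupt") is True or row.get("content") == "interrupted"


def reconcile(rows: list[Row], tool_use_id: str, result: str) -> list[Row]:
    """Place the real tool_result directly after its assistant tool_use."""
    if not any(_has_block(r, "assistant", "tool_use", "id", tool_use_id) for r in rows):
        logger.warning("no assistant tool_use %s; skipping", tool_use_id)
        return rows
    # Strip interrupt artifacts before checking for an existing result.
    cleaned = [row for row in rows if not _is_interrupt_artifact(row)]
    index = next(
        i
        for i, row in enumerate(cleaned)
        if _has_block(row, "assistant", "tool_use", "id", tool_use_id)
    )
    following = cleaned[index + 1] if index + 1 < len(cleaned) else {}
    if _has_block(following, "user", "tool_result", "tool_use_id", tool_use_id):
        return cleaned  # already reconciled; do not insert a duplicate
    tool_result_row: Row = {
        "type": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": tool_use_id, "content": result}
        ],
    }
    return cleaned[: index + 1] + [tool_result_row] + cleaned[index + 1 :]


history = [
    {"type": "user", "content": "Isolate host web-01"},
    {
        "type": "assistant",
        "content": [{"type": "tool_use", "id": "toolu_1", "name": "isolate_host"}],
    },
    {"type": "user", "content": "interrupted"},
]
reconciled = reconcile(history, "toolu_1", "host isolated")
```

Removing the placeholders before the duplicate check matters: otherwise a legacy "interrupted" row sitting after the tool_use would hide the fact that no real result has been written yet.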

Caveats:

  • Workflows that already received approval and are mid-continuation under the old behavior may still be in a bad/stuck state and are not guaranteed to self-heal.
  • If session history is malformed or missing the matching assistant tool_use, reconciliation logs and returns without inserting a tool_result.
  • The workflow worker and agent executor should both be running this version; mixed old/new workers are risky for this path.

Attempted Alternative and Limitation

We also tried a cleaner continuation shape: keep the session service as the canonical owner of the approved user/tool_result row, seed Claude Code with history only up to and including the original assistant tool_use, then send that final tool_result as the first resumed SDK stdin message instead of using a hidden "Continue." tick.

That does not work correctly with the current Claude Code SDK/CLI behavior. In live cluster testing, the DB showed the original approved tool result persisted correctly, but Claude Code inserted a synthetic assistant row (<synthetic> / No response requested.) before the replayed stdin tool_result. That broke the clean assistant tool_use -> user tool_result adjacency and the model requested the same tool again, creating a second pending approval for the same user request.
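
The adjacency problem can be made concrete with a small checker. This is only a sketch under assumed row shapes (dicts with `type` and `content`); `unanswered_tool_uses` is a hypothetical helper, not something from this PR or from the Claude Code SDK.

```python
from typing import Any

Row = dict[str, Any]


def _block_ids(row: Row, block_type: str, key: str) -> set[str]:
    """Collect ids from content blocks of the given type; plain-text rows yield none."""
    content = row.get("content")
    if not isinstance(content, list):
        return set()
    return {
        block[key]
        for block in content
        if isinstance(block, dict) and block.get("type") == block_type and key in block
    }


def unanswered_tool_uses(rows: list[Row]) -> list[str]:
    """Return tool_use ids whose tool_result is not in the immediately following user row."""
    missing: list[str] = []
    for i, row in enumerate(rows):
        if row.get("type") != "assistant":
            continue
        use_ids = _block_ids(row, "tool_use", "id")
        following = rows[i + 1] if i + 1 < len(rows) else {}
        result_ids = (
            _block_ids(following, "tool_result", "tool_use_id")
            if following.get("type") == "user"
            else set()
        )
        missing.extend(sorted(use_ids - result_ids))
    return missing


tool_use = {"type": "assistant", "content": [{"type": "tool_use", "id": "toolu_1"}]}
tool_result = {
    "type": "user",
    "content": [{"type": "tool_result", "tool_use_id": "toolu_1", "content": "ok"}],
}
synthetic = {"type": "assistant", "content": "No response requested."}

clean = unanswered_tool_uses([tool_use, tool_result])
broken = unanswered_tool_uses([tool_use, synthetic, tool_result])
```

In the second history the result exists but is separated from its tool_use by the synthetic row, so the call looks unanswered, which matches the observed behavior of the model requesting the same tool again.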

TODO: revisit this if Claude Code exposes a supported “resume and consume pending tool_result” API, or if the SDK/CLI allows passing a tool result without inserting the synthetic assistant placeholder. Until then, this PR keeps the more reliable design: preseed the reconciled tool_result into session history and use a hidden metadata continuation tick, while filtering those control-plane rows from normal chat history.

Tests

  • uv run pytest tests/unit/test_agent_runtime.py tests/unit/test_agent_session_activities.py tests/unit/test_agent_session_approval_sink.py::test_replace_interrupt_with_tool_results_replaces_legacy_interrupted_row tests/temporal/test_durable_agent_workflow.py::test_agent_workflow_routes_approved_tools_to_executor_and_reconciles_history tests/temporal/test_durable_agent_workflow.py::test_agent_workflow_does_not_retry_approved_tool_failures -q
  • uv run ruff check packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py tests/integration/test_agent_worker.py tests/temporal/test_durable_agent_workflow.py tests/unit/test_agent_runtime.py tests/unit/test_agent_session_activities.py tests/unit/test_agent_session_approval_sink.py tests/unit/test_agent_session_messages.py tests/unit/test_agent_socket_io.py tests/unit/test_vercel_adapter.py tracecat/agent/adapter/vercel.py tracecat/agent/common/protocol.py tracecat/agent/executor/activity.py tracecat/agent/executor/schemas.py tracecat/agent/runtime/claude_code/runtime.py tracecat/agent/session/activities.py tracecat/agent/session/service.py
  • uv run ruff format --check packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py tests/integration/test_agent_worker.py tests/temporal/test_durable_agent_workflow.py tests/unit/test_agent_runtime.py tests/unit/test_agent_session_activities.py tests/unit/test_agent_session_approval_sink.py tests/unit/test_agent_session_messages.py tests/unit/test_agent_socket_io.py tests/unit/test_vercel_adapter.py tracecat/agent/adapter/vercel.py tracecat/agent/common/protocol.py tracecat/agent/executor/activity.py tracecat/agent/executor/schemas.py tracecat/agent/runtime/claude_code/runtime.py tracecat/agent/session/activities.py tracecat/agent/session/service.py
  • uv run basedpyright --warnings packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py tests/integration/test_agent_worker.py tests/temporal/test_durable_agent_workflow.py tests/unit/test_agent_runtime.py tests/unit/test_agent_session_activities.py tests/unit/test_agent_session_approval_sink.py tests/unit/test_agent_session_messages.py tests/unit/test_agent_socket_io.py tests/unit/test_vercel_adapter.py tracecat/agent/adapter/vercel.py tracecat/agent/common/protocol.py tracecat/agent/executor/activity.py tracecat/agent/executor/schemas.py tracecat/agent/runtime/claude_code/runtime.py tracecat/agent/session/activities.py tracecat/agent/session/service.py

@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 1, 2026 15:50 — with GitHub Actions Inactive

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be1d4381c1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/session/service.py Outdated
Comment thread tracecat/agent/runtime/claude_code/runtime.py Outdated

@zeropath-ai

zeropath-ai Bot commented May 1, 2026

No security or compliance issues detected. Reviewed everything up to 70c4cfe.

Security Overview
Detected Code Changes
Change Type: Enhancement
► tracecat_ee/tracecat_ee/agent/workflows/durable.py
    Reconcile SDK transcript with tool results
    Update executor input for resume with reconciled tool results
► tests/integration/test_agent_worker.py
    Test approval continuation resumes with seeded tool result
► tests/temporal/test_durable_agent_workflow.py
    Test approval continuation resumes with seeded tool result
    Test approval continuation forwards live thinking events
    Test approval continuation hides SDK meta prompt only
► tests/unit/test_agent_runtime.py
    Approval continuation sends meta prompt in each sandbox mode
► tests/unit/test_agent_session_activities.py
    Reconcile tool results removes interrupts and returns tool results
► tests/unit/test_agent_session_approval_sink.py
    Replace interrupt with tool results replaces legacy interrupted row
    Replace interrupt with tool results does not duplicate existing result
    Replace interrupt with tool results preserves interrupt without assistant uuid
    Replace interrupt with tool results requires same assistant turn

Change Type: Refactor
► tests/unit/test_agent_runtime.py
    Update ClaudeSDKClient mock to include connect and disconnect
    Rename test_approval_continuation_uses_hidden_prompt_in_each_sandbox_mode to test_approval_continuation_sends_meta_prompt_in_each_sandbox_mode

Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 14 files

Confidence score: 5/5

  • Low-risk change overall: the only reported issue is low severity (3/10) and appears confined to test coverage rather than runtime behavior.
  • The key concern is in tests/temporal/test_durable_agent_workflow.py, where dropping the failure payload assertion weakens retry validation and could let an error-result write regression slip through.
  • Pay close attention to tests/temporal/test_durable_agent_workflow.py - restore the failure payload assertion so the retry path still verifies the error result is persisted.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tests/temporal/test_durable_agent_workflow.py">

<violation number="1" location="tests/temporal/test_durable_agent_workflow.py:816">
P3: Restore the failure payload assertion here; otherwise the retry test no longer verifies the error result was written.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread tests/temporal/test_durable_agent_workflow.py Outdated
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 1, 2026 16:23 — with GitHub Actions Inactive
@daryllimyt daryllimyt force-pushed the feat/cleanup-approval-continuation branch from 380c6dd to d255055 Compare May 2, 2026 21:51
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 2, 2026 21:51 — with GitHub Actions Inactive
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 2, 2026 21:52 — with GitHub Actions Inactive

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d255055467

Comment thread tracecat/agent/session/service.py Outdated
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 2, 2026 22:47 — with GitHub Actions Inactive
@daryllimyt daryllimyt changed the title Avoid synthetic prompt for approval continuations refactor: Avoid synthetic prompt for approval continuations May 2, 2026
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 2, 2026 23:41 — with GitHub Actions Inactive

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e1994872c0

Comment thread tracecat/agent/session/service.py
Comment thread tracecat/agent/executor/activity.py Outdated
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 2, 2026 23:49 — with GitHub Actions Inactive
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 3, 2026 00:29 — with GitHub Actions Inactive
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 3, 2026 00:33 — with GitHub Actions Inactive
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 3, 2026 00:45 — with GitHub Actions Inactive
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 3, 2026 01:58 — with GitHub Actions Inactive

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 44b691d305

Comment thread tracecat/agent/session/service.py Outdated
@daryllimyt daryllimyt temporarily deployed to internal-registry-ci May 3, 2026 02:18 — with GitHub Actions Inactive
Contributor Author

daryllimyt commented May 3, 2026

@blacksmith-sh
Contributor

blacksmith-sh Bot commented May 3, 2026

Found 7 test failures on Blacksmith runners:

Failures

  • test_workflows/test_child_workflow_parallel_loop_with_dispatch_cap
  • test_workflows/test_scatter_with_child_workflow
  • test_workflows/test_workflow_join_run_if_skip_ok
  • test_workflows/test_workflow_scatter_gather[basic-for-loop]
  • test_workflows/test_workflow_scatter_gather[nested-for-loop]
  • test_workflows/test_workflow_scatter_gather[scatter-gather-with-surrounding-actions]
  • test_workflows/test_workflow_table_actions_in_loop

@daryllimyt daryllimyt merged commit 879e829 into main May 3, 2026
20 of 22 checks passed
@daryllimyt daryllimyt deleted the feat/cleanup-approval-continuation branch May 3, 2026 02:35