fix: v0.34.3 — tool-aware stall threshold, edit failure recovery, diagnostics#106

Merged
Nathan Schram (nathanschram) merged 6 commits into master from fix/v0.34.3
Mar 8, 2026

Conversation

@nathanschram
Member

Summary

  • Tool-aware stall threshold: three-tier system — normal (5 min), running tool (10 min), pending approval (30 min) — prevents false stall warnings during long-running operations #105
  • Progress message edit failure recovery: fall back to sending a new message when the "queued" → "starting" edit fails #103
  • Approval keyboard edit failure logging: wait=True for keyboard transitions, info/warning-level logging for diagnostics #104
  • /usage 429 handling: downgrade rate limit errors from error to warning level #89
  • Session cleanup structured reporting: registry cleanup logged at info level #93
  • Comprehensive doc updates: MCP-automated integration testing, engine list updates, test count sync, unexpected engine behaviour detection

Fixes #89, #103, #104, #105
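
The three-tier threshold described in the summary could be sketched roughly as follows. This is a minimal illustration, not Untether's actual code: `RunState`, `stall_threshold`, and `is_stalled` are hypothetical names, and giving "pending approval" precedence over "running tool" is an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the PR summary, expressed in seconds.
NORMAL_STALL_S = 5 * 60
TOOL_STALL_S = 10 * 60
APPROVAL_STALL_S = 30 * 60


@dataclass
class RunState:
    """Hypothetical snapshot of a run; not the project's real type."""
    has_running_tool: bool = False
    pending_approval: bool = False


def stall_threshold(state: RunState) -> int:
    """Return the stall threshold (seconds) for the current state.

    Assumption: a pending approval outranks a running tool, so the
    longest applicable threshold wins.
    """
    if state.pending_approval:
        return APPROVAL_STALL_S
    if state.has_running_tool:
        return TOOL_STALL_S
    return NORMAL_STALL_S


def is_stalled(state: RunState, idle_seconds: float) -> bool:
    """True once the run has been idle for at least the threshold."""
    return idle_seconds >= stall_threshold(state)
```

With this shape, a run that has been idle for seven minutes would be flagged in the normal tier but not while a tool is running.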

Test plan

  • uv run pytest — 1547 passed, 80.90% coverage
  • uv run ruff check src/ tests/ — all checks passed
  • uv run ruff format --check src/ tests/ — 234 files formatted
  • uv lock --check — lockfile in sync
  • CI: format, ruff, ty, pytest (3.12/3.13/3.14), build, lockfile, install-test, pip-audit, bandit, docs
  • Integration testing against @untether_dev_bot (post-merge)

🤖 Generated with Claude Code

…gnostics (#103, #104, #105, #89)

- Tool-aware stall threshold: 10 min when a tool is actively running (#105)
- Progress edit failure fallback: log + send when initial edit fails (#103)
- Approval keyboard: wait=True for keyboard transitions, failure logging (#104)
- /usage 429: downgrade from error to warning level (#89)
- Session cleanup structured reporting, spawn args logging, no-events warning
- 300+ lines of new tests, integration testing playbook

Co-Authored-By: Claude Opus 4.6 <[email protected]>
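
The progress edit fallback (#103) can be illustrated with a small sketch: try the edit, and if it fails, log the failure and send a new message instead. The `edit` and `send` callables and the `update_progress` name are hypothetical stand-ins for whatever transport layer the project uses, and treating any exception as an edit failure is an assumption.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("progress")

# Hypothetical transport signature: takes the new text, raises on failure.
Transport = Callable[[str], Awaitable[None]]


async def update_progress(edit: Transport, send: Transport, text: str) -> str:
    """Edit the progress message, falling back to a new message on failure.

    Returns "edited" or "sent" so callers can tell which path was taken.
    """
    try:
        await edit(text)
        return "edited"
    except Exception as exc:  # assumption: any error means the edit failed
        logger.warning("progress edit failed, sending new message: %s", exc)
    await send(text)
    return "sent"


if __name__ == "__main__":
    async def failing_edit(text: str) -> None:
        raise RuntimeError("message to edit not found")

    async def fake_send(text: str) -> None:
        print(f"sent: {text}")

    print(asyncio.run(update_progress(failing_edit, fake_send, "starting")))
```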
…, and guides

Integration tests are now automated via Telegram MCP tools (send_message,
get_history, list_inline_buttons, press_inline_button, reply_to_message).
Updated all relevant docs to reflect this workflow.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- T1 (voice): send_voice MCP tool with OGG/Opus file
- T5 (media groups): send_file MCP tool for rapid file sends
- B4 (SIGTERM): Bash tool kill -TERM
- B5 (log inspection): Bash tool journalctl + FD/zombie checks
- Add post-test log inspection and GitHub issue creation instructions
- Add structured test result tracking (pass/fail/error with reason)
- Distinguish Untether bugs from upstream engine API errors

Co-Authored-By: Claude Opus 4.6 <[email protected]>
All tiers are fully automatable — no manual steps remain.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Watch for phantom responses (output from empty input), session
cross-contamination, wrong engine, disproportionate cost. Discovered
via AMP producing substantive DNS content from empty voice transcription.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- CLAUDE.md: update test counts (1472→1548), add test_build_args.py
  and test_loop_coverage.py, add send_voice/send_file to MCP list
- AGENTS.md: Gemini CLI and Amp are shipped, not "coming soon"
- CONTRIBUTING.md: add gemini/amp to architecture diagram and prereqs
- hooks.json: add MCP automation note to version bump checklist
- testing-conventions.md: add new test files to key test table
- integration-testing.md: add T5 send_file limitation note

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@nathanschram Nathan Schram (nathanschram) merged commit feeaac6 into master Mar 8, 2026
17 checks passed

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f9fe13259


resume=resume.value if resume else None,
prompt=prompt[:100] + "…" if len(prompt) > 100 else prompt,
prompt_len=len(prompt),
args=cmd[1:],

P1: Remove full CLI args from runner start logs

The new runner.start log field args=cmd[1:] records the complete subprocess argument list. For engines that pass prompts as CLI arguments (for example, Claude's -p <prompt> flow), that list includes the raw user prompt, so normal info-level logs now persist potentially sensitive chat and code content that was previously logged only in truncated form. This is a production privacy regression: every run can leak full prompt text into log sinks.
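
One possible mitigation, sketched under assumptions: truncate any argument that equals the prompt before logging, reusing the same 100-character rule as the existing prompt field. `truncate` and `redact_args` are hypothetical helpers, not functions from the codebase, and the sketch assumes the prompt appears as a standalone argument value.

```python
def truncate(text: str, limit: int = 100) -> str:
    """Shorten text the same way the existing `prompt=` log field does."""
    return text[:limit] + "…" if len(text) > limit else text


def redact_args(args: list[str], prompt: str, limit: int = 100) -> list[str]:
    """Return a copy of args with the raw prompt replaced by its truncated form.

    Assumption: the prompt is passed as its own argv entry (for example
    the value following a flag), so an equality check is enough to find it.
    """
    return [truncate(arg, limit) if arg == prompt else arg for arg in args]


if __name__ == "__main__":
    prompt = "x" * 250
    cmd = ["engine", "-p", prompt]  # illustrative command line only
    print(redact_args(cmd[1:], prompt))
```

The log call would then pass `redact_args(cmd[1:], prompt)` in place of `cmd[1:]`. Dropping the field entirely, as the review title suggests, is the simpler alternative.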


Comment on lines +723 to +725
    if not action_state.completed:
        return True
    break  # only check the most recent

P2: Check all running actions before applying tool stall threshold

_has_running_tool() is documented as checking whether any action is still running, but the unconditional break means it inspects only the most recently inserted action. When actions overlap (e.g. Claude can emit multiple tool starts before all tool results arrive), a newer action can complete while an older tool is still running. The function then returns False, so the stall logic drops back to the shorter normal threshold and emits incorrect stuck warnings (and eventually auto-cancels) during legitimate long-running tool work.
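
A sketch of the suggested behaviour, assuming each tracked action exposes a `completed` flag as in the snippet under review. `ActionState` here is a hypothetical stand-in for the project's type, and `has_running_action` is an illustrative name.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ActionState:
    """Hypothetical stand-in; only the `completed` flag comes from the diff."""
    name: str
    completed: bool


def has_running_action(actions: Iterable[ActionState]) -> bool:
    """True if any tracked action is still running, not just the newest one."""
    return any(not action.completed for action in actions)


if __name__ == "__main__":
    overlapping = [
        ActionState("older tool", completed=False),  # still running
        ActionState("newer tool", completed=True),   # finished first
    ]
    print(has_running_action(overlapping))
```

With the early `break`, the overlapping case above would report no running tool because only the newest entry is inspected; scanning every action keeps the longer threshold in effect until all of them have completed.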


@nathanschram Nathan Schram (nathanschram) deleted the fix/v0.34.3 branch March 10, 2026 05:38

Development

Successfully merging this pull request may close these issues.

/usage 429 rate limit logged at error level with full stack trace
