feat(tools): add structured tool error taxonomy with retry engine#2214
Merged
, #2199)

Implement 12-category ToolErrorCategory enum in zeph-tools based on arXiv:2601.16280, with category-specific retry/fallback strategies and guaranteed tool_result delivery for every tool_call_id regardless of failure mode (arXiv:2509.25370).

Key changes:
- Add ToolErrorCategory enum (12 variants: ToolNotFound, SchemaInvalid, InvalidParameters, TypeMismatch, PolicyBlocked, ConfirmationRequired, PermanentFailure, Cancelled, RateLimited, ServerError, NetworkError, Timeout)
- Add ToolErrorFeedback struct with structured category, message, and suggestion
- Add ToolError::Http { status, message } variant for fine-grained HTTP classification
- Add [tools.retry] config section: max_attempts, base_ms, max_ms, budget_secs, parameter_reformat_provider (follows *_provider multi-model convention)
- Wire RetryConfig to runtime: session_config reads config.tools.retry.*, retry_backoff_ms() parameterized with base_ms/max_ms from config
- Gate attempt_self_reflection on is_quality_failure(): infrastructure errors (network, server, rate-limit) no longer trigger self-reflection
- Config migration: migrate old [agent].max_tool_retries to [tools.retry]
- --init wizard: new steps for retry_max_attempts and parameter_reformat_provider
- TUI spinner: emit "Retrying tool: <name> (attempt N/M)..." status during retries

Fixes:
- B2 (io::NotFound → PermanentFailure, not ToolNotFound)
- B4 (self-reflection gate with regression test)
- HTTP 400/422 → InvalidParameters classification
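For readers skimming the thread, here is a minimal sketch of how the taxonomy, the self-reflection gate, and the configurable backoff could fit together. The twelve variant names come from the commit message; everything else is an assumption for illustration: which categories count as retryable, which count as quality failures, placing both predicates on the enum, and the capped exponential formula in the hypothetical `sketch_backoff_ms` helper. The actual `retry_backoff_ms()` in the PR may differ, for example by adding jitter.

```rust
/// The twelve categories named in the commit message.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolErrorCategory {
    ToolNotFound,
    SchemaInvalid,
    InvalidParameters,
    TypeMismatch,
    PolicyBlocked,
    ConfirmationRequired,
    PermanentFailure,
    Cancelled,
    RateLimited,
    ServerError,
    NetworkError,
    Timeout,
}

impl ToolErrorCategory {
    /// ASSUMPTION: only transient infrastructure failures are retried unchanged.
    pub fn is_retryable(self) -> bool {
        matches!(
            self,
            Self::RateLimited | Self::ServerError | Self::NetworkError | Self::Timeout
        )
    }

    /// ASSUMPTION: a "quality failure" is one caused by the model's own call
    /// (wrong tool, wrong shape, wrong arguments), so self-reflection can help.
    pub fn is_quality_failure(self) -> bool {
        matches!(
            self,
            Self::ToolNotFound
                | Self::SchemaInvalid
                | Self::InvalidParameters
                | Self::TypeMismatch
        )
    }
}

/// Hypothetical helper: capped exponential backoff, base_ms * 2^attempt,
/// never exceeding max_ms. Saturating arithmetic avoids overflow panics.
pub fn sketch_backoff_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    base_ms.saturating_mul(factor).min(max_ms)
}
```

Under these assumptions a `NetworkError` is retried with backoff but never routed to self-reflection, while an `InvalidParameters` failure is the reverse.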
This was referenced Mar 27, 2026
bug-ops added a commit that referenced this pull request Mar 27, 2026

, #2206)

Complements #2214 (ToolErrorCategory + ToolErrorFeedback) by closing two gaps that remained after that PR:

**Shell executor exit-code classification**

ShellExecutor now returns `ToolError::Shell { exit_code, category, message }` for well-known failure modes instead of embedding errors in the output string:
- exit 126 (permission denied) → ToolErrorCategory::PolicyBlocked
- exit 127 (command not found) → ToolErrorCategory::PermanentFailure
- stderr containing "permission denied" → PolicyBlocked
- stderr containing "no such file or directory" → PermanentFailure

These errors flow through the existing `ToolErrorFeedback.format_for_llm()` injection in native.rs, producing structured `[tool_error]` blocks for the LLM.

**Skill evolution FailureKind integration**

`From<ToolErrorCategory> for FailureKind` maps the taxonomy to skill evolution signals:
- PolicyBlocked/ConfirmationRequired/ToolNotFound → WrongApproach
- InvalidParameters/TypeMismatch → SyntaxError
- Timeout → Timeout
- infrastructure errors (RateLimited, ServerError, NetworkError, PermanentFailure, Cancelled) → Unknown

This eliminates the string heuristic for classified tool failures.

Closes #2207, closes #2206.
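The two mappings in this commit message can be sketched roughly as follows. The exit-code, stderr, and `FailureKind` mappings are taken from the message; the function name `classify_shell_failure`, its signature, the case-insensitive stderr match, the precedence of exit code over stderr, and sending `SchemaInvalid` (which the message does not mention) to `Unknown` are all assumptions made for this illustration.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolErrorCategory {
    ToolNotFound,
    SchemaInvalid,
    InvalidParameters,
    TypeMismatch,
    PolicyBlocked,
    ConfirmationRequired,
    PermanentFailure,
    Cancelled,
    RateLimited,
    ServerError,
    NetworkError,
    Timeout,
}

/// Skill evolution signal; variant names as listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureKind {
    WrongApproach,
    SyntaxError,
    Timeout,
    Unknown,
}

impl From<ToolErrorCategory> for FailureKind {
    fn from(category: ToolErrorCategory) -> Self {
        use ToolErrorCategory as C;
        match category {
            C::PolicyBlocked | C::ConfirmationRequired | C::ToolNotFound => Self::WrongApproach,
            C::InvalidParameters | C::TypeMismatch => Self::SyntaxError,
            C::Timeout => Self::Timeout,
            C::RateLimited
            | C::ServerError
            | C::NetworkError
            | C::PermanentFailure
            | C::Cancelled => Self::Unknown,
            // ASSUMPTION: not covered by the commit message.
            C::SchemaInvalid => Self::Unknown,
        }
    }
}

/// Hypothetical classifier: returns None when the failure is not one of the
/// well-known modes, so the caller can fall back to its previous behaviour.
pub fn classify_shell_failure(exit_code: i32, stderr: &str) -> Option<ToolErrorCategory> {
    // ASSUMPTION: the exit code takes precedence over stderr content.
    match exit_code {
        126 => return Some(ToolErrorCategory::PolicyBlocked),
        127 => return Some(ToolErrorCategory::PermanentFailure),
        _ => {}
    }
    // ASSUMPTION: stderr is matched case-insensitively.
    let stderr = stderr.to_lowercase();
    if stderr.contains("permission denied") {
        Some(ToolErrorCategory::PolicyBlocked)
    } else if stderr.contains("no such file or directory") {
        Some(ToolErrorCategory::PermanentFailure)
    } else {
        None
    }
}
```

Writing the `From` impl as an exhaustive `match` without a wildcard arm means the compiler flags the conversion whenever a new category is added to the enum.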
bug-ops added a commit that referenced this pull request Mar 27, 2026

, #2206) (#2226)

Complements #2214 (ToolErrorCategory + ToolErrorFeedback) by closing two gaps that remained after that PR:

**Shell executor exit-code classification**

ShellExecutor now returns `ToolError::Shell { exit_code, category, message }` for well-known failure modes instead of embedding errors in the output string:
- exit 126 (permission denied) → ToolErrorCategory::PolicyBlocked
- exit 127 (command not found) → ToolErrorCategory::PermanentFailure
- stderr containing "permission denied" → PolicyBlocked
- stderr containing "no such file or directory" → PermanentFailure

These errors flow through the existing `ToolErrorFeedback.format_for_llm()` injection in native.rs, producing structured `[tool_error]` blocks for the LLM.

**Skill evolution FailureKind integration**

`From<ToolErrorCategory> for FailureKind` maps the taxonomy to skill evolution signals:
- PolicyBlocked/ConfirmationRequired/ToolNotFound → WrongApproach
- InvalidParameters/TypeMismatch → SyntaxError
- Timeout → Timeout
- infrastructure errors (RateLimited, ServerError, NetworkError, PermanentFailure, Cancelled) → Unknown

This eliminates the string heuristic for classified tool failures.

Closes #2207, closes #2206.
Summary
- `ToolErrorCategory` enum in `zeph-tools` (arXiv:2601.16280) with category-specific retry/fallback strategies
- Every `tool_call_id` always receives a corresponding `tool_result`, even on failure (arXiv:2509.25370)
- Gate `attempt_self_reflection` on quality failures only: infrastructure errors (network, server, rate-limit) no longer trigger it
- `[tools.retry]` config section wired to runtime, with config migration from old `[agent].max_tool_retries`

Closes #2203, #2199
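As an illustration of the delivery guarantee in the summary, the sketch below pairs every requested `tool_call_id` with a result and synthesizes an error result for any call that produced none. This is not the PR's code: the `ToolResult` struct, the `ensure_result_for_every_call` function, and the wording of the placeholder `[tool_error]` text are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;

/// Hypothetical result record; the real type in the codebase may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolResult {
    pub tool_call_id: String,
    pub content: String,
    pub is_error: bool,
}

/// Returns exactly one result per call id, in request order. Calls that
/// produced nothing (panic, cancellation, dropped future, ...) get a
/// synthetic error result instead of being silently omitted.
pub fn ensure_result_for_every_call(
    call_ids: &[&str],
    mut produced: HashMap<String, ToolResult>,
) -> Vec<ToolResult> {
    call_ids
        .iter()
        .map(|id| {
            produced.remove(*id).unwrap_or_else(|| ToolResult {
                tool_call_id: (*id).to_string(),
                // ASSUMPTION: placeholder text, not the PR's actual format.
                content: "[tool_error] no result was produced for this call".to_string(),
                is_error: true,
            })
        })
        .collect()
}
```

Building the output from the list of requested ids, instead of from the results that happen to exist, is what makes the one-result-per-call property hold by construction.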
Changes
- `crates/zeph-tools/src/error_taxonomy.rs` (new): `ToolErrorCategory`, `ToolErrorFeedback`, `classify_http_status`, `classify_io_error`
- `crates/zeph-tools/src/executor.rs`: `ToolError::Http` variant, retry engine integration
- `crates/zeph-tools/src/config.rs`: `RetryConfig` struct with `[tools.retry]` section
- `crates/zeph-core/src/agent/session_config.rs`: reads `config.tools.retry.*` (was `config.agent.max_tool_retries`)
- `crates/zeph-core/src/agent/tool_orchestrator.rs`: `retry_base_ms`/`retry_max_ms` fields, `with_retry_backoff()` builder
- `crates/zeph-core/src/agent/tool_execution/native.rs`: structured `[tool_error]` feedback, `is_quality_failure()` gate
- `crates/zeph-config/src/migrate.rs`: `migrate_agent_retry_to_tools_retry()` migration step
- `src/init.rs`: `--init` wizard steps for new retry config fields
- `config/default.toml`: `[tools.retry]` defaults documented

Test plan
- `cargo nextest run --workspace --lib --bins`
- `cargo clippy --workspace -- -D warnings` clean
- `cargo +nightly fmt --check` clean
- HTTP 400/422 → `InvalidParameters` regression test present
- `io::ErrorKind::NotFound` → `PermanentFailure` (not `ToolNotFound`) regression test present
- `attempt_self_reflection` regression test present
- Config migration: `[agent].max_tool_retries` → `[tools.retry].max_attempts`
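The two classifier behaviours pinned by the regression tests can be sketched as below. Only two mappings are confirmed by this PR: HTTP 400/422 → `InvalidParameters`, and `io::ErrorKind::NotFound` → `PermanentFailure`. Every other arm is a guess added to make the example complete, and the function names carry a `_sketch` suffix because the real signatures of `classify_http_status` and `classify_io_error` are not shown in this thread.

```rust
use std::io;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolErrorCategory {
    ToolNotFound,
    SchemaInvalid,
    InvalidParameters,
    TypeMismatch,
    PolicyBlocked,
    ConfirmationRequired,
    PermanentFailure,
    Cancelled,
    RateLimited,
    ServerError,
    NetworkError,
    Timeout,
}

/// Hypothetical stand-in for `classify_http_status`.
pub fn classify_http_status_sketch(status: u16) -> ToolErrorCategory {
    match status {
        // Confirmed by the PR: malformed or unprocessable request bodies.
        400 | 422 => ToolErrorCategory::InvalidParameters,
        // ASSUMPTION: all remaining arms are illustrative guesses.
        401 | 403 => ToolErrorCategory::PolicyBlocked,
        408 => ToolErrorCategory::Timeout,
        429 => ToolErrorCategory::RateLimited,
        500..=599 => ToolErrorCategory::ServerError,
        _ => ToolErrorCategory::PermanentFailure,
    }
}

/// Hypothetical stand-in for `classify_io_error`.
pub fn classify_io_error_sketch(err: &io::Error) -> ToolErrorCategory {
    match err.kind() {
        // Confirmed by the PR (fix B2): a missing file is a permanent
        // failure of this call, not a missing tool.
        io::ErrorKind::NotFound => ToolErrorCategory::PermanentFailure,
        // ASSUMPTION: all remaining arms are illustrative guesses.
        io::ErrorKind::PermissionDenied => ToolErrorCategory::PolicyBlocked,
        io::ErrorKind::TimedOut => ToolErrorCategory::Timeout,
        io::ErrorKind::ConnectionRefused
        | io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted => ToolErrorCategory::NetworkError,
        _ => ToolErrorCategory::PermanentFailure,
    }
}
```

A regression test in the spirit of the test plan would assert that `classify_io_error_sketch(&io::Error::from(io::ErrorKind::NotFound))` never yields `ToolNotFound`.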