fix(tools): wire is_reasoning_model() and error_phase into agent tool execution path #2357
Closed
Labels
P2 (High value, medium complexity), bug (Something isn't working)
Description
Summary
PR #2354 added scaffolding for reasoning model hallucination detection and tool invocation phase taxonomy, but the helpers are not wired into the agent execution path:
- is_reasoning_model(): implemented in anomaly.rs, exported from zeph-tools, but never called from zeph-core or the agent loop
- record_reasoning_quality_failure() on AnomalyDetector: implemented but never called from production code
- error_phase: Option<String> in AuditEntry: field exists but is always None in every code path; ToolInvocationPhase::phase() is never used to populate it
- reasoning_model_warning flag in OrchestrationConfig: declared, default true, but never read from the agent loop
Expected behavior
- When the active provider model name matches is_reasoning_model() and a tool call results in ToolNotFound / InvalidParameters / TypeMismatch, record_reasoning_quality_failure() should be called on the anomaly detector
- AuditEntry.error_phase should be populated from ToolErrorCategory::phase() when an error entry is written
- reasoning_model_warning flag should gate whether record_reasoning_quality_failure emits the WARN log
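The first and third expectations can be sketched roughly as follows. This is a minimal illustration only: every type, field, and function body below is a hypothetical stand-in that reuses the names from this issue, not the real zeph-tools or zeph-core API. In particular, the model-name patterns, the extra Timeout variant, the on_tool_error helper, and the emit_warning parameter are assumptions.

```rust
// Hypothetical stand-ins; the real definitions live in zeph-tools and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolErrorCategory {
    ToolNotFound,
    InvalidParameters,
    TypeMismatch,
    Timeout, // assumed extra variant, used to show a non-quality failure
}

impl ToolErrorCategory {
    /// True for the three categories this issue ties to reasoning-model failures.
    pub fn is_quality_failure(self) -> bool {
        matches!(
            self,
            Self::ToolNotFound | Self::InvalidParameters | Self::TypeMismatch
        )
    }
}

/// Assumed heuristic; the real is_reasoning_model() in anomaly.rs may match differently.
pub fn is_reasoning_model(model: &str) -> bool {
    let m = model.to_ascii_lowercase();
    m.starts_with("o1") || m.starts_with("o3") || m.contains("reasoner")
}

#[derive(Default)]
pub struct AnomalyDetector {
    pub reasoning_quality_failures: u32,
    pub warnings: Vec<String>, // stands in for WARN log output
}

impl AnomalyDetector {
    /// Always counts the failure; only emits the warning when the flag allows it.
    pub fn record_reasoning_quality_failure(&mut self, model: &str, tool: &str, emit_warning: bool) {
        self.reasoning_quality_failures += 1;
        if emit_warning {
            self.warnings
                .push(format!("WARN reasoning model {model} made an invalid call to {tool}"));
        }
    }
}

pub struct OrchestrationConfig {
    pub reasoning_model_warning: bool,
}

/// Hypothetical hook the agent loop could call when a tool call fails.
pub fn on_tool_error(
    detector: &mut AnomalyDetector,
    config: &OrchestrationConfig,
    model: &str,
    tool: &str,
    category: ToolErrorCategory,
) {
    if is_reasoning_model(model) && category.is_quality_failure() {
        detector.record_reasoning_quality_failure(model, tool, config.reasoning_model_warning);
    }
}
```

In this sketch the failure counter is updated regardless of the flag, and only the warning is gated; whether the real detector should behave that way is a design decision for the fix.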
Reproduction
Run a live agent session and inspect audit-test.jsonl: every entry has "error_phase":null. Grepping production code shows that is_reasoning_model and record_reasoning_quality_failure have zero callers outside zeph-tools.
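For checking a fix against the audit log, a small std-only helper along these lines could count how many entries still carry a null phase. It is a sketch that assumes the compact serialization quoted above ("error_phase":null with no spaces) and uses plain substring matching rather than a JSON parser; error_phase_stats is a hypothetical name.

```rust
/// Returns (total non-empty lines, lines whose error_phase is serialized as null).
/// Assumes one JSON object per line and compact `"error_phase":null` formatting.
pub fn error_phase_stats(jsonl: &str) -> (usize, usize) {
    let mut total = 0;
    let mut null_phase = 0;
    for line in jsonl.lines().map(str::trim).filter(|l| !l.is_empty()) {
        total += 1;
        if line.contains("\"error_phase\":null") {
            null_phase += 1;
        }
    }
    (total, null_phase)
}
```

The input can be loaded with std::fs::read_to_string("audit-test.jsonl"). After the fix, error entries should no longer be counted in the second number, while successful entries may legitimately remain null.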
Files to change
- crates/zeph-core/src/agent/: add reasoning model check when processing tool results
- crates/zeph-tools/src/shell/mod.rs, scrape.rs: populate error_phase from ToolErrorCategory::phase() in log_audit calls
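The error_phase change in the tool executors might look something like the sketch below. The type and method names come from this issue, but the enum variants, the phase mapping, the string labels, the AuditEntry fields, and both constructor helpers are assumptions made for illustration; the real log_audit call sites will differ.

```rust
// Hypothetical phase taxonomy; the variants in zeph-tools may be named differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolInvocationPhase {
    Lookup,
    Validation,
    Execution,
}

impl ToolInvocationPhase {
    /// Assumed string label written to the audit log.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Lookup => "lookup",
            Self::Validation => "validation",
            Self::Execution => "execution",
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolErrorCategory {
    ToolNotFound,
    InvalidParameters,
    TypeMismatch,
    Timeout, // assumed extra variant
}

impl ToolErrorCategory {
    /// Assumed mapping from error category to the phase in which it occurs.
    pub fn phase(self) -> ToolInvocationPhase {
        match self {
            Self::ToolNotFound => ToolInvocationPhase::Lookup,
            Self::InvalidParameters | Self::TypeMismatch => ToolInvocationPhase::Validation,
            Self::Timeout => ToolInvocationPhase::Execution,
        }
    }
}

#[derive(Debug)]
pub struct AuditEntry {
    pub tool: String,
    pub error_phase: Option<String>,
}

/// Error path: derive the phase from the category instead of leaving it None.
pub fn error_audit_entry(tool: &str, category: ToolErrorCategory) -> AuditEntry {
    AuditEntry {
        tool: tool.to_owned(),
        error_phase: Some(category.phase().as_str().to_owned()),
    }
}

/// Success path: no error occurred, so the phase stays None.
pub fn success_audit_entry(tool: &str) -> AuditEntry {
    AuditEntry {
        tool: tool.to_owned(),
        error_phase: None,
    }
}
```

Routing every error entry through one constructor is one way to keep a call site from silently falling back to None.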
Related: #2284 (research issue not auto-closed despite PR #2354 "Closes #2284")