Conversation
Add {before,wrap,after,on_error}_{validate,execute}_output hooks to
AbstractCapability, enabling pre-parse normalization, error recovery,
and post-processing of model output across all output types.
Output hooks fire for text, structured text, and tool-based output.
For tool output, they fire inside the tool execution pipeline (tool
hooks are the outer layer, output hooks the inner layer).
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
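The tool-outer / output-inner layering described above can be sketched with plain async wrappers. This is a minimal illustration of the nesting order, not pydantic-ai's actual implementation; all names here (`layer`, `execute`, `pipeline`) are hypothetical:

```python
import asyncio
from typing import Awaitable, Callable

Handler = Callable[[str], Awaitable[str]]
calls: list[str] = []

def layer(name: str, inner: Handler) -> Handler:
    # Wrap `inner` so its work happens between this layer's before/after markers.
    async def wrapped(payload: str) -> str:
        calls.append(f'{name}:before')
        result = await inner(payload)
        calls.append(f'{name}:after')
        return result
    return wrapped

async def execute(payload: str) -> str:
    calls.append('execute')
    return payload.upper()

# Tool hooks are the outer layer, output hooks the inner layer.
pipeline = layer('tool_hooks', layer('output_hooks', execute))
assert asyncio.run(pipeline('ok')) == 'OK'
assert calls == ['tool_hooks:before', 'output_hooks:before', 'execute', 'output_hooks:after', 'tool_hooks:after']
```

The ordering assertion shows why "inner" matters: the output hooks fire closest to the raw execution, and the tool hooks see the output hooks' result.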
@dataclass
class OutputContext:
`OutputContext` is defined in `pydantic_ai._output` (a private module) but is used as a parameter type in the public `AbstractCapability` hook signatures. Users who want to subclass `AbstractCapability` and implement output hooks will need to import `OutputContext`, but they'd have to reach into `pydantic_ai._output` to do so.
It should be re-exported from `pydantic_ai.capabilities` (added to `__init__.py` and `__all__`) alongside the other public types like `RawOutput`, `WrapOutputValidateHandler`, etc.
Yeah this belongs in the public pydantic_ai.output module
self,
ctx: RunContext[AgentDepsT],
*,
input: RawOutput,
`input` shadows the Python builtin `input()`. The tool hooks use `args` for the input side and `result` for the output side — consider a name consistent with those, e.g. `validated_output`, or keep the parameter name `output` and call the result `result` (matching `after_tool_execute`'s `args`/`result` pattern).
return tool_result

async def _raw_execute_output_tool(
This method manually orchestrates the before/wrap/after/on_error hook pattern for both validate and execute phases (~70 lines), duplicating the logic in `_run_output_validate_hooks` and `_run_output_execute_hooks` in `_output.py`. The two paths can diverge silently (e.g. the error handling in `_run_output_validate_hooks` has `allow_partial` and `wrap_validation_errors` logic that this path skips).
Consider reusing those helpers here, or extracting a shared orchestration function that both call sites use. The tool-output-specific setup (identity `do_validate`, the `do_execute` that wraps `processor.call` + output validators) can be built up front and passed in.
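One possible shape for such a shared orchestration function — a sketch under assumptions, not the actual helpers (`run_phase`, `identity`, `parse`, and `recover` are all hypothetical names for illustration; a `wrap` hook would compose around `do` before calling in):

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar('T')
R = TypeVar('R')

async def run_phase(
    value: T,
    *,
    do: Callable[[T], Awaitable[R]],
    before: Callable[[T], Awaitable[T]],
    after: Callable[[R], Awaitable[R]],
    on_error: Callable[[Exception], Awaitable[R]],
) -> R:
    # Shared before -> do -> after orchestration; on_error may recover
    # with a fallback value instead of letting the exception propagate.
    value = await before(value)
    try:
        result = await do(value)
    except Exception as exc:
        return await on_error(exc)
    return await after(result)

async def identity(v):
    return v

async def parse(text: str) -> int:
    return int(text)

async def recover(exc: Exception) -> int:
    return -1  # fallback value instead of re-raising

assert asyncio.run(run_phase('42', do=parse, before=identity, after=identity, on_error=recover)) == 42
assert asyncio.run(run_phase('nope', do=parse, before=identity, after=identity, on_error=recover)) == -1
```

Both the text path and the tool path could then pass only their phase-specific `do` callables, so the error-handling branches live in one place.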
processor = toolset.processors[name]
output_context = processor.get_output_context('tool')
output_context.tool_call = validated.call
output_context.tool_def = validated.tool.tool_def
Mutating `OutputContext` after construction is a bit fragile — it means the dataclass fields have misleading defaults (`None`) that are only correct for text output, and tool-output callers must remember to fill them in. Consider making `tool_call` and `tool_def` constructor parameters (they're already known at this point), or adding a factory method like `OutputContext.for_tool_output(...)` that requires them.
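The factory-method option could look roughly like this — stand-in types only (`ToolCallPart` here is a stub, and the field list is simplified; not the real class):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class ToolCallPart:  # stub standing in for the real tool-call type
    tool_name: str

@dataclass
class OutputContext:  # simplified: only the fields relevant to the suggestion
    mode: str
    tool_call: Optional[ToolCallPart] = None
    tool_def: Any = None

    @classmethod
    def for_tool_output(cls, tool_call: ToolCallPart, tool_def: Any) -> 'OutputContext':
        # Tool fields are required up front, so callers can't forget them
        # and text-output contexts never carry misleading defaults.
        return cls(mode='tool', tool_call=tool_call, tool_def=tool_def)

ctx = OutputContext.for_tool_output(ToolCallPart('final_result'), tool_def={'name': 'final_result'})
assert ctx.mode == 'tool' and ctx.tool_call.tool_name == 'final_result'
```

With the factory, the post-construction mutation at the call site above collapses into a single constructor call.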
return output

else:
    # UnionOutputProcessor and others: full process() in validate, identity execute.
The else branch handles "UnionOutputProcessor and others" but there's no exhaustive type check — a new `BaseOutputProcessor` subclass would silently fall into this branch. Consider using `assert_never`, or at least an explicit `isinstance(processor, UnionOutputProcessor)` check with an else that raises, to match the project's convention of ending exhaustive chains with `assert_never`.
Like tool processing, output processing has two phases: **validation** (parsing the model's raw output against the output schema) and **execution** (extracting the value and calling any output function). Each phase has its own hooks.

All output hooks receive an `output_context` parameter with [`OutputContext`][pydantic_ai._output.OutputContext] (mode, output type, schema info, and tool call details for tool output).
The docs link to [OutputContext][pydantic_ai._output.OutputContext] which points into a private module. If this type is re-exported from pydantic_ai.capabilities (per the other comment), update this link accordingly so it renders correctly in mkdocs and points to the public API reference.
self,
ctx: RunContext[AgentDepsT],
*,
input: str | dict[str, Any],
Same `input` builtin-shadowing issue as in `abstract.py`.
self,
ctx: RunContext[AgentDepsT],
*,
input: str | dict[str, Any],
Same `input` builtin-shadowing issue as in `abstract.py`.
@DouweM Minor note: the PR description says
- Use `processor.process()` instead of accessing private `_str_argument_name` and `_function_schema` in `TextFunctionOutputProcessor` handler
- Add 8 new tests covering error paths: `on_output_validate_error` with retry, `on_output_execute_error` recovery, composed error hook chains, `Hooks` decorator API for error hooks, `WrapperCapability` delegation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
result_data = await _output.run_output_with_hooks(
    text_processor,
    text,
    run_context=run_context,
    capability=ctx.deps.root_capability,
    output_mode=ctx.deps.output_schema.mode,
)
📝 Info: `output_mode` in `OutputContext` reflects the schema mode, not the actual output format
In `_handle_text_response` at `pydantic_ai_slim/pydantic_ai/_agent_graph.py:1171`, `output_mode=ctx.deps.output_schema.mode` is passed. For a `ToolOutputSchema` that has a `text_processor` (hybrid mode), if the model returns text instead of a tool call, hooks receive `OutputContext(mode='tool', ...)` even though the actual output is text. This is a design choice — the mode represents the configured output schema, not the format of this particular response. Hook implementers should be aware that `mode='tool'` doesn't guarantee the output arrived via a tool call; they can check `output_context.tool_call is None` to distinguish.
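A hook that needs the actual arrival format could branch on `tool_call`, roughly as follows (stub `OutputContext` with only the two relevant fields; `arrival_format` is a hypothetical helper):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class OutputContext:  # simplified stub of the real context
    mode: str
    tool_call: Any = None

def arrival_format(output_context: OutputContext) -> str:
    # mode reflects the configured schema, not this response's format:
    # a 'tool' schema can still receive plain text in hybrid mode.
    if output_context.tool_call is None:
        return 'text'
    return 'tool'

assert arrival_format(OutputContext(mode='tool')) == 'text'
assert arrival_format(OutputContext(mode='tool', tool_call=object())) == 'tool'
```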
Maybe `output_context` also needs an `allow_text_output` field, like we have on `ModelRequestParameters`? There's also image output to think about.
- Re-export `OutputContext` from `pydantic_ai.capabilities`
- Rename `input` param to `validated_output` in `after_output_execute` to avoid shadowing builtin
- Reuse `_run_output_validate/execute_hooks` in `_tool_manager.py` instead of duplicating orchestration logic
- Use `RawOutput` alias consistently across combined/wrapper/hooks
- Pass `tool_call`/`tool_def` as constructor args to `OutputContext`
- Add explicit `isinstance` check for `UnionOutputProcessor` with `assert_never` fallback
- Update docs links to point to public re-export

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Remove unnecessary `# pyright: ignore[reportPrivateUsage]` comment
- Add tests for default error hook behavior (no override)
- Add tests for edge cases: dict transform, streaming, no-capability fast path, `Hooks` decorator `wrap_output_execute`

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Rename `_run_output_validate_hooks` and `_run_output_execute_hooks` to drop the underscore prefix since they're intentionally used from `_tool_manager.py`.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Add tests exercising default error hooks (bare `AbstractCapability`), `WrapperCapability` error hook delegation, and `Hooks` class error chaining
- Add `pragma: no cover` on `ObjectOutputProcessor.process()` error handling and `OutputToolset.call_tool()`, which are now bypassed when capabilities are present (handled by `run_output_with_hooks` and `_raw_execute_output_tool`)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
`RawOutput` type alias was not imported in test file.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Mark streaming partial validation error wrapping (`allow_partial` branch) and rare `ModelRetry`/`wrap_validation_errors=False` paths with appropriate pragma comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
This path is only reachable during streaming partial validation which is exercised through stream_output, not the direct test path. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Simplify `bad_function` test helper to always raise (remove unreachable return)
- Replace unreachable error hook body with simpler test that verifies hooks fire without triggering errors

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…ng path `execute_output_tool_call` was unconditionally wrapping `ModelRetry` from `processor.hook_execute` as `ToolRetryError`. With `wrap_validation_errors=False` (streaming, see `result.validate_response_output`), `ToolRetryError` isn't caught by the streaming handler, so the error propagated as an unhandled exception. Let `run_output_process_hooks` make the wrap/no-wrap decision based on the flag, matching how validator `ModelRetry` was already handled in the same closure.
…ols()` time Wraps the function and output toolsets in `Agent._get_toolset` with `PreparedToolset`s that dispatch the capability hook chain. The filtered/modified tool defs now flow into `ToolManager.tools` (so hallucinated calls to filtered tools fail with unknown-tool errors) and into the model's `ModelRequestParameters` simultaneously, instead of only the latter as before. Previously the dispatch happened in `_prepare_request_parameters` after `for_run_step` had already built `tool_manager.tools`, so the hook only changed what the model saw — it didn't affect tool execution lookup. Reported by Devin on PR #4859. The output-side dispatch closure overrides `ctx.max_retries` to the agent's `max_result_retries` to preserve the contract that `prepare_output_tools` sees the output retry budget (#4745). Adds a regression test asserting that a tool removed via `prepare_tools` is unreachable even when the model hallucinates a call to it.
…ute` `UnionOutputProcessor.hook_execute` checked `isinstance(semantic, inner.output_type)` to verify the validated value matched the resolved kind. For multi-arg output functions like `def f(x: int, y: str) -> Foo`, `output_type` is the first arg's type but `semantic` is the validated dict (no unwrap key), so the check always failed — the hook fell through to `_resolve_inner_for_value` which also failed, and the function was silently bypassed. Add a shape-aware match: when the inner is a multi-arg function, accept any dict (trust the validation kind); otherwise keep the isinstance check. `_resolve_inner_for_value` skips multi-arg inners on the type-mismatch fallthrough path, since their dict shape can't be picked out unambiguously from `output_type` alone. Reported by Devin on PR #4859.
raise
raise error

# --- Output execute lifecycle hooks ---
The section comment says "Output execute lifecycle hooks" but the methods in this section are all `*_output_process` (`before_output_process`, `after_output_process`, `wrap_output_process`, `on_output_process_error`). Should be `# --- Output process lifecycle hooks ---` to match the method naming, consistent with the validate section above it (`# --- Output validate lifecycle hooks ---`).
Same issue in `wrapper.py` at the corresponding section.
…rage gaps

`process()` was the legacy validate-then-call entry point on `BaseOutputProcessor` and its subclasses, replaced everywhere in production by `hook_validate`/`hook_execute` when output hooks landed. The two tests still calling it — covering only one branch each of `wrap_validation_errors=False` with invalid data — left the wrap-True and success branches uncovered. Remove the methods, update those tests to exercise `hook_validate` directly (the path that's actually hit in production).

Also adds tests covering:

- `PreparedToolset` with a synchronous prepare function (the no-await branch in `get_tools`).
- The `@hooks.on.prepare_output_tools` decorator path (registration + dispatch).
- `prepare_output_tools` filtering with a real `ToolOutput` so the hook actually fires (the prior tests used `output_type=str`, which has no output tools).
…h `*_output_process` method names Caught by github-actions[bot] review on PR #4859 — the methods in this section are all `before_output_process`/`after_output_process`/`wrap_output_process`/`on_output_process_error`, so the section should be "Output process lifecycle hooks", consistent with "Output validate lifecycle hooks" above. Already correct in `abstract.py`; fixing `combined.py` and `wrapper.py` to match.
prepare_tools/prepare_output_tools (BREAKING)
…s additive

Reverts the breaking-change part of the prepare-tools split so existing capabilities that override `prepare_tools` (released for ~2 weeks) keep getting the full tool set — function + output — as documented and shipped. `prepare_output_tools` is now a purely additive hook for output-tool-specific filtering with `ctx.max_retries` reflecting the output retry budget (the original motivation, see #4745).

Implementation:

- `PrepareTools` capability is back to overriding `get_wrapper_toolset` (matches main), so the agent-level `prepare_tools=` arg sees function tools only — unchanged from release. The `prepare_tools` capability hook itself now dispatches via a `PreparedToolset` wrap on the **combined** toolset, so it sees both function and output tool defs and the result still flows into `ToolManager.tools`.
- `PrepareOutputTools` keeps its `prepare_output_tools` hook implementation; the hook is dispatched via a `PreparedToolset` wrap on the output toolset specifically (with `ctx.max_retries` overridden to the output budget). `Agent(prepare_output_tools=...)` injects a `PrepareOutputTools` capability mirroring `prepare_tools=`.
- Order: `prepare_output_tools` runs first (innermost — only sees output tools), then `prepare_tools` sees the merged list (outermost — fires after output prep).

Also updates the `prepare_tools` / `prepare_output_tools` docstrings and the `docs/capabilities.md` / `docs/hooks.md` sections to reflect the additive shape, and flips the `TestPrepareToolsHook` snapshot test to assert that the hook sees both function and output tool kinds.
Per `tests/AGENTS.md` rule, agent/model/stream tests should snapshot the message history alongside the final-output assertion. Adds `assert result.all_messages() == snapshot(...)` to the output validate/process error-recovery and retry-flow tests in `TestOutputHookErrorPaths` and `TestDefaultOutputErrorHooks` so future regressions in retry-prompt content, tool-call IDs, message ordering, etc. get caught instead of being masked by tests that only check `result.output`.
# `output_toolset.max_retries` is set to `max_result_retries` at agent construction.
output_cap = run_capability
output_max_retries = self._max_result_retries
It seems significant that we changed the order of `agent.prepare_tools` <> `capability.get_wrapper_toolset`? Should we change that back?
'loc': (),
'msg': 'Invalid JSON: expected ident at line 1 column 2',
'input': 'not json',
'ctx': {'error': 'expected ident at line 1 column 2'},
This suggests to me that maybe we should've been using `include_context=False` on `ValidationError.errors`, because this is verbose + duplicative.
…via PreparedToolset

`prepare_tools` capability hook now receives function tools only and filters at the registry level (via `PreparedToolset`), matching what the agent-level `prepare_tools=` kwarg has always done. Output tools route exclusively to `prepare_output_tools`. The kwarg now flows through the same hook path the capability uses (`PrepareTools.prepare_tools`), unifying both entry points.

This is technically a behavior change — the released capability hook saw all tools (function + output) and was visibility-only — but framing it as a bug fix: the kwarg/hook split was inconsistent on two axes (which tools, registry vs visibility), and we already split tool execute hooks to skip output tools post-release. Closes #5241 (no v2 follow-up needed).

Also:

- Order: `prepare_tools` wraps **inside** other capability `get_wrapper_toolset` results (e.g. `ToolSearch`, `CodeMode`), preserving main's kwarg ordering — toolset transformations layer on top of prepared defs.
- Pass `include_context=False` to `ValidationError.errors()` for output retry prompts, matching tool retries — drops the duplicative `ctx` field from `RetryPromptPart` content.
…ests Output retry prompts now use `include_context=False`, matching tool retries — drop the now-stale `ctx` entries from existing provider snapshots.
…Tools`/`PrepareOutputTools` `PreparedToolset.get_tools` already rejects added/renamed tools and normalizes `None` to an empty list. The capability hooks just need to normalize sync/async `prepare_func` calls — collapse the helper to that and let the toolset wrap handle validation.
After the output hooks PR (#4859), output tools no longer go through `wrap_tool_execute`, so the outer `execute_tool` span that wrapped output tool calls was lost. Restore it via `Instrumentation.wrap_output_process`, which fires for every output processing pass; emit only when `output_context.tool_call` is set so non-tool output (text/native/prompted/image) is unchanged. Output-function spans stay inline in `execute_output_function` — they nest inside the new `wrap_output_process` span when both are present, matching the historical two-level structure (outer tool, inner function).
…ols` to function tools, add `prepare_output_tools` (pydantic#4859) Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
…antic#4859 + tracker pydantic#5238 DouweM said 'OK by me' on Devin's r3076333465 (2026-04-15) but then merged the opposite call in pydantic#4859 (2026-04-28): output validators see the GLOBAL output-retry budget, not the per-tool `ToolOutput(max_retries=N)`. The reasoning is documented in the merged code at `tool_manager.py::ToolManager.execute_output_tool_call`: the same validator stays consistent across the text path and across multiple `ToolOutput`s. The per-tool case where `ToolOutput(max_retries=N)` exceeds the global budget is tracked in pydantic#5238 and intentionally out of scope. This revert restores `OutputToolset.call_tool` to use `self.max_retries` (the global budget) for the validator context, drops the two regression tests that asserted per-tool semantics (`test_validator_sees_per_tool_max_retries`, `test_validator_max_retries_text_path_unchanged`), and reverts the snapshots in `test_output_validator_retry_consistency_across_paths` and `test_output_validator_retry_counter_with_tool_switch` to their pre-fix values. The rest of PR pydantic#5075 (rename, run-method `output_retries`, override() support, tool-retries docs, _agent_graph.py:1021 missed rename) is unchanged.
Summary
Output hooks (new)
`before_output_validate` / `wrap_output_validate` / `after_output_validate` / `on_output_validate_error` and `before_output_process` / `wrap_output_process` / `after_output_process` / `on_output_process_error` hooks on `AbstractCapability` (composed in `CombinedCapability`, delegated in `WrapperCapability`, available via `@hooks.on.*` decorators on `Hooks`). The validate pair fires on the raw input shape (text or tool args); the process pair fires on the semantic value (`MyModel(...)`, `42`, the dict the multi-arg output function will receive, etc.). `output_validator` callbacks now run inside the process pair, so one capability can wrap built-in + user-defined validation in one place. `OutputContext` exposes `mode`, `output_type`, `object_def`, `has_function`, `tool_call`, `tool_def` so hooks know what kind of output they're handling.

prepare_tools/prepare_output_tools (breaking change, framed as bug fix)

The released `prepare_tools` had a split-personality bug: the `Agent(prepare_tools=...)` kwarg saw function tools only and filtered at the registry level, while the `AbstractCapability.prepare_tools` hook saw all tools (function + output) and only filtered visibility. This PR normalizes both onto one path:

- `prepare_tools` capability hook → now receives function tools only, filters at the registry level (via `PreparedToolset`). Mirrors what the kwarg has always done.
- `prepare_output_tools` capability hook → receives only [output tools][pydantic_ai.output.ToolOutput], with `ctx.retry`/`ctx.max_retries` reflecting the output retry budget (`max_result_retries`) — fixing the missing context info that was the original motivation for "fix: propagate `Agent(retries=...)` to user-provided toolsets" (#4745).
- `Agent(prepare_tools=...)` and the new `Agent(prepare_output_tools=...)` kwargs inject `PrepareTools`/`PrepareOutputTools` capabilities that flow through the same hook path user-defined capabilities use.
- `prepare_tools` wraps inside other capability `get_wrapper_toolset` results (e.g. `ToolSearch`, `CodeMode`), preserving main's kwarg ordering — toolset transformations layer on top of prepared defs.

This subsumes #5241 (was filed as a v2 follow-up).

Other behavior fixes (observable through public API)

- A hallucinated call to a `prepare_tools`-filtered tool now hits "unknown tool" instead of silently executing it (filtering is registry-level, not just visibility).
- Multi-arg output functions in unions (e.g. `PromptedOutput([combine_func, OtherType])`) now actually run; previously the union dispatch isinstance-checked the validated dict against the function's first-arg type and silently bypassed the function.
- `ModelRetry` raised by an output function during `run_stream()` now surfaces as `UnexpectedModelBehavior` caused by `ModelRetry` (caught by the streaming validator), instead of escaping unhandled wrapped in `ToolRetryError`.
- Output retry prompts pass `include_context=False` to `ValidationError.errors()` (matching tool retries), removing the duplicative `ctx` field from `RetryPromptPart` content.

Cleanup

- Removed the legacy `process()` method on `BaseOutputProcessor` and its subclasses; production paths use `hook_validate` + `hook_execute` exclusively.

Docs

- `docs/capabilities.md` and `docs/hooks.md` updated with output hook docs and the unified `prepare_tools`/`prepare_output_tools` model.

Test plan

Tests cover `on_output_validate_error`, pre-parse transformation via `before_output_validate`, the hook decorator API, the new `prepare_output_tools` hook + constructor arg, hallucinated-call blocking, multi-arg union dispatch, and the streaming-`ModelRetry` regression. Retry-flow tests now snapshot `result.all_messages()` per `tests/AGENTS.md` guidance.

Closes #5111
Closes #5241
🤖 Generated with Claude Code