Adding per tool usage limit #3691

Closed

adtyavrdhn wants to merge 125 commits into pydantic:main from adtyavrdhn:per_tool_usage_limit

Conversation

Member

@adtyavrdhn adtyavrdhn commented Dec 10, 2025

Closes #3352

Soft Tool Usage Limits

Introduces soft tool usage limits via ToolPolicy and ToolsPolicy. Unlike the hard UsageLimits.tool_calls_limit (raises UsageLimitExceeded), soft limits return a message to the model so it can adapt.

APIs

ToolPolicy - Per-Tool Limits

Set on individual tools via @agent.tool(usage_policy=...):

from pydantic_ai import Agent, ToolPolicy

agent = Agent('openai:gpt-4o')

@agent.tool_plain(usage_policy=ToolPolicy(max_uses=3, max_uses_per_step=1))
def fetch_record(record_id: int) -> str:
    return f'Record {record_id}'

ToolsPolicy - Agent-Wide Limits

Set on Agent via tools_policy or per-run:

from pydantic_ai import Agent, ToolPolicy, ToolsPolicy

agent = Agent('openai:gpt-4o', tools_policy=ToolsPolicy(max_uses=5, max_uses_per_step=2))

# Per-run override with per-tool runtime overrides
result = agent.run_sync(
    '...',
    tools_policy=ToolsPolicy(
        max_uses=10,
        per_tool={'expensive_api': ToolPolicy(max_uses=3)}
    )
)

Parameters

| Parameter | ToolPolicy | ToolsPolicy | Description |
| --- | --- | --- | --- |
| max_uses | ✓ | ✓ | Max successful calls (total run for ToolPolicy, aggregate across all tools for ToolsPolicy) |
| max_uses_per_step | ✓ | ✓ | Max successful calls per step |
| partial_acceptance | ✓ | ✓ | Accept partial batch or reject all (default: True) |
| per_tool | - | ✓ | Runtime per-tool overrides |

Behavior

  • When a per-tool max_uses is reached, the tool is removed from available tools
  • When aggregate limits are hit, model receives a message to produce output without tools
  • partial_acceptance=True (default): If model requests 5 calls but only 3 are allowed, 3 accepted, 2 rejected
  • partial_acceptance=False: All-or-nothing — entire batch rejected if it would exceed limits
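The partial-acceptance behavior above can be illustrated with a small standalone sketch. This is not the PR's actual implementation; the function name `accept_calls` and its signature are hypothetical, chosen only to make the accept/reject rule concrete:

```python
def accept_calls(
    requested: list[str], used: int, max_uses: int, partial_acceptance: bool
) -> tuple[list[str], list[str]]:
    """Decide which of a batch of requested tool calls to accept.

    `used` is the number of successful calls so far and `max_uses` is the
    aggregate limit. Returns (accepted, rejected).
    """
    remaining = max(0, max_uses - used)
    if len(requested) <= remaining:
        # The whole batch fits within the remaining budget.
        return list(requested), []
    if not partial_acceptance:
        # All-or-nothing: reject the entire batch if it would exceed the limit.
        return [], list(requested)
    # Partial acceptance: accept up to the remaining budget, reject the rest.
    return requested[:remaining], requested[remaining:]
```

With `max_uses=3` and five requested calls, partial acceptance accepts three and rejects two, matching the behavior described above; with `partial_acceptance=False` the same batch is rejected in full.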

Comparison to UsageLimits

| Parameter | Behavior | Use Case |
| --- | --- | --- |
| UsageLimits.tool_calls_limit | Raises UsageLimitExceeded | Hard stop to prevent runaway costs |
| ToolPolicy.max_uses | Removes tool / returns message | Limit specific expensive/rate-limited tools |
| ToolsPolicy.max_uses | Returns message to model | Soft aggregate limit; model adapts |

Not Included

Toolset-level handling (e.g., MCPServer(usage_policy=...) applying limits across all MCP tools as a group) is not part of this PR. It could be taken up later, but I'm unsure what value it would add beyond what's already here.

@adtyavrdhn adtyavrdhn changed the title from "Per tool usage limit" to "Adding per tool usage limit" Dec 11, 2025
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 26, 2026
…tic#3691

Analysis addresses Douwe's request about feature interaction.

Context: PR pydantic#3691 by @adtyavrdhn (1 month of work) introduces ToolPolicy
for tool usage limits with configurable behaviors. Our ToolFailed work
addresses tool failure handling from issue pydantic#2586.

Key findings:
- ToolPolicy and ToolFailed address different but related concerns
- Both can coexist: ToolPolicy = declarative, ToolFailed = explicit
- ToolFailed can adapt to fit ToolPolicy's mode-based design
- PR pydantic#3691 has priority - we defer to team on integration approach

Integration options:
1. ToolFailed maps to ToolPolicy modes (recommended if both proceed)
2. Wait for pydantic#3691, implement via its on_error modes
3. Parallel features, unify later based on usage

Suggests discussion points for team sync respecting pydantic#3691's timeline
and existing design work.
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 26, 2026
Add section noting @adtyavrdhn's ToolPolicy work and how features relate.
Defer to team sync on integration approach - respecting their timeline
and existing design work.
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 26, 2026
Quick reference doc outlining:
- What we built (ToolFailed implementation)
- How it relates to PR pydantic#3691
- Options for moving forward
- Recommendation to defer to team sync

Not submitting PR yet - waiting for coordination clarity.
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 2 potential issues.

View issues and 7 additional flags in Devin Review.



🟡 MCP tool prefix handling changes ctx.tool_name to unprefixed, breaking ctx.tool_use/tools_use_counts consistency for prefixed MCP tools

When MCPServer.tool_prefix is set, MCPServer.call_tool() strips the prefix and replaces ctx.tool_name with the unprefixed name.

Actual behavior: the framework’s successful-use counters (RunContext.tools_use_counts) are keyed by the tool name used in the ToolCallPart (the prefixed name), but the RunContext seen by process_tool_call is mutated to unprefixed tool_name. As a result, ctx.tool_use (and direct lookups in ctx.tools_use_counts) will read the wrong key and typically show 0 even after prior successful calls.

Expected behavior: either the tool-use counters should use the same key as ctx.tool_name presented to callbacks, or ctx.tool_name should remain consistent with the tool name used for accounting.

Impact: any custom process_tool_call logic (or future MCP tooling relying on ctx.tool_use) will behave incorrectly when a prefix is configured—e.g. it may believe a tool has never been used and repeatedly call it.

  • MCPServer.call_tool() strips prefix and rewrites ctx.tool_name:
    • pydantic_ai_slim/pydantic_ai/mcp.py:570-572
if self.tool_prefix:
    name = name.removeprefix(f'{self.tool_prefix}_')
    ctx = replace(ctx, tool_name=name)
  • Successful-use counting is keyed off ToolCallPart.tool_name (the prefixed name) elsewhere (see ToolManager._call_tool() increment at pydantic_ai_slim/pydantic_ai/_tool_manager.py:199-201).

(Refers to lines 570-577)

Recommendation: Keep ctx.tool_name consistent with the key used for tools_use_counts (likely the prefixed tool name). If you need to pass the unprefixed name to the MCP server, do so via a separate local variable (e.g. server_tool_name) without rewriting ctx.tool_name, or update counting to use the unprefixed name everywhere for MCP tools.


Comment on lines 196 to +201
call.args or {}, allow_partial=pyd_allow_partial, context=ctx.validation_context
)

return await self.toolset.call_tool(name, args_dict, ctx, tool)
result = await self.toolset.call_tool(name, args_dict, ctx, tool)
self.ctx.tools_use_counts[name] = self.ctx.tools_use_counts.get(name, 0) + 1
return result

🟡 Output tools incorrectly increment RunContext.tools_use_counts despite comment that output tools are not counted

ToolManager.handle_call() explicitly treats output tools as “not traced and not counted”, but _call_tool() increments self.ctx.tools_use_counts[name] for all tool kinds, including output tools.

Actual behavior: successful output-tool executions increase RunContext.tools_use_counts, so ctx.tools_use_counts / ctx.tool_use can include output-tool invocations even though RunUsage.tool_calls does not.

Expected behavior: output tools should not affect the new “successful tool use” counters (or the comment/semantics should be updated consistently).

Impact: users relying on RunContext.tools_use_counts (and examples/documentation encourage this) may see inflated totals and may write incorrect logic (e.g. gating behavior or auditing tool usage) because output-tool calls are mixed into the same counter.

  • Output tools are routed through _call_tool() via handle_call():
    • pydantic_ai_slim/pydantic_ai/_tool_manager.py:130-139
  • _call_tool() increments tools_use_counts unconditionally:
    • pydantic_ai_slim/pydantic_ai/_tool_manager.py:199-201
# handle_call
if (tool := self.tools.get(call.tool_name)) and tool.tool_def.kind == 'output':
    # Output tool calls are not traced and not counted
    return await self._call_tool(...)

# _call_tool
result = await self.toolset.call_tool(...)
self.ctx.tools_use_counts[name] = self.ctx.tools_use_counts.get(name, 0) + 1

(Refers to lines 130-201)

Recommendation: Skip incrementing tools_use_counts for output tools (e.g. guard in handle_call() or inside _call_tool() based on tool.tool_def.kind). Alternatively, if output tools should be counted, update comments/docs and ensure all related counters (RunUsage.tool_calls, usage limits) use the same definition.
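The first option in the recommendation can be sketched as a guard on the increment. The helper name `record_successful_use` is hypothetical; the real fix would live in `handle_call()` or `_call_tool()`:

```python
def record_successful_use(
    tools_use_counts: dict[str, int], name: str, tool_kind: str
) -> None:
    """Increment the successful-use counter, skipping output tools so the
    counter matches the 'not traced and not counted' semantics."""
    if tool_kind == 'output':
        # Output tools produce the final result; they are excluded from
        # RunUsage.tool_calls, so exclude them here too for consistency.
        return
    tools_use_counts[name] = tools_use_counts.get(name, 0) + 1
```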


Member Author

Will read more on this but great find if correct.

rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 27, 2026
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 27, 2026
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 27, 2026
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 27, 2026
Introduces ToolFailed exception to allow tools to fail without terminating
the agent run. This is especially useful for parallel tool execution where
partial failures should not stop the entire batch.

Key features:
- Errors are traced in telemetry (unlike ModelRetry)
- Agent continues execution (unlike arbitrary exceptions)

Three exception modes:

- ModelRetry: Expected retry behavior, not an error
- ToolFailed(disable=False): System error that should be logged/monitored
- ToolFailed(disable=True): Permanent failure, disable tool

This can coexist with ToolPolicy from pydantic#3691. The main difference here is
that ToolFailed is explicit inline in the user's tools (in Python)
whereas ToolPolicy is declarative. I'll need to do more thinking on how
to have `ToolFailed` map to `ToolPolicy` modes.
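The three exception modes in this commit message can be sketched as a small dispatcher. This is a speculative illustration of the described semantics, not code from either PR; `handle_tool_exception` and its return values are invented for the sketch:

```python
class ModelRetry(Exception):
    """Expected retry: the model is asked to try again; not treated as an error."""


class ToolFailed(Exception):
    """Tool failure that does not terminate the agent run.

    With disable=True the failure is treated as permanent and the tool is
    removed from subsequent steps.
    """
    def __init__(self, message: str, *, disable: bool = False):
        super().__init__(message)
        self.disable = disable


def handle_tool_exception(exc: Exception, disabled: set[str], tool_name: str) -> str:
    if isinstance(exc, ModelRetry):
        return 'retry'  # expected retry behavior, not an error
    if isinstance(exc, ToolFailed):
        if exc.disable:
            disabled.add(tool_name)  # permanent failure: disable the tool
        return 'continue'  # error is traced in telemetry; the run continues
    raise exc  # arbitrary exceptions still terminate the run
```

This keeps the contrast described above: ToolFailed is raised explicitly inside a tool body, whereas ToolPolicy is declared up front on the tool or agent.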
@rgbkrk rgbkrk mentioned this pull request Jan 29, 2026
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 29, 2026
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 29, 2026
rgbkrk added a commit to rgbkrk/pydantic-ai that referenced this pull request Jan 29, 2026
@adtyavrdhn
Member Author

/gemini summary

@gemini-code-assist

Summary of Changes

This pull request introduces a flexible system for managing tool usage within the agent framework, moving beyond strict error-raising limits to a more adaptive approach. By providing granular control over individual tools and aggregate agent-level policies, it empowers developers to guide model behavior when resource constraints are met, rather than abruptly halting execution. This enhancement allows for more sophisticated and resilient agent designs, particularly for managing expensive or rate-limited external services.

Highlights

  • Soft Tool Usage Limits: Introduced ToolPolicy and ToolsPolicy to enable soft limits on tool usage. Unlike hard UsageLimits.tool_calls_limit which raises an error, soft limits return a message to the model, allowing it to adapt gracefully.
  • Per-Tool and Agent-Wide Policies: ToolPolicy can be set on individual tools via @agent.tool(usage_policy=...) for per-tool limits, while ToolsPolicy (represented by tool_use_policy on the Agent) allows for agent-wide aggregate limits. Both support max_uses (total successful calls) and max_uses_per_step (successful calls per model turn).
  • Partial Execution Control: A partial_execution parameter (defaulting to True) allows configuring whether a batch of tool calls should be partially accepted (allowed calls execute, exceeding calls are rejected) or entirely rejected if limits are exceeded. This applies to both per-tool and agent-wide policies.
  • Programmatic Usage Control: The RunContext now includes tools_use_counts (a dictionary of successful calls per tool) and a tool_use property (successful calls for the current tool), enabling tools to dynamically adjust their behavior based on their usage history.
  • Unified Policy Handling: The ToolManager has been refactored to centralize the logic for evaluating and enforcing these new usage policies, ensuring consistent behavior across different toolsets and agent configurations.
Changelog
  • docs/agents.md
    • Added a new section 'Soft Tool Use Limits' to explain the concept and usage of ToolPolicy.
    • Included a Python example demonstrating ToolPolicy(max_uses=2) and its effect on model output.
    • Added a comparison table between tool_calls_limit and ToolPolicy.max_uses.
  • docs/builtin-tools.md
    • Updated comments for max_uses in WebFetchTool and WebSearchTool to explicitly state 'successful uses'.
  • docs/tools-advanced.md
    • Expanded the 'Limiting tool executions' note to differentiate between hard UsageLimits and soft ToolPolicy.
    • Added a comprehensive 'Soft Tool Usage Limits' section detailing ToolPolicy options (max_uses, max_uses_per_step, partial_execution), partial vs. all-or-nothing execution, and programmatic usage control via RunContext.
    • Included Python examples for per-tool limits, batch operations with partial_execution=False, programmatic usage checks, and a comprehensive example showcasing multiple features.
  • docs/tools.md
    • Updated the link description for 'Advanced Tool Features' to include 'usage limits'.
  • pydantic_ai_slim/pydantic_ai/__init__.py
    • Exported the new ToolPolicy class.
  • pydantic_ai_slim/pydantic_ai/_agent_graph.py
    • Imported collections.Counter for tracking tool usage.
    • Introduced _handle_tool_calls_parts function to process tool calls, applying batch and per-call rejection logic based on ToolPolicy and ToolsPolicy.
    • Modified _call_tools to use _handle_tool_calls_parts for pre-processing tool calls and filtering out rejected calls before execution.
  • pydantic_ai_slim/pydantic_ai/_output.py
    • Added usage_policy=None to ToolsetTool initialization in get_tools.
  • pydantic_ai_slim/pydantic_ai/_run_context.py
    • Added tools_use_counts: dict[str, int] to RunContext to track successful tool calls per tool.
    • Added a tool_use property to RunContext to get the successful use count for the current tool.
  • pydantic_ai_slim/pydantic_ai/_tool_manager.py
    • Imported ToolPolicy, ToolPolicyMode, and UsageLimitExceeded.
    • Added tools_use_policy: ToolPolicy | None to ToolManager to store agent-level policy.
    • Updated for_run_step to copy tools_use_counts to the new RunContext and pass tools_use_policy.
    • Modified tool_defs property to filter out tools that have reached their max_uses limit.
    • Implemented _get_current_uses_of_tool to retrieve current tool usage counts.
    • Implemented _reject_call to handle tool call rejections, raising UsageLimitExceeded if mode='error' or returning a message for model_retry.
    • Added get_batch_rejection_reason to check agent-level batch limits with partial_execution=False.
    • Added get_tool_call_rejection_reason to enforce per-tool and agent-level limits incrementally, supporting partial execution.
    • Incremented self.ctx.tools_use_counts[name] after a successful tool call in _call_tool.
  • pydantic_ai_slim/pydantic_ai/_tool_usage_policy.py
    • Added new file defining ToolPolicyMode (Literal['error', 'model_retry']) and the ToolPolicy dataclass.
    • The ToolPolicy class includes max_uses, max_uses_per_step, partial_execution, and mode.
  • pydantic_ai_slim/pydantic_ai/agent/__init__.py
    • Imported ToolPolicy.
    • Added _tool_use_policy: ToolPolicy | None as a private field to the Agent class.
    • Added tool_use_policy parameter to Agent.__init__ for agent-level policy configuration.
    • Added tool_use_policy property to Agent.
    • Added tool_use_policy parameter to Agent.iter and its overloads, allowing runtime override of the agent's tool usage policy.
    • Added usage_policy parameter to @agent.tool and @agent.tool_plain decorators for per-tool policy configuration.
  • pydantic_ai_slim/pydantic_ai/agent/abstract.py
    • Imported ToolPolicy.
    • Added tool_use_policy: ToolPolicy | None parameter to AbstractAgent.run, run_sync, run_stream, run_stream_sync, and run_stream_events methods and their overloads.
  • pydantic_ai_slim/pydantic_ai/agent/wrapper.py
    • Imported ToolPolicy.
    • Added tool_use_policy: ToolPolicy | None parameter to WrapperAgent.iter, run, run_sync, run_stream, and run_stream_events methods and their overloads.
  • pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py
    • Imported ToolPolicy.
    • Added tool_use_policy: ToolPolicy | None parameter to wrapped_run_workflow and wrapped_run_sync_workflow.
  • pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py
    • Imported ToolPolicy.
    • Added tool_use_policy: ToolPolicy | None parameter to run, run_sync, run_stream, and run_stream_events methods and their overloads.
  • pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py
    • Imported ToolPolicy.
    • Added tool_use_policy: ToolPolicy | None parameter to run, run_sync, run_stream, and run_stream_events methods and their overloads.
  • pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_dynamic_toolset.py
    • Imported ToolPolicy.
    • Added usage_policy: ToolPolicy | None to _ToolInfo dataclass.
    • Updated get_tools_activity to include usage_policy when creating _ToolInfo.
    • Updated _tool_for_tool_info to pass tool_info.usage_policy to ToolsetTool.
  • pydantic_ai_slim/pydantic_ai/mcp.py
    • Imported ToolPolicy.
    • Added usage_policy: ToolPolicy | None to MCPServer class and its __init__ method.
    • Updated tool_for_tool_def to pass self.usage_policy to ToolsetTool.
  • pydantic_ai_slim/pydantic_ai/tools.py
    • Imported ToolPolicy.
    • Added usage_policy: ToolPolicy | None to Tool class and ToolDefinition.
  • pydantic_ai_slim/pydantic_ai/toolsets/abstract.py
    • Imported ToolPolicy.
    • Added usage_policy: ToolPolicy | None = None to ToolsetTool.
  • pydantic_ai_slim/pydantic_ai/toolsets/combined.py
    • Propagated tool.usage_policy when creating ToolsetTool in get_tools.
  • pydantic_ai_slim/pydantic_ai/toolsets/external.py
    • Added usage_policy=None to ToolsetTool initialization in get_tools.
  • pydantic_ai_slim/pydantic_ai/toolsets/fastmcp.py
    • Imported ToolPolicy.
    • Added usage_policy: ToolPolicy | None to FastMCPToolset class and its __init__ method.
    • Updated tool_for_tool_def to pass self.usage_policy to ToolsetTool.
  • pydantic_ai_slim/pydantic_ai/toolsets/function.py
    • Imported ToolPolicy.
    • Added usage_policy: ToolPolicy | None to FunctionToolset class and its __init__ method.
    • Added usage_policy parameter to tool decorator and add_function method.
    • Updated add_tool to apply toolset's usage_policy if not set on the tool.
    • Updated get_tools to pass the resolved usage_policy to ToolsetTool.
  • tests/test_examples.py
    • Added new test cases for 'Please call the tool three times' and 'Calculate something' to demonstrate ToolPolicy behavior.
    • Updated model_logic to handle Tool use limit reached messages for do_work and calculate tools.
  • tests/test_tools.py
    • Imported ToolPolicy and UsageLimitExceeded.
    • Added test_tool_max_uses to verify ToolPolicy.max_uses and max_uses_per_step with partial execution.
    • Added test_tool_policy_partial_execution_false_per_tool_rejection to test per-tool partial_execution=False behavior.
    • Added test_tool_policy_max_uses_incremental_limit to test max_uses with incremental calls.
    • Added test_tool_policy_per_tool_mode_error to verify mode='error' raises UsageLimitExceeded.
    • Added test_tool_partial_execution_independent_of_agent to show tool's partial_execution works even if agent's is False.
    • Added test_agent_partial_execution_false_rejects_all_on_agent_limit to confirm agent-level partial_execution=False rejects all calls when agent limit is hit.
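The `_tool_usage_policy.py` entries in the changelog above describe `ToolPolicyMode` and the `ToolPolicy` dataclass. A self-contained sketch of that shape, with a hypothetical `rejection_reason` helper standing in for the incremental limit checks described for `ToolManager` (the real method names and messages differ):

```python
from dataclasses import dataclass
from typing import Literal, Optional

ToolPolicyMode = Literal['error', 'model_retry']


@dataclass
class ToolPolicy:
    """Sketch of the policy fields listed in the changelog."""
    max_uses: Optional[int] = None            # total successful calls per run
    max_uses_per_step: Optional[int] = None   # successful calls per model turn
    partial_execution: bool = True            # accept part of a batch vs reject all
    mode: ToolPolicyMode = 'model_retry'      # raise vs return a message

    def rejection_reason(self, uses: int, uses_this_step: int) -> Optional[str]:
        """Return a rejection message if one more call would exceed a limit."""
        if self.max_uses is not None and uses >= self.max_uses:
            return f'Tool use limit reached: {self.max_uses} total uses'
        if self.max_uses_per_step is not None and uses_this_step >= self.max_uses_per_step:
            return f'Tool use limit reached: {self.max_uses_per_step} uses per step'
        return None
```

Under `mode='model_retry'` such a message would be returned to the model; under `mode='error'` the manager would raise `UsageLimitExceeded` instead.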
Activity
  • The PR was marked as stale by github-actions[bot] on 2026-01-03.
  • The author, adtyavrdhn, pinged the PR on 2026-01-03.
  • vikigenius inquired about the PR's status on 2026-01-12, noting test failures that didn't make sense.
  • DouweM committed to reviewing the PR within a few days on 2026-01-12.
  • A bot (devin-ai-integration[bot]) raised two issues on 2026-01-27: one regarding inconsistent ctx.tool_name handling for prefixed MCP tools, and another about output tools incorrectly incrementing RunContext.tools_use_counts.
  • The author, adtyavrdhn, acknowledged the bot's findings on 2026-01-27, stating they would investigate further.
  • Several review comments from DouweM (2025-12-16) and adtyavrdhn (2025-12-22, 2025-12-23, 2026-01-19, 2026-01-23, 2026-01-25, 2026-01-26) discuss the design of ToolPolicy, consistency of 'calls' vs 'uses', handling of partial execution, and the unification of tool-related arguments and policies.

@adtyavrdhn adtyavrdhn marked this pull request as draft April 8, 2026 15:42
dsfaccini added a commit to dsfaccini/pydantic-ai that referenced this pull request Apr 13, 2026
…ctx.max_retries` on tool path

Preparatory refactor of the output-retry machinery with three independently-motivated but tightly-coupled changes:

- Rename confusing private/internal retry fields so the mental model is obvious from code:
  `Agent._max_result_retries` -> `_max_output_retries`;
  `GraphAgentDeps.max_result_retries` -> `max_output_retries`;
  `GraphAgentState.retries` -> `output_retries_used`;
  `GraphAgentState.increment_retries` -> `consume_output_retry`;
  `OutputToolset._output_max_retries` removed. Error message
  `Exceeded maximum retries (N) for output validation` -> `Exceeded maximum output retries (N)`.
- Add `output_retries` kwarg to `run` / `run_sync` / `run_stream` / `run_stream_sync` / `run_stream_events` / `iter`
  with precedence `run arg > spec > agent default`. Plumbs through WrapperAgent and all three
  durable_exec wrappers (dbos/prefect/temporal). Runtime override clones the shared output toolset
  before mutating `max_retries` so concurrent runs don't race.
- Fix the Devin review comment on pydantic#4956: `ctx.max_retries` in an output validator on the tool path
  now reflects `tool.max_retries` (the per-tool enforcement limit that will actually stop the validator)
  instead of the agent-level global default. Text path is unchanged.

Also documents the two-path enforcement model in `docs/agent.md` with a new
"How output retries are enforced" subsection.

Intentionally out of scope: `tool_retries` parameter (blocked by ToolUsePolicy design in pydantic#3691),
`Agent.override()` extension (reachable via `spec=` today), and deprecating `retries`.
@adtyavrdhn adtyavrdhn closed this Apr 24, 2026

Labels

feature — New feature request, or PR implementing a feature (enhancement)
size: L — Large PR (501-1500 weighted lines)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add per-tool usage limits

4 participants