Insufficient static string template for ToolCallSummaryMessage #6426

@ChrisBlaa

What happened?

The Problem
I have the following use case: a tool call returns a huge payload when it succeeds, and the model must not reflect on it.
If the tool call succeeds, I only need a tool call summary message like "Tool Foo has been executed successfully", whereas in the failure case I need all the details (i.e. {result}). This is not possible today because the assistant agent accepts only a single static string template (tool_call_summary_format), which is applied to every call regardless of outcome.
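
To make the limitation concrete, here is a minimal sketch in plain Python. It does not use AutoGen; `STATIC_TEMPLATE` and `summarize` are hypothetical names chosen for illustration. One format string is applied to every call, so the success case cannot drop the payload while the failure case keeps it.

```python
# Hypothetical stand-in for the single static template passed to the agent.
STATIC_TEMPLATE = "{tool_name}: {result}"


def summarize(tool_name: str, result: str, is_error: bool) -> str:
    # The same template is used whether or not the call failed, so a huge
    # success payload lands in the summary just like a short error message.
    return STATIC_TEMPLATE.format(tool_name=tool_name, result=result, is_error=is_error)


success = summarize("foo", "x" * 10_000, is_error=False)
failure = summarize("foo", "ValueError: bad input", is_error=True)

print(len(success))  # the whole payload is still part of the summary
print(failure)
```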

Proposal
We could make this more dynamic by injecting a callable, tool_call_summary_format_fct, that is evaluated at runtime for each tool call and returns the template to use. This gives more control over the summary messages.

Implementation (_assistant_agent.py)

    @staticmethod
    def _summarize_tool_use(
        executed_calls_and_results: List[Tuple[FunctionCall, FunctionExecutionResult]],
        inner_messages: List[BaseAgentEvent | BaseChatMessage],
        handoffs: Dict[str, HandoffBase],
        tool_call_summary_format: str,
        tool_call_summary_format_fct: Callable[[FunctionCall, FunctionExecutionResult], str] | None,
        agent_name: str,
    ) -> Response:
        ...

        # Fallback: ignore the call and result and return the static template unchanged.
        def default_summary_template(call: FunctionCall, result: FunctionExecutionResult) -> str:
            return tool_call_summary_format

        # Prefer the user-supplied callable when one is given.
        summary_template_fct = tool_call_summary_format_fct or default_summary_template

        # Pick a template per call, then fill in the usual placeholders.
        tool_call_summaries = [
            summary_template_fct(call, result).format(
                tool_name=call.name,
                arguments=call.arguments,
                result=result.content,
                is_error=result.is_error,
            )
            for call, result in normal_tool_calls
        ]
        ...

That's basically it. It could be integrated easily without any breaking changes, and I have already implemented everything required.
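
For readers who want to try the mechanism outside of AutoGen, the following is a self-contained sketch under stated assumptions. `Call`, `Result`, `summarize_tool_use`, and `pick_template` are hypothetical stand-ins, not the library's types or API, and joining the summaries with a newline is inferred from the expected string in the test below rather than taken from the elided code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Call:  # hypothetical stand-in for FunctionCall
    name: str
    arguments: str


@dataclass
class Result:  # hypothetical stand-in for FunctionExecutionResult
    content: str
    is_error: bool


def summarize_tool_use(
    calls_and_results: List[Tuple[Call, Result]],
    summary_format: str = "{result}",
    summary_format_fct: Optional[Callable[[Call, Result], str]] = None,
) -> str:
    # Fallback keeps the old behavior: one static template for every call.
    def default_template(call: Call, result: Result) -> str:
        return summary_format

    template_fct = summary_format_fct or default_template
    # Assumption: summaries are joined with newlines.
    return "\n".join(
        template_fct(call, result).format(
            tool_name=call.name,
            arguments=call.arguments,
            result=result.content,
            is_error=result.is_error,
        )
        for call, result in calls_and_results
    )


def pick_template(call: Call, result: Result) -> str:
    # Full details on failure, a short confirmation on success.
    if result.is_error:
        return "FAILURE: {result}"
    return "Tool {tool_name} has been executed successfully"


pairs = [
    (Call("foo", '{"x": 1}'), Result("huge payload", False)),
    (Call("bar", "{}"), Result("boom", True)),
]
dynamic_summary = summarize_tool_use(pairs, summary_format_fct=pick_template)
static_summary = summarize_tool_use(pairs)  # no callable: static template applies
print(dynamic_summary)
```

Because the callable is optional and the fallback returns the static template, callers that never pass a function see the same output as before, which is the sense in which the change is non-breaking.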

How to use it (test_assistant_agent.py)

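import json

import pytest
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import ToolCallSummaryMessage
from autogen_core import FunctionCall
from autogen_core.models import CreateResult, FunctionExecutionResult, ModelFamily, ModelInfo, RequestUsage
from autogen_ext.models.replay import ReplayChatCompletionClient


# Assumed to already exist in test_assistant_agent.py; the body here is a placeholder
# so the snippet reads on its own.
def _pass_function(input: str) -> str:
    return "pass"


async def _throw_function(input: str) -> str:
    raise ValueError("Helpful debugging information what went wrong.")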

@pytest.fixture
def model_info_all_capabilities() -> ModelInfo:
    return {
        "function_calling": True,
        "vision": True,
        "json_output": True,
        "family": ModelFamily.GPT_4O,
        "structured_output": True,
    }


@pytest.mark.asyncio
async def test_run_with_tool_call_summary_format_function(model_info_all_capabilities: ModelInfo) -> None:
    model_client = ReplayChatCompletionClient(
        [
            CreateResult(
                finish_reason="function_calls",
                content=[
                    FunctionCall(id="1", arguments=json.dumps({"input": "task"}), name="_pass_function"),
                    FunctionCall(id="2", arguments=json.dumps({"input": "task"}), name="_throw_function"),
                ],
                usage=RequestUsage(prompt_tokens=10, completion_tokens=5),
                thought="Calling pass and fail function",
                cached=False,
            ),
            "pass and fail",
            "TERMINATE",
        ],
        model_info=model_info_all_capabilities,
    )

    def conditional_string_templates(function_call: FunctionCall, function_call_result: FunctionExecutionResult) -> str:
        if not function_call_result.is_error:
            return "SUCCESS: {tool_name} with {arguments}"

        else:
            return "FAILURE: {result}"

    agent = AssistantAgent(
        "tool_use_agent",
        model_client=model_client,
        tools=[_pass_function, _throw_function],
        tool_call_summary_format_fct=conditional_string_templates,
    )
    result = await agent.run(task="task")

    first_tool_call_summary = next((x for x in result.messages if isinstance(x, ToolCallSummaryMessage)), None)
    if first_tool_call_summary is None:
        raise AssertionError("Expected a ToolCallSummaryMessage but found none.")

    assert (
        first_tool_call_summary.content
        == 'SUCCESS: _pass_function with {"input": "task"}\nFAILURE: Helpful debugging information what went wrong.'
    )
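
One consequence of the proposed implementation is worth spelling out: `.format(...)` is applied to whatever the callable returns, so literal braces in a returned template have to be doubled, while braces inside substituted values (such as JSON in `arguments`) are left alone. A small sketch in plain Python, with hypothetical template strings:

```python
# Literal braces in the template are escaped by doubling them.
escaped = 'SUCCESS: {tool_name} returned {{"status": "ok"}}'
formatted = escaped.format(tool_name="foo", arguments="{}", result="ignored", is_error=False)

# Braces inside a substituted value are not re-parsed by str.format.
with_args = "SUCCESS: {tool_name} with {arguments}".format(
    tool_name="foo", arguments='{"input": "task"}'
)

# An unescaped literal brace is treated as a replacement field and fails.
unescaped = 'SUCCESS: {tool_name} returned {"status": "ok"}'
try:
    unescaped.format(tool_name="foo")
    raised = False
except (KeyError, ValueError, IndexError):
    raised = True

print(formatted)
print(with_args)
print(raised)
```

A template function that wants to embed literal JSON in its output would therefore need to escape it, or return a string with no placeholders and already-doubled braces.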

Let me know if this is considered helpful; if so, I can provide the PR. @ekzhu

Which packages was the bug in?

Python AgentChat (autogen-agentchat>=0.4.0)

AutoGen library version.

Python dev (main branch)
