
docs: add multi tool agent failure modes troubleshooting note #15

Merged: sunnynexus merged 2 commits into RUC-NLPIR:main from onestardao:main (Mar 9, 2026)


@onestardao
Contributor

Hi,

This PR adds the small docs-only troubleshooting note we discussed in #14.

Summary

  • Adds docs/multi_tool_agent_failure_modes.md, a short troubleshooting page focused on multi-tool agent failure modes when running ToolBench, API-Bank, ToolHop, GAIA, HLE and other benchmarks.
  • Links the new page from the README under a short “Troubleshooting multi tool failures” section near the evaluation flags.

The note is intentionally compact and DeepAgent-specific:

  • Starts with a one-screen quick checklist to run when the agent looks stuck or keeps calling strange tools.
  • Breaks down typical failure patterns (wrong tool, argument mismatch, environment/config mismatch, tool outputs ignored, etc.) into “symptom → likely cause → minimal checks”.
  • Keeps all examples aligned with the existing configuration flags and scripts in this repo.

Scope

  • Docs only, no code or config changes.
  • Does not affect any training or evaluation scripts.

Testing

  • Rendered both the README and docs/multi_tool_agent_failure_modes.md in GitHub’s preview to check formatting and links.

Closes #14.

Thanks for reviewing!

Point users to the new multi tool failure modes checklist from the main README.
@sunnynexus sunnynexus merged commit 93c44b5 into RUC-NLPIR:main Mar 9, 2026
@sunnynexus
Member

Thank you for your effort! We have merged this commit.


Linked issue (closed by this pull request): Proposal: add a small “multi-tool agent failure modes” troubleshooting note (docs only)
