
feat: add async evaluation and run artifacts#4

Merged
MukundaKatta merged 1 commit into main from codex/agentbench-async-and-artifacts
Apr 20, 2026

@MukundaKatta
Owner

Summary:

  • add async evaluation support for agent callables without breaking sync usage
  • add async agent comparison support
  • define a structured run artifact bundle with a stable run id and leaderboard entry
  • export benchmark bundles and leaderboard summaries to disk
  • document async usage and artifact export in the README
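For readers without the diff open, the first four bullets can be sketched roughly as follows. This is a minimal illustration, not the code in src/agentbench/core.py: every name here (`call_agent`, `evaluate_async`, `make_run_id`, `build_bundle`, `export_bundle`), the exact-match scoring, and the bundle fields are assumptions made for the example.

```python
import asyncio
import hashlib
import inspect
import json
import tempfile
from pathlib import Path


async def call_agent(agent, prompt):
    """Call a sync or async agent callable and return its answer."""
    result = agent(prompt)
    # Async callables return an awaitable; sync ones return the value directly.
    if inspect.isawaitable(result):
        result = await result
    return result


async def evaluate_async(agent, tasks):
    """Score an agent on (prompt, expected) pairs by exact match (assumed metric)."""
    answers = await asyncio.gather(*(call_agent(agent, p) for p, _ in tasks))
    passed = sum(a == expected for a, (_, expected) in zip(answers, tasks))
    return {"passed": passed, "total": len(tasks), "score": passed / len(tasks)}


def make_run_id(agent_name, tasks):
    """Derive a stable run id: the same agent name and tasks give the same id."""
    payload = json.dumps({"agent": agent_name, "tasks": tasks}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def build_bundle(agent_name, tasks, result):
    """Assemble a hypothetical run artifact bundle with a leaderboard entry."""
    return {
        "run_id": make_run_id(agent_name, tasks),
        "agent": agent_name,
        "result": result,
        "leaderboard_entry": {"agent": agent_name, "score": result["score"]},
    }


def export_bundle(bundle, out_dir):
    """Write the bundle to <out_dir>/<run_id>.json and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{bundle['run_id']}.json"
    path.write_text(json.dumps(bundle, indent=2, sort_keys=True), encoding="utf-8")
    return path


def sync_agent(prompt):
    """Toy sync agent used only for this sketch."""
    return prompt.upper()


async def async_agent(prompt):
    """Toy async agent used only for this sketch."""
    await asyncio.sleep(0)
    return prompt.upper()


TASKS = [("a", "A"), ("b", "B"), ("c", "x")]

if __name__ == "__main__":
    result = asyncio.run(evaluate_async(async_agent, TASKS))
    bundle = build_bundle("upper-agent", TASKS, result)
    with tempfile.TemporaryDirectory() as out_dir:
        print(export_bundle(bundle, out_dir).name)
```

Checking the call result with `inspect.isawaitable` is one way to keep sync usage unchanged: a plain function passes through the same path as a coroutine function. Hashing the run inputs is one way to get a stable run id; the merged code may derive it differently.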

Closes #2
Closes #3

Testing:

  • python3 -m py_compile src/agentbench/core.py src/agentbench/__init__.py tests/test_core.py
  • PYTHONPATH=/tmp/AgentBench-fix/src python3 -m pytest /tmp/AgentBench-fix/tests

@MukundaKatta MukundaKatta merged commit 37fde6c into main Apr 20, 2026
0 of 3 checks passed


Development

Successfully merging this pull request may close these issues.

  • Add result dataset and leaderboard artifact model for benchmark runs
  • Add async agent evaluation support
