SWE-bench-Live/submission

Submissions & Experiments

This repo hosts model results, trajectories, and evaluation logs on SWE-bench-Live. We coordinate result submissions via Pull Requests.

Trajectories & Logs

We provide the trajectories from the experiments conducted in the paper; see this link. For trajectories and logs submitted by third parties, refer to the corresponding submission directory in this repository. These files are temporarily hosted directly on GitHub, so we recommend using sparse checkout to check out only the directories you need.
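As a rough illustration of the sparse-checkout recommendation, the sketch below builds the two git invocations for a shallow, blobless, sparse clone that is then narrowed to selected directories. The helper name `sparse_clone_commands`, the placeholder repository URL, and the example directory path are assumptions made for illustration, not part of this repository.

```python
import subprocess
from typing import List


def sparse_clone_commands(repo_url: str, dest: str, directories: List[str]) -> List[List[str]]:
    """Build the git commands for a shallow, blobless, sparse clone.

    Hypothetical helper: it only assembles argument lists and does not
    touch the network until the commands are passed to run_commands().
    """
    # Clone without file contents and with sparse checkout enabled.
    clone = ["git", "clone", "--depth", "1", "--filter=blob:none", "--sparse", repo_url, dest]
    # Restrict the working tree to the requested directories.
    select = ["git", "-C", dest, "sparse-checkout", "set", *directories]
    return [clone, select]


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder URL and directory; substitute your own fork and subset.
    commands = sparse_clone_commands(
        "https://github.com/<your-user>/submissions.git",
        "submissions",
        ["submissions/<subset>/20250501-sweagent-claude37"],
    )
    for argv in commands:
        print(" ".join(argv))
```

Printing the commands first makes it easy to review them before calling `run_commands(commands)` to execute them.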

Submission Instructions

Thank you for your interest in submitting results to SWE-bench-Live. We are currently following the submission process outlined below.

  1. Fork the repository and clone your fork. Consider using git clone --depth 1 --filter=blob:none --sparse to speed up the process.

  2. In the folder corresponding to your evaluated subset (submissions/{subset}), create a new folder named in the format YYYYMMDD-{YOUR_METHOD_NAME}, e.g. 20250501-sweagent-claude37.

  3. Save your predictions as preds.json, which should include the patch for each instance. Save the evaluation report generated by the SWE-bench-Live evaluation script as results.json.

  4. Optionally, create a logs folder to store logs from the evaluation process, and a trajs folder to store reasoning trajectories that reflect how your system solved the problems.

  5. Create a README to explain the agent scaffold you used and the experimental setting, including the number of rollouts, how results were sampled, the number of iterations, and other relevant details.

  6. Create a pull request to the SWE-bench-Live/submissions repository with the new submission folder.
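The layout described in the steps above can be checked locally before opening a pull request. The following is a minimal sketch, not an official validator: the function name `check_submission` is hypothetical, the README is assumed to be any file whose name starts with "README", and the contents of preds.json and results.json are only checked for being valid JSON, since their exact schema is not specified here.

```python
import json
import re
from datetime import datetime
from pathlib import Path
from typing import List

# Folder names are expected to look like YYYYMMDD-{YOUR_METHOD_NAME}.
NAME_PATTERN = re.compile(r"^(\d{8})-(.+)$")


def check_submission(folder: Path) -> List[str]:
    """Return a list of layout problems; an empty list means none were found."""
    if not folder.is_dir():
        return [f"{folder} is not a directory"]

    problems: List[str] = []

    # Step 2: folder name with a valid date prefix.
    match = NAME_PATTERN.match(folder.name)
    if match is None:
        problems.append("folder name does not match YYYYMMDD-{YOUR_METHOD_NAME}")
    else:
        try:
            datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            problems.append("folder name prefix is not a valid date")

    # Step 3: required JSON files (schema is not checked, only parseability).
    for required in ("preds.json", "results.json"):
        path = folder / required
        if not path.is_file():
            problems.append(f"missing {required}")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{required} is not valid JSON: {exc}")

    # Step 5: a README in any format (assumption: name starts with README).
    has_readme = any(
        p.is_file() and p.name.upper().startswith("README") for p in folder.iterdir()
    )
    if not has_readme:
        problems.append("missing README")

    # Step 4: logs/ and trajs/ are optional, so they are not checked.
    return problems
```

For example, `check_submission(Path("submissions/<subset>/20250501-sweagent-claude37"))` would return the list of issues to fix, where `<subset>` stands for the evaluated subset.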

Contact

For any issues encountered during the submission process, please open an issue in the repository.
