Overview

SWE-bench Lite provides a smaller, carefully selected subset of 300 tasks from the full benchmark, designed to:

  • Reduce evaluation costs while maintaining benchmark quality
  • Enable faster iteration cycles for model development
  • Provide a more accessible entry point for research groups

The 300 tasks were selected to preserve the distribution and difficulty spectrum of the original benchmark while focusing on more self-contained, functional bug fixes.

While the full SWE-bench test split comprises 2,294 issue-commit pairs across 12 Python repositories, SWE-bench Lite covers 11 of the original 12 repositories with a similar diversity and distribution of tasks. We also provide 23 development instances that can be useful for active development on the SWE-bench task.

We recommend that future systems evaluating on SWE-bench report numbers on SWE-bench Lite in lieu of the full SWE-bench set when compute efficiency is a concern.

Selection Criteria

SWE-bench Lite instances were selected using the following criteria:

  • Removed instances whose problem statements contain images, external hyperlinks, references to specific commit SHAs, or references to other pull requests or issues
  • Removed instances with fewer than 40 words in the problem statement
  • Removed instances whose gold patch edits more than one file
  • Removed instances whose gold patch has more than 3 edit hunks
  • Removed instances whose gold patch creates or removes files
  • Removed instances whose tests include error-message checks
  • Finally, sampled 300 test instances and 23 development instances from the remaining candidates

The source code used to create SWE-bench Lite is available at SWE-bench/swebench/collect/make_lite.
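For illustration, most of the deterministic filters above can be approximated with simple heuristics over the instance fields. The sketch below is not the actual make_lite implementation: the problem_statement and patch field names follow the SWE-bench instance schema, the regex is a simplifying assumption, and the test-based filter is omitted since it requires inspecting the test patch.

    import re

    def is_lite_candidate(instance: dict) -> bool:
        """Approximate the SWE-bench Lite filters (illustrative, not make_lite)."""
        problem = instance["problem_statement"]
        patch = instance["patch"]  # gold patch, unified diff format

        # Drop problem statements with images, hyperlinks, full-length
        # commit SHAs, or references to other PRs/issues (simplified regex).
        noise = re.compile(r"https?://\S+|!\[.*?\]|\b[0-9a-f]{40}\b|#\d+")
        if noise.search(problem):
            return False

        # Require at least 40 words in the problem statement.
        if len(problem.split()) < 40:
            return False

        # The gold patch must edit a single file ...
        if patch.count("diff --git") > 1:
            return False

        # ... in at most 3 hunks ...
        if sum(1 for ln in patch.splitlines() if ln.startswith("@@")) > 3:
            return False

        # ... and must not create or remove files.
        if "new file mode" in patch or "deleted file mode" in patch:
            return False

        return True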

Repository Distribution

SWE-bench Lite distribution across repositories. Compare to the full SWE-bench in Figure 3 of the SWE-bench paper.

[Figure: SWE-bench Lite repository distribution]

Baseline Performance

SWE-bench Lite performance for our baselines. Compare to the full SWE-bench baseline performance in Table 5 of the SWE-bench paper.

[Figure: SWE-bench Lite baseline performance comparison]

Resources

SWE-bench Lite datasets:

  • https://huggingface.co/datasets/princeton-nlp/SWE-bench_Lite
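The splits can be loaded with the Hugging Face datasets library. A minimal loading sketch, assuming the dataset ID princeton-nlp/SWE-bench_Lite with its published dev and test splits:

    from datasets import load_dataset

    # 300-instance test split and 23-instance dev split.
    lite_test = load_dataset("princeton-nlp/SWE-bench_Lite", split="test")
    lite_dev = load_dataset("princeton-nlp/SWE-bench_Lite", split="dev")

    print(len(lite_test), len(lite_dev))  # expected: 300 23
    print(lite_test[0]["problem_statement"][:200])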

Citation

If you use SWE-bench in your research, please cite our paper:

@inproceedings{jimenez2024swebench,
    title={{SWE}-bench: Can Language Models Resolve Real-world Github Issues?},
    author={Carlos E Jimenez and John Yang and Alexander Wettig and Shunyu Yao and Kexin Pei and Ofir Press and Karthik R Narasimhan},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=VTF8yNQM66}
}