SWE-bench

Organization for maintaining the SWE-bench/agent projects

This organization contains the source code for SWE-bench, a benchmark for evaluating AI systems on real-world GitHub issues.

Use the repositories in this organization to...

Also check out these related organizations:

  • SWE-bench-repos: Mirror clones of repositories used for SWE-bench-style evaluations.
  • SWE-agent: Solve GitHub issue(s) automatically with a Language Model powered agent!

Pinned

  1. SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?

    Python 2.8k 469

  2. experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell 163 167

  3. sb-cli Public

    Run SWE-bench evaluations remotely

    Python 9

  4. swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML 4 5

Repositories

  • swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML 4 5 1 1 Updated Apr 1, 2025
  • experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell 163 165 5 9 Updated Apr 1, 2025
  • SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?

    Python 2,752 MIT 469 33 6 Updated Mar 29, 2025
  • sb-cli Public

    Run SWE-bench evaluations remotely

    Python 9 MIT 0 3 0 Updated Mar 8, 2025
  • .github Public
    0 0 0 0 Updated Feb 26, 2025
  • humanevalfix-results Public

    Evaluation data + results for SWE-agent inference on HumanEvalFix task

    Jupyter Notebook 0 0 0 0 Updated Jul 11, 2024