SWE-agent

stale

Princeton NLP's benchmark-driven software engineering agent. Academic origin, strong SWE-bench Verified results (79.2%). Last release v1.1.0 was 2025-05-22 — 10 months stale as of 2026-03-19. Best treated as a research reference, not a production tool.

Score 79stale

Where it wins

Benchmark-native story

Clear issue-solving shape

Strong technical credibility

Where to be skeptical

Narrower than OpenHands for broad factory workflows

Less about continuous loops than Ralph

Editorial verdict

Research/academic reference only. Princeton pedigree and 79.2% SWE-bench Verified on Opus 4.5 scaffold give it strong benchmark credibility. But no release in 10 months (last: v1.1.0, 2025-05-22) puts it outside the production cadence of all active tools. Use as a benchmark scaffold reference, not as a production coding CLI.

Source

GitHub: SWE-agent/SWE-agent

Docs: swe-agent.com

Found via SkillPack? ★ Star us on GitHub

Coding CLIs / Code Agents

#14of 22

Benchmark research, academic reference, issue-level repair evaluation

Teams of Agents / Multi-Agent Orchestration

#09of 23

Issue-level repair with strong academic benchmark credibility

Claude Code

Anthropic's official agentic coding CLI. v2.1.81 (Mar 20) shipped `--bare`, smarter worktree resume, and improved MCP OAuth while the repo crossed 82,204 stars and logged ~14 commits/week across 10+ maintainers. Terminal-native, tool-use-driven, with deep file system + shell access, #1 SWE-bench Pro standardized (45.89%), ~4% of GitHub public commits (SemiAnalysis), $2.5B annualized revenue. 8M+ npm weekly downloads. Opus 4.6 with 1M context.

LangGraph

#1 Python agent framework by production evidence — 40.2M PyPI downloads/month, Fortune 500 deployments (LinkedIn, Uber, Replit, Elastic, Klarna, Cloudflare, Coinbase), ~400 LangGraph Platform companies, LangSmith rated best-in-class observability. Stable v1.x API, model-agnostic, MCP support.

Pydantic AI

#3 Python agent framework by downloads — 15.6M PyPI/month. Built by the Pydantic team. Runtime type enforcement is a genuine differentiator no other framework offers. V1 shipped with Temporal integration for durable execution and Logfire observability. Emerging pattern: 'Pydantic AI for agent logic, LangGraph for orchestration' (ZenML).

AutoGen (Microsoft)

⚠️ MAINTENANCE MODE — Microsoft officially confirmed bug fixes and security patches only, no new features (VentureBeat 2026-02-19). 55.9K stars but only 1.57M PyPI/month — DL/star ratio of 28, the most inflated among active frameworks. Being replaced by Microsoft Agent Framework (AutoGen + Semantic Kernel merge, GA targeted ~Q2 2026). Teams on AutoGen should plan migration.

Public evidence

strong2026-03

Live-SWE-agent: Claude Opus 4.5 scores 79.2% on SWE-bench Verified — leads all open-source scaffolds

SWE-agent scaffold achieves 79.2% on SWE-bench Verified with Opus 4.5, leading all open-source scaffolds. Also created mini-swe-agent (100 lines, >74% SWE-bench).

Public leaderboard, NeurIPS 2024 paperPrinceton University researchers (John Yang, Carlos Jimenez, Kilian Lieret, Ofir Press)

moderateSelf-reported2026-02

mini-swe-agent: 100-line agent scores >74% SWE-bench Verified

Radically simplified version — 100 lines, no huge configs. Matches full SWE-agent performance. Shows the scaffold can be minimal.

GitHub repo, successor to full SWE-agentSWE-agent team (Princeton)

Raw GitHub source

GitHub README peek

Constrained peek so you can sanity-check the source material without leaving the site.

[!warning] Most of our current development effort is on mini-swe-agent, which has superseded SWE-agent. It matches the performance performance of SWE-agent, while being much simpler. See the FAQ for more details about the differences. Our general recommendation is to use mini-SWE-agent instead of SWE-agent going forward.

SWE-agent enables your language model of choice (e.g. GPT-4o or Claude Sonnet 4) to autonomously use tools to fix issues in real GitHub repositories, find cybersecurity vulnerabilities, or perform any custom task.

✅ State of the art on SWE-bench among open-source projects
✅ Free-flowing & generalizable: Leaves maximal agency to the LM
✅ Configurable & fully documented: Governed by a single yaml file
✅ Made for research: Simple & hackable by design

SWE-agent is built and maintained by researchers from Princeton University and Stanford University.

📣 News

July 24: Mini-SWE-Agent achieves 65% on SWE-bench verified in 100 lines of python!
May 2: SWE-agent-LM-32b achieves open-weights SOTA on SWE-bench
Feb 28: SWE-agent 1.0 + Claude 3.7 is SoTA on SWE-Bench full
Feb 25: SWE-agent 1.0 + Claude 3.7 is SoTA on SWE-bench verified
Feb 13: Releasing SWE-agent 1.0: SoTA on SWE-bench light & tons of new features
Dec 7: An interview with the SWE-agent & SWE-bench team

🚀 Get started!

Read our documentation to learn more:

Installation
Hello world from the command line
Benchmarking on SWE-bench
Frequently Asked Questions

SWE-agent for offensive cybersecurity (EnIGMA) <a name="enigma"></a>

SWE-agent: EnIGMA is a mode for solving offensive cybersecurity (capture the flag) challenges. EnIGMA achieves state-of-the-art results on multiple cybersecurity benchmarks (see leaderboard). Please use SWE-agent 0.7 while we update EnIGMA for 1.0.

In addition, you might be interested in our other projects:

Contributions <a name="contributions"></a>

If you'd like to contribute to the codebase, we welcome issues and pull requests! For larger code changes, we always encourage discussion in issues first.

Citation & contact <a name="citation"></a>

SWE-agent is an academic project started at Princeton University by John Yang*, Carlos E. Jimenez*, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press. Contact person: John Yang, Carlos E. Jimenez, and Kilian Lieret (Email: johnby@stanford.edu, carlosej@cs.princeton.edu, kl5675@princeton.edu).

If you found this work helpful, please consider citing it using the following:

<details> <summary> SWE-agent citation</summary>

@inproceedings{yang2024sweagent,
  title={{SWE}-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
  author={John Yang and Carlos E Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik R Narasimhan and Ofir Press},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://arxiv.org/abs/2405.15793}
}

</details>

If you used the summarizer, interactive commands or the offensive cybersecurity capabilities in SWE-agent, please also consider citing:

<details> <summary>EnIGMA citation</summary>

@misc{abramovich2024enigmaenhancedinteractivegenerative,
      title={EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges},
      author={Talor Abramovich and Meet Udeshi and Minghao Shao and Kilian Lieret and Haoran Xi and Kimberly Milner and Sofija Jancheska and John Yang and Carlos E. Jimenez and Farshad Khorrami and Prashanth Krishnamurthy and Brendan Dolan-Gavitt and Muhammad Shafique and Karthik Narasimhan and Ramesh Karri and Ofir Press},
      year={2024},
      eprint={2409.16165},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2409.16165},
}

</details>

🪪 License <a name="license"></a>

MIT. Check LICENSE.

View on GitHub →