Devin (Cognition)

active

Pioneered the async autonomous coding agent category. $10.2B valuation, ~$900M total funding, $150M+ ARR (incl. Windsurf). Enterprise customers: Goldman Sachs, Santander, Nubank. Independent eval (Answer.AI): 15% success on complex tasks.

Score 41

Where it wins

$10.2B valuation, ~$900M total funding — highest-funded in category

$150M+ combined ARR (incl. Windsurf acquisition)

Enterprise customers: Goldman Sachs, Santander, Nubank

67% PR merge rate on well-defined tasks

530 HN pts on launch + 502 pts on Windsurf acquisition — high engagement

Where to be skeptical

No public benchmark scores — red flag for transparency

Answer.AI independent eval: 15% success rate (3/20 tasks) — only rigorous independent test

HN sentiment predominantly skeptical ('overhyped,' 'demo-driven')

Windsurf acquisition caused customer exodus (pricing changes, trust erosion)

ACU pricing model is opaque and potentially expensive

96% price cut ($500→$20) signals massive competitive pressure

Editorial verdict

Highest-funded pure-play, but the gap between the self-reported result (67% merge rate on well-defined tasks) and the independent result (15% success on complex tasks) is the defining data point. Business metrics are strong; product evidence on complex tasks is weak.


Videos

Reviews, tutorials, and comparisons from the community.

Devin AI Explained for Beginners (AI Coding Assistant for Software Engineers)

The Cutting Edge School·2025-07-29

Devin 2.0: First-Ever AI Software Engineer IS TRULY INSANE! (Devin IDE, CLI Coder, & More!)

WorldofAI·2025-08-11

I Tried Devin (AI Software Engineer) — Full Review vs Cursor & Copilot

Cloud Champ·2025-07-02

Software Factories

#10 of 18

Teams wanting a fully sandboxed autonomous agent — Windsurf acquisition could change the picture

Claude Code

Anthropic's official agentic coding CLI. v2.1.81 (Mar 20) shipped `--bare`, smarter worktree resume, and improved MCP OAuth while the repo crossed 82,204 stars and logged ~14 commits/week across 10+ maintainers. Terminal-native and tool-use-driven, with deep file system and shell access. #1 on SWE-bench Pro standardized (45.89%), ~4% of GitHub public commits (SemiAnalysis), $2.5B annualized revenue, 8M+ weekly npm downloads. Opus 4.6 with 1M context.

OpenHands

Category leader in multi-agent orchestration — 69,352 stars (verified), $18.8M Series A, AMD hardware partnership, 455 contributors, 1M downloads/month PyPI (3.4M all-time). SWE-Bench Verified 72% with Claude 4.5 Extended Thinking (updated 2026-03-19), Multi-SWE-Bench #1 across 8 languages. Gap to #2 is enormous on every axis.

Gemini CLI

Google's open-source terminal agent with Gemini 3 models, 1M token context, built-in Google Search grounding, and the best free tier in the category (60 req/min, 1K req/day). v0.35.0 (Mar 24) shipped keybinding, policy, and telemetry fixes while the repo hit 98,957 stars and 12,593 forks. Terminal-Bench 2.0: 78.4% (#1). SWE-bench Pro standardized 43.30% (#3). Plan Mode added March 2026. First-pass correctness ~50-60% (Educative.io).

Codex CLI

OpenAI's open-source coding agent built in Rust. Terminal-Bench 77.3% (#2), SWE-bench Pro standardized 41.04% (GPT-5.2-Codex). GPT-5.4 shipped March 5, 2026. Codex Security agent adds appsec capabilities. 3-4x more token-efficient than Claude Code, 240+ tokens/sec. Free with ChatGPT subscription, sandbox-first execution. 1M+ first-month users. Cleanest security record in Tier 1 — no documented incidents.

Public evidence

strong · 2025-01

Answer.AI: Devin achieves ~15% success on complex real-world tasks

14 failures, 3 successes, 3 inconclusive across 20 tasks. 'Tasks it can do are so small and well-defined that I may as well do them myself.'

Named researchers, detailed methodology · Answer.AI (independent research)
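The 15% headline follows directly from the tally quoted in this entry. A minimal sketch of the arithmetic, using only the counts above; the variable names are illustrative, and the second rate (which drops the inconclusive tasks from the denominator) is an alternative reading, not a figure Answer.AI reports:

```python
# Counts quoted in the Answer.AI entry above.
successes, failures, inconclusive = 3, 14, 3

total = successes + failures + inconclusive           # 20 tasks
rate_all = successes / total * 100                     # headline rate over all tasks
# Alternative denominator (an assumption for illustration): conclusive tasks only.
rate_conclusive = successes / (successes + failures) * 100

print(f"All tasks: {rate_all:.1f}% | conclusive only: {rate_conclusive:.1f}%")
```

Either way the success rate stays below one in five, so the choice of denominator does not change the picture.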

strong · 2025-09

Cognition acquires Windsurf, raises to $10.2B valuation

Highest-valued player in the autonomous coding category. Windsurf acquisition adds IDE and 350+ enterprise customers.

Financial press · CNBC (independent)

moderate · 2026

Devin ARR estimated at $150M+ — Sacra

$150M+ estimated ARR including Windsurf's $82M ARR. Goldman Sachs piloting 'hundreds to thousands of Devins.'

Market research estimate · Sacra (independent market research)

moderate · Self-reported · 2026

67% merge rate on defined tasks — self-reported

67% PR merge rate on well-defined tasks. Contrasts sharply with Answer.AI's 15% success on complex tasks. The two figures measure different task classes, and the gap suggests capability limited to narrow, well-scoped work.

Product page claim · Cognition (self-reported)

strong · 2026-01

Devin price slash: $500/mo → $20/mo — 96% cut signals competitive pressure

Core $20/mo (~9 ACUs at $2.25/ACU), Teams $500/mo (250 ACUs at $2.00/ACU). The old $500/mo entry price was untenable against Claude Code and Cursor. Largest price cut in the category.

Tier-1 coverage · VentureBeat (independent)
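The ACU counts and the 96% figure in this entry can be checked with simple arithmetic. A rough sketch, assuming a plan's ACU allowance is just its monthly price divided by the per-ACU rate; the function names are hypothetical and not part of any Cognition API:

```python
def acu_count(monthly_price: float, price_per_acu: float) -> float:
    """ACUs covered by a monthly plan price at a given per-ACU rate."""
    return monthly_price / price_per_acu


def percent_cut(old_price: float, new_price: float) -> float:
    """Percentage reduction from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


# Prices and per-ACU rates as quoted in the entry above.
core_acus = acu_count(20, 2.25)     # just under 9, which the entry rounds to ~9
teams_acus = acu_count(500, 2.00)   # 250
entry_cut = percent_cut(500, 20)    # drop in entry price, $500 -> $20

print(f"Core: {core_acus:.1f} ACUs | Teams: {teams_acus:.0f} ACUs | cut: {entry_cut:.0f}%")
```

Note that the 96% applies to the entry price only: Core costs more per ACU than Teams ($2.25 vs. $2.00), so heavier usage does not get cheaper on the $20 plan.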
