skillpack.co
Devin (Cognition)

active

Pioneered the async autonomous coding agent category. $10.2B valuation, ~$900M total funding, $150M+ ARR (incl. Windsurf). Enterprise customers: Goldman Sachs, Santander, Nubank. Independent eval (Answer.AI): 15% success on complex tasks.

Score 41
Shipped last month

Devin 2.2 enters coding-clis at #9

SaaS autonomous agent with 67% PR merge rate (up from 34% YoY), GUI computer use, and Devin Review self-QA. Different workflow (web dashboard + Slack) but addresses the same problem space. · 2026-03-25

Where it wins

$10.2B valuation, ~$900M total funding — highest-funded in category

$150M+ combined ARR (incl. Windsurf acquisition)

Enterprise customers: Goldman Sachs, Santander, Nubank

67% PR merge rate on well-defined tasks

530 HN pts on launch + 502 pts on Windsurf acquisition — high engagement

Where to be skeptical

No public benchmark scores — red flag for transparency

Answer.AI independent eval: 15% success rate (3/20 tasks) — only rigorous independent test

HN sentiment predominantly skeptical ('overhyped,' 'demo-driven')

Windsurf acquisition caused customer exodus (pricing changes, trust erosion)

ACU pricing model is opaque and potentially expensive

96% price cut ($500 to $20) signals massive competitive pressure

Editorial verdict

Highest-funded pure-play, but the gap between the self-reported 67% merge rate on well-defined tasks and the 15% success rate in Answer.AI's independent eval on complex tasks is the defining data point. Business metrics are strong; product evidence on complex tasks is weak.

Videos

Reviews, tutorials, and comparisons from the community.

Devin AI Explained for Beginners (AI Coding Assistant for Software Engineers)

The Cutting Edge School·2025-07-29

Devin 2.0: First-Ever AI Software Engineer IS TRULY INSANE! (Devin IDE, CLI Coder, & More!)

WorldofAI·2025-08-11

I Tried Devin (AI Software Engineer) — Full Review vs Cursor & Copilot

Cloud Champ·2025-07-02

Related

Claude Code

98

Anthropic's official agentic coding CLI. v2.1.81 (Mar 20) shipped `--bare`, smarter worktree resume, and improved MCP OAuth while the repo crossed 82,204 stars and logged ~14 commits/week across 10+ maintainers. Terminal-native and tool-use-driven, with deep file system and shell access. #1 on SWE-bench Pro standardized (45.89%); ~4% of GitHub public commits (SemiAnalysis); $2.5B annualized revenue; 8M+ npm weekly downloads. Opus 4.6 with 1M context.

OpenHands

88

Category leader in multi-agent orchestration — 69,352 stars (verified), $18.8M Series A, AMD hardware partnership, 455 contributors, 1M downloads/month PyPI (3.4M all-time). SWE-Bench Verified 72% with Claude 4.5 Extended Thinking (updated 2026-03-19), Multi-SWE-Bench #1 across 8 languages. Gap to #2 is enormous on every axis.

Gemini CLI

88

Google's open-source terminal agent with Gemini 3 models, 1M token context, built-in Google Search grounding, and the best free tier in the category (60 req/min, 1K req/day). v0.35.0 (Mar 24) shipped keybinding, policy, and telemetry fixes while the repo hit 98,957 stars and 12,593 forks. Terminal-Bench 2.0: 78.4% (#1). SWE-bench Pro standardized 43.30% (#3). Plan Mode added March 2026. First-pass correctness ~50-60% (Educative.io).

Codex CLI

87

OpenAI's open-source coding agent built in Rust. Terminal-Bench 77.3% (#2), SWE-bench Pro standardized 41.04% (GPT-5.2-Codex). GPT-5.4 shipped March 5, 2026. Codex Security agent adds appsec capabilities. 3-4x more token-efficient than Claude Code, 240+ tokens/sec. Free with ChatGPT subscription, sandbox-first execution. 1M+ first-month users. Cleanest security record in Tier 1 — no documented incidents.

Public evidence

moderate · 2026
Devin ARR estimated at $150M+ — Sacra

$150M+ estimated ARR including Windsurf's $82M ARR. Goldman Sachs piloting 'hundreds to thousands of Devins.'

Market research estimate · Sacra (independent market research)

moderate · Self-reported · 2026
67% merge rate on defined tasks — self-reported

67% PR merge rate on well-defined tasks. Contradicted by Answer.AI's 15% on complex tasks. Gap suggests capability limited to narrow, well-scoped work.

Product page claim · Cognition (self-reported)
