Chrome DevTools MCP wins on Chrome debugging (Web Vitals, CPU emulation, performance traces) and HN engagement (585 vs 189 pts), and has slightly more stars (29.7K vs 29K). Playwright MCP wins on cross-browser coverage (Firefox, WebKit, Edge), raw downloads (1.38M vs 423K), and battle-tested maturity. The two are complementary: use DevTools for debugging, Playwright for cross-browser testing.
Web Browsing / Browser Automation
The category has split into four lanes: full-autonomy agents (Browser Use, Skyvern), MCP/CLI tools for coding agents (Chrome DevTools MCP, Playwright MCP, Vercel Agent Browser), frameworks/SDKs for building pipelines (Browser Use, Stagehand), and consumer agentic browsers (BrowserOS). Chrome DevTools MCP is the current Lane 2 leader after a 598-point HN thread (Mar 15, 2026). Browser Use hits 1M+ weekly PyPI downloads — uncontested in Lane 3.
9 ranked, 8 signals
Verdict
Browser Use is the unchallenged #1 for full autonomous browser agents — 81K stars, 990K weekly PyPI downloads, 89.1% WebVoyager (Steel.dev), SOC 2 Type II, own cost-optimized model (BU-30B). The only mature option when the LLM needs complete control over unpredictable web workflows.
Chrome DevTools MCP is the #1 for coding-agent MCP workflows — 30K+ stars (more than Playwright MCP), Google Chrome team backing, 598-point HN thread (highest in category, Mar 15 2026, 231 comments), CyberAgent production case study across 236 Storybook stories. Standalone CLI mode in v0.20.0. 1.2K/wk skills.sh installs.
Playwright MCP remains the cross-browser standard — 1.38M weekly npm downloads (highest in category), Microsoft backing, but token bloat (4.2x vs CLI) is well-documented. Microsoft now recommends CLI for coding agents, MCP for sandboxed environments. CLI-over-MCP is emerging consensus.
Vercel Agent Browser is the token efficiency and production download leader in Lane 2 — 23.4K stars, 109.7K weekly npm downloads (highest download velocity in Lane 2 by order of magnitude), 82-93% context reduction independently verified. Best for Vercel AI SDK workflows.
The deeper read
The category has split into four functional lanes: (1) Full autonomous agents — LLM controls the entire browser loop (Browser Use, Skyvern). (2) Surgical MCP/CLI tools — agent invokes specific browser actions through structured tool calls (Chrome DevTools MCP, Playwright MCP, Vercel Agent Browser). (3) Frameworks/SDKs for building browser agent pipelines (Browser Use for Python, Stagehand for TypeScript, Skyvern for hosted). (4) Consumer agentic browsers — Chromium forks with built-in AI (BrowserOS).
Token efficiency is the new battleground. Raw capability is table stakes — what matters now is how many tokens a tool burns per action. Vercel Agent Browser's 93% context reduction signals where the category is headed.
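The reduction percentages and the "Nx" multipliers quoted for these tools describe the same quantity in two forms. A minimal sketch of the conversion, assuming "context reduction" means the fraction of tokens removed relative to a baseline run (the function names are illustrative and not part of any tool's API):

```python
def reduction_to_factor(reduction: float) -> float:
    """Convert a fractional token reduction (0.82 means 82%) into an 'N times fewer tokens' factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


def factor_to_reduction(factor: float) -> float:
    """Convert an 'N times fewer tokens' factor back into a fractional reduction."""
    if factor < 1.0:
        raise ValueError("factor must be at least 1")
    return 1.0 - 1.0 / factor


# Figures quoted for Vercel Agent Browser under Public signals: 82% and 16x.
print(f"82% reduction = {reduction_to_factor(0.82):.1f}x fewer tokens")  # 5.6x
print(f"16x fewer tokens = {factor_to_reduction(16):.2%} reduction")     # 93.75%
```

Under that assumption, an 82% reduction and a 5.6x factor are the same measurement, and a 16x factor corresponds to 93.75%, which lines up with the upper end of the 82-93% range.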
MCP is the interface standard. Every serious contender either ships an MCP server or is building one. Tools without MCP support are increasingly irrelevant to coding agent workflows.
Current ranking
Browser Use
Best for: Full autonomous web browsing where the LLM needs complete control over unpredictable workflows
81K stars, 990K weekly PyPI downloads. 89.1% WebVoyager (Steel.dev). SOC 2 Type II certified. Cloud MCP, Skills API, own cost-optimized model (BU-30B). 10 releases in 5 weeks, 97 contributors.
⚡ Python-only, which limits adoption in TypeScript/Node ecosystems. Production stability complaints persist. rtrvr.ai measured only 43.9% success in cloud mode (vs Skyvern 64.4%). No named enterprise case study despite SOC 2.
Chrome DevTools MCP
Best for: Coding agents that need browser access; deepest recent HN validation, official Google backing
30K+ stars (more than Playwright MCP at 29.2K). Google Chrome team official — guarantees CDP protocol alignment. 598-point HN thread (highest in category, Mar 15 2026, 231 comments). 1.2K/wk skills.sh installs. CyberAgent production case study across 236 Storybook stories. 26 tools including unique Web Vitals and CPU emulation. Standalone CLI mode in v0.20.0, weekly releases.
⚡ Chrome-only (no Firefox/Safari/Edge). Debugging-focused — complements rather than replaces general automation. 26 tools / 18K token schema overhead.
Playwright MCP
Best for: Cross-browser UI testing and structured automation in TypeScript/Node agent stacks
1.38M weekly npm downloads — highest in category. Microsoft backing. Cross-browser (Chrome, Firefox, WebKit, Edge). Auto-configured in GitHub Copilot Coding Agent.
⚡ Token bloat: 114K tokens/session vs 27K via CLI (4.2x). Better Stack: CLI completed all 7 steps, MCP hit TimeoutError. Microsoft now recommends CLI for coding agents, MCP for sandboxed environments. CLI-over-MCP is emerging consensus.
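The 4.2x figure follows directly from the two session totals quoted above. A small sketch of the arithmetic; the per-token price is purely a placeholder assumption, so substitute the actual rate of whatever model you run:

```python
MCP_TOKENS_PER_SESSION = 114_000  # Playwright MCP, figure quoted above
CLI_TOKENS_PER_SESSION = 27_000   # Playwright CLI, figure quoted above
ASSUMED_USD_PER_MILLION_TOKENS = 3.00  # hypothetical price, not from the source


def session_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of a session given its token count and a price per million tokens."""
    return tokens / 1_000_000 * usd_per_million


ratio = MCP_TOKENS_PER_SESSION / CLI_TOKENS_PER_SESSION
extra_cost = session_cost_usd(
    MCP_TOKENS_PER_SESSION - CLI_TOKENS_PER_SESSION,
    ASSUMED_USD_PER_MILLION_TOKENS,
)

print(f"MCP uses {ratio:.1f}x the tokens of the CLI")
print(f"Extra cost per session at the assumed rate: ${extra_cost:.2f}")
```

The ratio is independent of price; only the dollar figure depends on the assumed rate.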
Vercel Agent Browser
Best for: Token-efficient browser automation for coding agents and Vercel AI SDK workflows
23.4K stars. 109.7K weekly npm downloads — highest production download velocity in Lane 2 by order of magnitude. 82-93% context reduction independently verified. Rust core with sub-50ms boot. Snapshot + Refs architecture. Unambiguous #1 for teams already on Vercel AI SDK.
⚡ Young project — 2 months old. Chrome-only. No high-engagement HN thread — growth appears Vercel-ecosystem-driven. Narrower scope than Chrome DevTools MCP (optimized for Vercel AI SDK workflows, not arbitrary MCP clients).
Stagehand
Best for: Surgical AI-powered browser actions with deterministic control flow
21.6K stars, 2.46M monthly npm downloads (highest monthly volume in category). $67.5M funding (Browserbase). Auto-caching: once an action succeeds, selector cached and replayed without LLM calls — approaching $0 for repeated tasks. v3 is 44% faster with direct CDP. Cloudflare integration.
⚡ NxCode measured ~75% task completion vs Browser Use ~78%. TypeScript-only (no Python). Download anomaly (2.4M downloads vs 21K stars) is unexplained — could be CI/CD inflation.
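The auto-caching idea described above (resolve a selector with the LLM once, then replay it) can be illustrated with a toy cache. This is a hypothetical sketch of the general technique only: `SelectorCache`, `act`, and the callbacks are invented names and do not reflect Stagehand's actual API or internals.

```python
from typing import Callable, Dict


class SelectorCache:
    """Toy illustration: call the (expensive) LLM resolver only on a cache miss."""

    def __init__(self, resolve_with_llm: Callable[[str], str]) -> None:
        self._resolve_with_llm = resolve_with_llm  # stand-in for an LLM call
        self._selectors: Dict[str, str] = {}
        self.llm_calls = 0

    def act(self, instruction: str, perform: Callable[[str], bool]) -> bool:
        """Run an action, replaying a cached selector when one has worked before."""
        cached = self._selectors.get(instruction)
        if cached is not None:
            if perform(cached):
                return True
            # The cached selector no longer works (page changed): drop it.
            del self._selectors[instruction]
        self.llm_calls += 1
        selector = self._resolve_with_llm(instruction)
        succeeded = perform(selector)
        if succeeded:
            # Cache only selectors that actually succeeded.
            self._selectors[instruction] = selector
        return succeeded


clicked = []


def fake_click(selector: str) -> bool:
    """Pretend browser action that always succeeds and records the selector."""
    clicked.append(selector)
    return True


cache = SelectorCache(lambda instruction: "#submit")
for _ in range(3):
    cache.act("click the submit button", fake_click)

print(cache.llm_calls)  # 1: only the first run needed the resolver
```

In this sketch the second and third runs cost no LLM calls, which is the sense in which repeated tasks approach zero marginal model cost.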
Skyvern
Best for: Enterprise workflow automation on websites without APIs (form filling, procurement, data entry)
20.8K stars. YC S23 + $2.7M raised. Vision-LLM handles never-before-seen websites. CAPTCHA, 2FA, proxy networks. 422-point HN peak. Strongest sustained HN presence (7 threads, two 300+ pt). rtrvr.ai: 64.4% success vs Browser Use Cloud 43.9% — wins on reliability.
⚡ AGPL-3.0 license is a dealbreaker for many enterprises (ironic given the enterprise positioning). PyPI downloads are only ~650/wk (most usage is on the hosted platform). WebVoyager score of 85.85% trails Browser Use (89.1%).
Lightpanda
Best for: High-performance headless browser engine for AI agent infrastructure
20.7K stars. 11x faster, 9x less memory vs Chrome headless. Three HN threads (combined 700+ pts, 467 comments). Zig-based, CDP-compatible. Now has MCP server (gomcp).
⚡ Infrastructure layer, not user-facing. Still beta (v0.2.x, ~5% site breakage). AGPL-3.0 license.
Best for: Cross-browser agent automation (Firefox + Safari) — only standards-first W3C option in Lane 2
443-point HN thread (Dec 2025, independently verified), from the creator of Selenium (2004) and Appium (2012). Built on WebDriver BiDi, a W3C multi-vendor standard, rather than Google-controlled CDP. Near-daily releases (v26.3.18 on 2026-03-18). Only Lane 2 tool with a credible Firefox/Safari cross-browser agent story.
⚡ Only 2.7K stars and 740/wk PyPI downloads. 4 contributors. Creator's own characterization: 'just v1, good for experimenting, not for production yet.' Not recommended for production workloads.
BrowserOS
Best for: Consumer agentic browser; an open-source alternative to ChatGPT Atlas, Perplexity Comet, Dia
9,982 stars. YC-backed. 314 HN pts (Jun 2025) + 88 pts (Jan 2026). 47 releases in ~10 months; the weekly cadence shows it is not vaporware. v0.43.0 (2026-03-12) adds Skills/Memory/SOUL.md. v0.42.0 added an MCP server with 31 tools. Supports local models (Ollama, LMStudio).
⚡ Lane 4 (consumer agentic browser) — different use case than automation SDK. ~10 contributors despite YC backing. AGPL-3.0 blocks enterprise. HN flagged prompt injection risk.
[Chart: Skills comparison. GitHub stars and evidence items for top ranked skills.]
[Chart: Star growth over time. GitHub stars trajectory for top skills in this category.]
Head to head
Browser Use gives the LLM full control (it re-reasons every step). Stagehand gives AI element selection but keeps the developer in charge of control flow. Use Browser Use for unpredictable workflows and Stagehand for deterministic, repeatable tasks. Browser Use has roughly 4x the traction (81K vs 21.5K stars) but is Python-only.
Agent Browser delivers 82-93% context reduction, verified by two independent sources. Playwright MCP has higher downloads (1.38M vs 284K) and years of stability. If token costs matter, choose Agent Browser first. If cross-browser stability and ecosystem matter, choose Playwright MCP. Agent Browser is 65 days old. At 353 stars/day it would pass Playwright MCP in stars within 3 weeks.
Browser Use: general-purpose autonomous browsing (81K stars, 941K downloads, MIT, 89.1% WebVoyager). Skyvern: enterprise workflow automation (21K stars, AGPL, YC-backed, 85.85% WebVoyager). Browser Use for developer/coding workflows; Skyvern for business process automation with CAPTCHA/2FA needs.
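The Agent Browser star projection in the comparison above can be checked with a short linear extrapolation. This sketch takes the star counts from the ranking (23.4K and 29.2K) and assumes constant daily growth and, by default, a static Playwright MCP count; both are simplifying assumptions, and the function name is illustrative.

```python
import math
from typing import Optional

AGENT_BROWSER_STARS = 23_400       # from the ranking above
PLAYWRIGHT_MCP_STARS = 29_200      # from the ranking above
AGENT_BROWSER_STARS_PER_DAY = 353  # growth rate quoted above


def days_to_overtake(
    current: int,
    target: int,
    gain_per_day: float,
    target_gain_per_day: float = 0.0,
) -> Optional[int]:
    """Whole days until `current` strictly exceeds `target` under linear growth.

    Returns None if the gap never closes at the given rates.
    """
    net_gain = gain_per_day - target_gain_per_day
    gap = target - current
    if gap < 0:
        return 0
    if net_gain <= 0:
        return None
    return math.floor(gap / net_gain) + 1


days_needed = days_to_overtake(
    AGENT_BROWSER_STARS, PLAYWRIGHT_MCP_STARS, AGENT_BROWSER_STARS_PER_DAY
)
print(days_needed)  # 17
```

Seventeen days fits the "within 3 weeks" claim only while Playwright MCP's own growth is ignored; passing a nonzero `target_gain_per_day` shows how quickly the estimate stretches.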
Public signals
30K+ stars (more than Playwright MCP at 29.2K). 598-point HN thread with 231 comments (2026-03-15) — highest HN score of any tool in the category. 1.2K/wk skills.sh installs. CyberAgent production case study across 236 Storybook stories. Standalone CLI mode in v0.20.0. v0.20.1 shipped 2026-03-17.
23.4K stars. 109.7K weekly npm downloads — highest production download velocity in Lane 2 by order of magnitude. 82-93% context reduction independently verified (paddo.dev 82%/5.6x, DEV.to 16x on 10-step flows). Snapshot + Refs architecture. Unambiguous #1 for Vercel AI SDK teams.
BU 2.0 jumped from 74.7% to 83.3% accuracy, matching Claude Opus 4.5 while being 40% faster. Beats Gemini 3 Pro (81.7%) and GPT-5.2 (70.9%). 1,019,707 weekly PyPI downloads — 1M+ threshold crossed. Uncontested Lane 3 Python leader.
Playwright MCP ships pre-configured in GitHub Copilot's Coding Agent — no setup required. The accessibility-snapshot approach gives Copilot browser eyes for testing and debugging. Institutional default by Microsoft.
Stagehand v3 rewrites the core to talk directly to browsers via Chrome DevTools Protocol. 20-40% faster across act, extract, and observe operations. Enhanced extraction targeting iframes and shadow roots.
Skyvern is now at v1.x with weekly releases. 422-point HN peak. 85.85% WebVoyager (Steel.dev). Enterprise-grade features (CAPTCHA, 2FA, proxy, geo-targeting) that no other OSS tool offers. The AGPL-3.0 license is a concern for commercial use.
Steel.dev published WebVoyager benchmark results across 586 tasks. Browser Use leads open-source at 89.1% (AIME-Browser-Use variant at 92.34%). Skyvern at 85.85%. Commercial leaders: Surfer 2 (97.1%), Magnitude (93.9%), Smooth (92%). First comprehensive public benchmark for the category.
BrowserOS v0.43.0 shipped 2026-03-12, adding Skills, Memory, and SOUL.md agent primitives. v0.42.0 added an MCP server with 31 tools. YC-backed, 9,982 stars, AGPL-3.0, 47 releases. The weekly cadence confirms it is not vaporware. Only credible open-source alternative to Atlas/Comet/Dia in the consumer agentic browser lane.
What changes this
If Browser Use publishes a named enterprise case study, it strengthens #1 — currently the biggest evidence gap despite SOC 2.
If Vercel Agent Browser gets a high-engagement HN thread (300+ pts), it could move to #3 or #2 — currently lacks organic community validation.
If Vercel Agent Browser adds cross-browser support, it's a major threat to Playwright MCP's #3 position.
If the Playwright team ships native token compression in MCP, it undermines Agent Browser's core thesis and Playwright MCP moves up.
If Browser MCP (6K stars, 616 HN pts) development resumes, re-evaluate immediately: real-browser-profile approach plus the highest single HN score in the category.
If Stagehand's download anomaly is explained (genuine production depth), it moves to #4, since its download numbers are category-leading.
If UI-TARS Desktop (28.9K stars, ByteDance) narrows to browser-only, it enters the top 5 immediately.
If Magnitude crosses 10K stars, it enters Tier 2: 93.9% WebVoyager is best-in-class accuracy.
If AGPL projects (Skyvern, Lightpanda) relicense to MIT/Apache, both move up, since the license is their biggest adoption barrier.
If CLI-over-MCP becomes consensus, the entire ranking restructures: CLI-first tools rise, MCP-first tools fall.