All Categories

15 categories, each with ranked contenders and public evidence. A category is the narrow thing the agent needs to do.


Coding CLIs / Code Agents

The hottest category right now. Ten or more serious CLI agents compete across three tiers. A standardized SWE-bench Pro score is necessary but no longer sufficient: METR found that roughly 50% of SWE-bench-passing PRs would not be merged by real maintainers. Rankings therefore weight benchmarks alongside practical tests, adoption, safety, and independent evaluations.
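One way to read "weight benchmarks alongside practical tests, adoption, safety, and independent evaluations" is as a weighted composite score. The sketch below is only an illustration: the weights, the 0-to-1 score scale, and the names tool_a and tool_b are hypothetical and are not taken from the rankings on this page.

```python
# Hypothetical weights for the five signals named above; they sum to 1.0.
# These numbers are illustrative assumptions, not the page's methodology.
WEIGHTS = {
    "benchmark": 0.30,
    "practical": 0.25,
    "adoption": 0.15,
    "safety": 0.15,
    "independent": 0.15,
}


def composite(scores: dict) -> float:
    """Weighted sum of per-signal scores, each expected in [0, 1]."""
    return sum(weight * scores[signal] for signal, weight in WEIGHTS.items())


def rank(candidates: dict) -> list:
    """Return candidate names ordered from highest to lowest composite score."""
    return sorted(candidates, key=lambda name: composite(candidates[name]), reverse=True)


# Made-up candidates: tool_a leads on the benchmark alone,
# tool_b is stronger on every other signal.
candidates = {
    "tool_a": {"benchmark": 0.9, "practical": 0.5, "adoption": 0.5,
               "safety": 0.5, "independent": 0.5},
    "tool_b": {"benchmark": 0.7, "practical": 0.8, "adoption": 0.8,
               "safety": 0.8, "independent": 0.8},
}

ranking = rank(candidates)
```

With these made-up inputs the benchmark leader ends up second, which is the point of not ranking on benchmark scores alone.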

Ranking

01 Claude Code: Architecture, planning, complex reasoning, security analysis, niche languages

02 Codex CLI: OpenAI ecosystem, locked-down environments, token efficiency, sandbox-first safety

03 Gemini CLI: Budget-constrained developers, large-context tasks, free entry point


Web Browsing / Browser Automation

The category has split into four lanes: (1) full-autonomy agents (Browser Use, Skyvern); (2) MCP/CLI tools for coding agents (Chrome DevTools MCP, Playwright MCP, Vercel Agent Browser); (3) frameworks/SDKs for building pipelines (Browser Use, Stagehand); and (4) consumer agentic browsers (BrowserOS). Chrome DevTools MCP is the current Lane 2 leader after a 598-point Hacker News thread (Mar 15, 2026). Browser Use draws 1M+ weekly PyPI downloads and is uncontested in Lane 3.

Ranking

01 Browser Use: Full autonomous web browsing where the LLM needs complete control over unpredictable workflows

02 Chrome DevTools MCP: Coding agents that need browser access; deepest recent Hacker News validation, official Google backing

03 Playwright MCP: Cross-browser UI testing and structured automation in TypeScript/Node agent stacks


Product / Business Development

Seven distinct lanes are now confirmed by independent traction data: Research/Extraction (Firecrawl, Exa), Enterprise Operating Surface (mcp-atlassian), Startup Operating Surface (Notion), Business Automation (Zapier, new), Product Analytics (PostHog, new), CRM (HubSpot #1, Salesforce #2), and Project/PM (Linear, upgraded). Slack's previously cited metrics are unverified and flagged for re-check.

Ranking

01 Firecrawl MCP Server: Structured web extraction, competitive research, lead enrichment, content ingestion into agent pipelines

02 MCP Atlassian: Enterprise product teams on Jira + Confluence, covering sprint planning, issue tracking, documentation

03 Notion MCP Server: Startup and cross-functional product teams using Notion for specs, roadmaps, wikis, databases


Teams of Agents / Multi-Agent Orchestration

Four distinct buyer segments with almost no crossover: (1) agent frameworks/SDKs, for building multi-agent systems in code (LangGraph, CrewAI, OpenAI Agents SDK, Mastra); (2) autonomous coding agents, for delegating software development to an agent (OpenHands, Factory AI); (3) parallel agent IDEs, for running multiple coding agents simultaneously and comparing results (Emdash, Superset); (4) workflow automation with agents, for orchestrating integrations visually (n8n). Ranking them all on a single list is misleading, because each serves a different buyer.

Ranking

01 OpenHands: End-to-end autonomous coding platform; self-hostable, model-agnostic, enterprise-validated

02 Emdash (YC W26): Multi-agent orchestration with Best-of-N comparison and issue-tracker integration (Linear, Jira, GitHub Issues)

03 Superset: Simple, privacy-respecting parallel agent execution; the ‘tmux for agents’ buyer on macOS


UX / UI

Four lanes: (1) trust leader (Official: zero CVEs, triple AI partnership); (2) enterprise write access (Console MCP, Uber uSpec); (3) community read-only default (Framelink; ⚠️ CVE patched, use ≥v0.6.3); (4) design-in-code (Onlook, with 24,918 stars, which bypasses Figma entirely for Next.js + Tailwind teams). A Cursor marketplace listing elevates Grab to #5.
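The ≥v0.6.3 floor for Framelink can be enforced with a simple version gate before an agent is allowed to use the server. This is a minimal sketch, assuming plain numeric versions such as 0.6.3 or v0.6.3 (no pre-release suffixes). The function names are hypothetical, and how the installed version string is obtained is left to your setup.

```python
def parse_version(text: str) -> tuple:
    """Turn 'v0.6.3' or '0.6.3' into a comparable tuple such as (0, 6, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Minimum version named in the text above as carrying the CVE patch.
MIN_PATCHED = parse_version("0.6.3")


def is_patched(installed: str) -> bool:
    """True if the installed version is at or above the patched release."""
    # Tuples compare element by element, so (0, 10, 0) > (0, 6, 3),
    # which a plain string comparison would get wrong.
    return parse_version(installed) >= MIN_PATCHED
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.10.0" sorts before "0.6.3", which would wrongly reject a newer release.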

Ranking

01 Figma MCP Server Guide: Enterprise teams on Figma Professional/Organization with Code Connect configured

02 Figma Console MCP: Enterprise design automation, spec generation at scale, free-tier teams needing write access

03 Framelink / Figma-Context-MCP: Individual developers and free-tier users wanting the simplest read-only Figma MCP on ≥v0.6.3


Software Factories

Autonomous coding agents that plan, write, test, and ship code with minimal human oversight. The category has split into five distinct lanes: platform-integrated (Copilot), event-driven always-on (Cursor Automations), open-source (OpenHands), enterprise-managed (Factory), and standalone SaaS (Devin/Windsurf). Production safety incidents (Kiro, Replit) are now a category-defining concern alongside benchmark scores.

Ranking

01 GitHub Copilot Coding Agent: Teams already on GitHub that want zero-friction async coding with enterprise compliance baked in

02 Cursor Automations: Teams already on Cursor who want event-driven automated coding triggered by external events (Slack, PagerDuty, Linear, webhooks, cron)

03 OpenHands: Regulated industries, privacy-sensitive orgs, teams wanting to self-host with their own models


Search & News

Web search, scraping, and deep research tools for AI agents. The category has split into three lanes: search APIs (Brave, Exa, Tavily), scrape/crawl tools (Firecrawl, Crawl4AI), and deep research APIs (Parallel, Perplexity Sonar). Most serious agent workflows need tools from the first two lanes. MCP support is table stakes — the real differentiators are benchmark quality, latency, index independence, and license.

Ranking

01 Brave Search API: Default search API for AI agents; fastest, broadest MCP tooling, independent benchmark winner

02 Firecrawl: Web scraping, structured extraction, turning messy pages into LLM-ready content; the research/extraction workhorse

03 Exa: Semantic search, similarity search, market mapping; strongest where traditional keyword search fails


Marketing

Skills for SEO, content optimization, ad copy, social media calendars, competitor analysis, and growth automation.


Business

Skills for pitch decks, financial modeling, contract review, OKR frameworks, invoicing, and business operations.


Content & Writing

Skills for blog posts, newsletters, technical writing, style guide enforcement, and editorial workflows.


Research

Skills for literature review, market research, patent analysis, academic workflows, and structured research pipelines.


Automation

Skills for bot building, MCP bridges, workflow automation, and connecting Claude to external services.


Security

Skills for SAST scanning, secret detection, dependency auditing, accessibility checks, and security guardrails.


Documentation

Skills for API docs generation, README crafting, changelog writing, PDF reports, and documentation automation.


Data & Analytics

Skills for data cleaning, ML training loops, chart building, CSV pipelines, and analytics workflows.
