skillpack.co

All Problems

23 problem spaces, each with ranked solutions and public evidence. A problem is the narrow thing the agent needs to solve.

Coding CLIs / Code Agents

The hottest category right now: more than ten serious CLI agents compete across three tiers. SWE-bench Pro (standardized) is necessary but no longer sufficient: METR found that roughly 50% of SWE-bench-passing PRs would not be merged by real maintainers. Rankings weight benchmarks alongside practical tests, adoption, safety, and independent evaluations.
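One way to picture a ranking that weighs benchmarks alongside the other signals is a weighted composite score. The sketch below is only an illustration: the page names the five signals but not how they are weighted, so the weights, the `Candidate` type, the function names, and the tool scores are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: the five signal names come from the text above,
# but the numeric weighting is an assumption made for this sketch.
WEIGHTS = {
    "benchmark": 0.30,
    "practical_tests": 0.25,
    "adoption": 0.20,
    "safety": 0.15,
    "independent_evals": 0.10,
}


@dataclass
class Candidate:
    name: str
    scores: dict  # signal name -> normalized score in [0, 1]


def composite_score(candidate, weights=WEIGHTS):
    """Weighted sum of a candidate's signals; a missing signal counts as 0."""
    return sum(w * candidate.scores.get(signal, 0.0) for signal, w in weights.items())


def rank(candidates, weights=WEIGHTS):
    """Return candidates sorted from highest to lowest composite score."""
    return sorted(candidates, key=lambda c: composite_score(c, weights), reverse=True)


# Made-up tools and scores, for illustration only.
tools = [
    Candidate("tool_a", {"benchmark": 0.9, "practical_tests": 0.5, "adoption": 0.4,
                         "safety": 0.6, "independent_evals": 0.5}),
    Candidate("tool_b", {"benchmark": 0.7, "practical_tests": 0.9, "adoption": 0.8,
                         "safety": 0.8, "independent_evals": 0.7}),
]

for position, tool in enumerate(rank(tools), start=1):
    print(f"{position:02d} {tool.name} {composite_score(tool):.3f}")
```

With these made-up numbers, `tool_b` ranks above `tool_a` even though `tool_a` has the higher benchmark score, which is the effect of not letting a single benchmark decide the order.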

Ranking

01 Claude Code: Complex multi-file refactors, framework migrations, architecture; any task where first-pass quality matters most
02 Gemini CLI: Budget-conscious developers, students, exploratory/prototyping work, massive context window tasks
03 GitHub Copilot CLI: Teams already on GitHub Copilot, enterprise environments requiring governance and audit trails

Open full report →

Web Browsing / Browser Automation

The category has split into four lanes: (1) full-autonomy agents (Browser Use, Skyvern), (2) MCP/CLI tools for coding agents (Chrome DevTools MCP, Playwright MCP/CLI, Vercel Agent Browser), (3) frameworks/SDKs for building products (Stagehand), and (4) consumer agentic browsers (BrowserOS). CLI-over-MCP is the settled consensus (13+ independent sources). Browser Use hits 1M+ weekly PyPI downloads and is unchallenged in Lane 1. BrowserOS just crossed 10K stars as the Lane 4 leader.

Ranking

01 Browser Use: Full autonomous web browsing where the LLM needs complete control over unpredictable workflows
02 Chrome DevTools MCP: Lane 2 leader for Chrome debugging workflows, with the fastest star growth in the category
03 Playwright MCP / CLI: Cross-browser automation with a token-efficient CLI companion; owns both sides of the CLI-over-MCP divide

Open full report →

Product / Business Development

Eight distinct lanes confirmed by independent traction data: Research/Extraction (Firecrawl #1, Exa #2), Enterprise Operating Surface (mcp-atlassian #1, Rovo #2), Startup Operating Surface (Notion), CRM (HubSpot #1, Salesforce #2, Dynamics 365 #3, new), Business Automation (Zapier #1, n8n on watch), Product Analytics (PostHog #1, Amplitude #2, new; Mixpanel on watch), Project/PM (Linear #1, Monday and Asana on watch), and Communication (Slack, real but early).

Ranking

01 Firecrawl MCP Server: Structured web extraction, competitive research, lead enrichment, content ingestion into agent pipelines
02 MCP Atlassian: Enterprise product teams on Jira + Confluence (sprint planning, issue tracking, documentation)
03 Notion MCP Server: Startup and cross-functional product teams using Notion (specs, roadmaps, wikis, databases)

Open full report →

Teams of Agents / Multi-Agent Orchestration

Five distinct segments with almost no cross-over: (1) Python agent frameworks — build multi-agent systems in code (LangGraph #1, OpenAI Agents SDK #2, Pydantic AI #3, CrewAI #4, plus cloud-native: Strands/AWS, ADK/GCP, Semantic Kernel/Azure); (2) TypeScript framework — Mastra (no competitor); (3) Autonomous coding agents — delegate software development to an agent (OpenHands, Factory AI); (4) Parallel agent IDEs — run multiple coding agents simultaneously (Emdash, ccpm, Superset); (5) Workflow automation — orchestrate integrations visually (n8n, Sim Studio). Ranking all on a single list is misleading — each serves a different buyer.

Ranking

01 OpenHands: End-to-end autonomous coding platform (self-hostable, model-agnostic, enterprise-validated)
02 Emdash (YC W26): Multi-agent orchestration with Best-of-N comparison and issue-tracker integration (Linear, Jira, GitHub Issues)
03 ccpm: Shell-based parallel agent execution using GitHub Issues + git worktrees; pragmatic, no unnecessary complexity

Open full report →

UX / UI

Five lanes: (A) read-only Figma context (Official #1, Framelink #2), (B) bidirectional write-access (Console MCP #1, Grab #2, figma-use #3), (C) alternative platforms (Penpot, Excalidraw), (D) specialized design-to-code agents (Kombai, 75–80% fidelity), (E) AI-native design creation (Google Stitch, cited alongside an 8.8% drop in Figma stock; Onlook). Uber uSpec remains the strongest enterprise validation. Google Stitch is provisional (2 days old).

Ranking

01 Figma MCP Server Guide: Teams with Figma Professional/Enterprise and Code Connect configured; the 'batteries included' choice
02 Framelink / Figma-Context-MCP: Individual developers, free-tier Figma users, and teams with custom codebases where descriptive metadata is more useful than prescriptive code
03 Figma Console MCP: Design system teams managing variables, tokens, specs, and multi-platform documentation at scale

Open full report →

Software Factories

Autonomous coding agents that plan, write, test, and ship code with minimal human oversight. Claude Code leads on benchmarks, community signal, and platform distribution (Apple Xcode). Cursor leads on revenue and event-driven automation. Gemini CLI is the free-tier disruptor. The category has split into CLI-first (Claude Code, Codex CLI, Gemini CLI), IDE-integrated (Cursor, Copilot, Cline), open-source (OpenHands), and enterprise-managed (Augment, Factory). SWE-bench Verified is dead — Pro is the new standard.

Ranking

01 Claude Code (Anthropic): Developers who want the most capable coding agent for complex multi-file refactors, greenfield projects, and long-running autonomous execution
02 Cursor (Anysphere): Teams that want an all-in-one IDE with autonomous background agents AND event-driven automation (Slack → agent writes PR → human reviews)
03 Gemini CLI (Google): Budget-conscious developers, Google Cloud/Android shops, and anyone wanting a capable free coding CLI

Open full report →

Search & News

Web search, scraping, and deep research tools for AI agents. The category has split into three lanes: search APIs (Brave, Exa, Tavily), scrape/crawl tools (Firecrawl, Crawl4AI), and deep research APIs (Parallel, Perplexity Sonar). Most serious agent workflows need tools from the first two lanes. MCP support is table stakes; the real differentiators are benchmark quality, latency, index independence, and license. The deep research lane is still immature: the WideSearch academic benchmark shows near-0% success on broad tasks.

Ranking

01 Brave Search API: Default search API for AI agents; fastest, broadest MCP tooling, independent benchmark winner
02 Firecrawl: Web scraping, structured extraction, turning messy pages into LLM-ready content; the research/extraction workhorse
03 Exa: Semantic search, similarity search, market mapping; strongest where traditional keyword search fails

Open full report →

Marketing

Skills for SEO, content optimization, ad copy, social media calendars, competitor analysis, and growth automation.

Ranking

01 Jasper AI: Brand-governed marketing teams of 3-10 people
02 HubSpot Breeze AI: Teams already on HubSpot Professional/Enterprise
03 Copy.ai: Marketing workflow automation without enterprise pricing

Open full report →

Business

Skills for pitch decks, financial modeling, contract review, OKR frameworks, invoicing, and business operations.

Ranking

01 Gamma: AI pitch decks (default choice)
02 Spellbook: AI contract review (law firms)
03 LinkSquares: Enterprise legal ops / CLM

Open full report →

Content & Writing

Skills for blog posts, newsletters, technical writing, style guide enforcement, and editorial workflows.

Ranking

01 Vale: Docs-as-code style enforcement in CI/CD pipelines
02 Harper: Privacy-first, on-device grammar checking for developers
03 Copy.ai: Marketing content workflow automation at accessible pricing

Open full report →

Research

Deep research agents, academic tools, and research infrastructure. The category has split: platform deep research (Perplexity, OpenAI, Google), open-source agents (GPT Researcher, Tongyi, STORM), academic specialists (Elicit, Consensus), and infrastructure (Tavily, Firecrawl). Speed, citation quality, and self-hosting are the real differentiators.

Ranking

01 Perplexity Deep Research: Speed-sensitive research, daily use, citation-critical work
02 OpenAI Deep Research: Expert-level reasoning, complex multi-step research, enterprise MCP workflows
03 Google NotebookLM + Deep Research: Multimodal research, long-document analysis, research-to-presentation pipelines

Open full report →

Automation

Three sub-categories with distinct buyers: Workflow Automation (n8n #1, Activepieces, Zapier), Code-First Orchestration (Windmill, Trigger.dev, Inngest, Kestra), Agent Integration (Composio, Pipedream MCP). n8n dominates overall with 180K stars, n8n-mcp (15.4K stars), and dedicated Claude Code skills. Inngest leads npm downloads (499K/wk). Composio is provisional — zero HN traction despite 27K stars.

Ranking

01 n8n: Technical teams building complex, multi-step automations with AI agents; the default choice for Claude Code users needing workflow automation
02 Composio (provisional): AI agent developers needing managed auth across dozens of services; use inside n8n or alongside other workflow tools
03 Kestra: Data engineering and DevOps teams migrating from Airflow; YAML-declarative workflows with Git version control

Open full report →

Security

Skills for SAST scanning, secret detection, agent/MCP security scanning, and offensive security. The category splits into four sub-themes: SAST/code scanning (Semgrep MCP #1), secret detection (GitGuardian MCP #1), agent/MCP security scanning (Snyk Agent Scan #1), and offensive security (HexStrike AI #1). Agent security scanning is the fastest-growing sub-theme — these tools scan your agents, skills, and MCP servers, not your application code.

Ranking

01 Semgrep MCP: OSS SAST scanning with official MCP integration; the default recommendation for code security in agent workflows
02 Snyk Agent Scan: Enterprise agent/MCP security scanning; scans your agents, skills, and MCP servers for prompt injection, tool poisoning, toxic flows
03 GitGuardian MCP (ggmcp): Purpose-built secret scanning for agent workflows, with 500+ detectors and hard merge gates

Open full report →

Documentation

Docs-as-code frameworks, API documentation generators, documentation SaaS platforms, and documentation automation tools. The category splits into four overlapping lanes: OSS docs frameworks (Fumadocs #1, Starlight #2, Docusaurus #3), API docs (Fern #4, Redocly #6, Swagger UI #8), SaaS platforms (Mintlify #5, GitBook #10), and automation (Promptless #9). Fumadocs is the momentum winner: 3x year-over-year growth and a backlog of only 5 open issues make it the clear pick for Next.js teams.

Ranking

01 Fumadocs: Next.js teams wanting the fastest-growing, best-maintained docs framework with built-in OpenAPI rendering
02 Astro Starlight: Content-first static docs with zero client-side JS; the DX leader for non-Next.js stacks
03 Docusaurus: Large OSS projects needing versioned docs and a massive plugin ecosystem; the safe default

Open full report →

Data & Analytics

AI-powered data analysis tools, reactive notebooks, BI-as-code platforms, conversational data agents, and ML training aids. The category has split into reactive notebooks (Marimo), AI visualization (Data Formulator), BI-as-code (Evidence, Observable), app deployment (Streamlit), conversational data (PandasAI), and prompt-to-ML (Plexe). Marimo is the clear #1 — strongest combined signal across stars, downloads, HN attention, and independent validation.

Ranking

01 Marimo: Exploratory analysis with reproducibility guarantees; the Jupyter replacement for agent-assisted data workflows
02 Data Formulator: AI-powered iterative data visualization; describe what you want, the agent builds and refines the chart
03 Evidence: SQL-first analysts who want version-controlled, code-authored BI reports; no JS/Python required

Open full report →

Personal Assistants

How do I interact with AI for everyday tasks? ChatGPT, Claude, Gemini, and emerging contenders like OpenClaw.

Ranking

Open full report →

Memory Systems

How do agents remember context across sessions? Vector DBs, context management, and persistent memory for AI workflows.

Ranking

Open full report →

Performance

How do I profile, benchmark, and optimize AI/agent workloads? Speed and efficiency tools for production deployments.

Ranking

Open full report →

Analytics & LLM Tracing

How do I observe and trace LLM calls and agent runs? PostHog, Braintrust, LangSmith, Helicone, and more.

Ranking

Open full report →

Web Development & UI Frameworks

How do I build AI-powered UIs? Frontend frameworks and tools for shipping AI products — Vercel AI SDK, Streamlit, v0, Bolt, Lovable.

Ranking

Open full report →

Agent Harnesses

How do I orchestrate and run agents? Frameworks for building, deploying, and managing AI agent workflows — LangChain, CrewAI, Pydantic AI, Claude Agent SDK.

Ranking

Open full report →

Knowledge Management

How do I organize and retrieve team knowledge? Notion, Google Workspace, Obsidian — the foundations MCP tools build on.

Ranking

Open full report →

AI Adoption & Best Practices

How do I adopt AI effectively? Meta-tracking, best practices, ecosystem navigation. SkillPack itself lives in this problem space.

Ranking

Open full report →