
Search & News

Web search, scraping, and deep research tools for AI agents. The category has split into three lanes: search APIs (Brave, Exa, Tavily), scrape/crawl tools (Firecrawl, Crawl4AI), and deep research APIs (Parallel, Perplexity Sonar). Most serious agent workflows need tools from the first two lanes. MCP support is table stakes — the real differentiators are benchmark quality, latency, index independence, and license.

18 ranked · 12 signals

Verdict

Brave Search is the strongest default — #1 on the only independent benchmark (AIMultiple 2026), lowest latency in that benchmark (669ms), most feature-complete MCP server (6 tools), SOC 2 Type II attested, 35K+ API customers, and 22 platform integrations.

Firecrawl is the scraping and extraction workhorse — 94,850 stars, #2 on benchmarks (highest relevance score), 1.17M weekly downloads, search + scrape + autonomous /agent endpoint in one tool. MCP server has 5,802 stars — highest in category.

Exa is the pick when semantic depth matters — neural embeddings, people/company/code verticals, strongest HN traction in the category (412 pts), ~938K weekly downloads. Exa Instant claims sub-200ms latency (self-reported, Feb 2026), which would make it competitive with Brave if verified.

The deeper read

The category has split into three functional lanes: search APIs (find content), scrape/crawl tools (extract content from URLs), and deep research APIs (multi-step search + synthesis). Most agent workflows need lanes 1 and 2.

MCP support is table stakes — every serious contender has it. The real differentiators are: benchmark quality, latency, index independence, license, and cost.

The AIMultiple 2026 Agentic Search Benchmark (100 queries, GPT-5.2 judge) is the only independent multi-tool comparison. Brave #1, Firecrawl #2, Exa #3, Parallel #4, Tavily #5. Top-4 statistically indistinguishable.

Current ranking

01 · Brave Search API (Official) · 797 stars · 64

Best for: Default search API for AI agents — fastest, broadest MCP tooling, independent benchmark winner

#1 Agent Score (14.89) in AIMultiple 2026. Lowest latency in the benchmark (669ms). Independent index (40B pages, not a Google wrapper). 6-tool MCP server (web, local, image, video, news, summarizer). SOC 2 Type II attested (Oct 2025). 35K+ API customers, 22 integrations (Snowflake, AWS, Cursor, Windsurf). Free tier: $5 monthly credit (~1,000 queries).

Less semantic depth than Exa on conceptual/research queries. MCP server stars (784) lag Exa (4,031) and Firecrawl (5,802). Free tier recently reduced from ~5,000 to ~1,000 queries. Search API is secondary to Brave's browser business.
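
A minimal sketch of a raw Brave Search call. The endpoint, X-Subscription-Token header, and web.results response shape follow Brave's public API docs; the query parameters shown are illustrative, so verify against the current docs.

import requests

# Web search via Brave's REST API; "count" caps the number of results.
resp = requests.get(
    "https://api.search.brave.com/res/v1/web/search",
    headers={"X-Subscription-Token": "YOUR_API_KEY"},
    params={"q": "agentic web search benchmarks", "count": 5},
)
resp.raise_for_status()
for item in resp.json().get("web", {}).get("results", []):
    print(item["title"], item["url"])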

02 · Firecrawl (Official) · 5.8K+ stars · 91% · 70

Best for: Web scraping, structured extraction, turning messy pages into LLM-ready content — the research/extraction workhorse

94,850 GitHub stars — highest traction in category. Agent Score 14.58 (#2, highest relevance score in benchmark). 1.17M combined weekly downloads. MCP server: 5,802 stars (highest in category). Search + scrape + autonomous /agent endpoint — only tool covering all three lanes. FIRE-1 agent, Spark model family, parallel agents. Java SDK (Mar 2026), Rust PDF parser (Feb 2026), Browser Sandbox (Feb 2026). Multi-language SDKs (Python, JS, Go, Rust, Java). 95.3% success rate (Bright Data benchmark). Browser MCP: fastest extraction (7s, 83% success).

AGPL-3.0 license complicates enterprise embedding. $16/mo minimum for hosted. Latency (1,335ms) is 2x Brave's. HN traction surprisingly low for 94K stars — awareness spreads through Twitter/product channels, not HN.
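
A hedged sketch of the hosted scrape endpoint: one URL in, LLM-ready markdown out. The /v1/scrape path, Bearer auth, and response shape follow Firecrawl's public docs, but check the current API version before relying on it.

import requests

# Scrape a single page into markdown via Firecrawl's hosted API.
resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"url": "https://example.com/pricing", "formats": ["markdown"]},
)
resp.raise_for_status()
print(resp.json()["data"]["markdown"][:500])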

03 · Exa (Official) · 4.0K+ stars · 62

Best for: Semantic search, similarity search, market mapping — strongest where traditional keyword search fails

Agent Score 14.39 (#3), statistically tied with top tier. Best HN traction in category (412 pts). Highest MCP adoption among pure search APIs: 4,031 stars. ~938K weekly downloads (668K PyPI + 270K npm). SOC 2 Type II. Enterprise customers: Notion, Cursor, AWS, Databricks. Neural index: people (1B+ LinkedIn profiles), company, code verticals. Exa Instant sub-200ms (announced Feb 13, 2026); Exa Fast <350ms P50. $85M Series B at $700M, Nvidia-backed.

Exa Instant sub-200ms latency self-reported, not independently verified. $7/1K requests is pricier than Brave ($5/1K). Smaller MCP breadth (1 tool vs Brave's 6). Closed-source, cloud-only. Browser MCP benchmark: 23% extraction success (vs Bright Data 100%, Firecrawl 83%).
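
For the semantic lane, a hedged sketch using Exa's Python SDK; the exa_py package, method, and parameter names are taken from Exa's published SDK docs and worth confirming locally.

from exa_py import Exa  # pip install exa_py

# Neural (embedding-based) search plus page text in one call.
exa = Exa(api_key="YOUR_API_KEY")
results = exa.search_and_contents(
    "startups building embedding-based web search",
    type="neural",   # semantic matching rather than keywords
    num_results=5,
    text=True,       # also return page contents
)
for r in results.results:
    print(r.title, r.url)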

04 · SearXNG · 27K+ stars · 78

Best for: Privacy-first, self-hosted meta-search — no API keys, no vendor lock-in, no cost

26,644 stars, active development (last commit 2026-03-15). Zero cost, zero API keys — aggregates 70+ search engines. Privacy guarantee: no query ever leaves your infrastructure. Rolling Docker releases. HN: 302 pts + 134 pts. AGPL-3.0.

No independent benchmark data — not tested in AIMultiple. Meta-search quality depends on upstream engines. Self-hosting overhead (Docker). No structured extraction — pure search only. No dedicated company or funding.
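
Querying a self-hosted instance is a plain HTTP GET. This sketch assumes the official Docker image running on localhost:8080 with the JSON output format enabled in settings.yml (it is off by default).

import requests

# Meta-search across the instance's configured engines; no API key needed,
# and the query never leaves your infrastructure.
resp = requests.get(
    "http://localhost:8080/search",
    params={"q": "site reliability engineering", "format": "json"},
)
resp.raise_for_status()
for hit in resp.json().get("results", [])[:5]:
    print(hit["title"], hit["url"])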

05 · Tavily (Official) · N/A stars · 48

Best for: LangChain-native workflows where Tavily is the path of least resistance — fastest response time and highest uptime

Highest raw download volume (1.28M weekly). Default search tool in LangChain tutorials (142K weekly langchain-tavily). HumAI: fastest response (187ms), highest uptime (99.94%), 0.06% error rate. Acquired by Nebius for $275-400M. Fortune 500 customers.

Agent Score 13.67 (#5 of 8) — meaningful gap below top-4 cluster. Nebius acquisition (5 weeks old) introduces strategic risk — pricing and data policies could change. Near-zero HN traction (<10 pts). Quality and citations lag Exa (82% vs 85% accuracy, 85% vs 96% citations per HumAI). Watch Q2 for changes.
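
The path-of-least-resistance integration the entry describes, sketched with the langchain-tavily package; the TavilySearch class name and max_results argument follow that package's docs, and the client reads TAVILY_API_KEY from the environment.

from langchain_tavily import TavilySearch  # pip install langchain-tavily

# Tavily as a LangChain tool: invoke() returns a dict of results
# with titles, URLs, and content snippets.
tool = TavilySearch(max_results=5)
print(tool.invoke({"query": "MCP server releases this month"}))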

06 · Jina Reader (Official) · 10K+ stars · 43

Best for: Simplest URL-to-markdown conversion (one-line API) with ReaderLM-v2 for local extraction

10,248 stars. ReaderLM-v2 (1.5B SLM, 512K context, 29 languages) presented at ICLR 2025. Hosted API remains active. The r.jina.ai URL-prefix pattern is the simplest possible interface for single-page reads.

OSS repo stale — no commits for 10+ months (last commit May 2025). Firecrawl is a strict superset and 4-5x cheaper at volume. On downward trajectory. If no repo activity by mid-2026, recommend delisting.
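
The URL-prefix pattern in full, which is the whole API for single-page reads; a hosted API key in an Authorization header raises rate limits, while anonymous use works at low volume.

import requests

# Prepend r.jina.ai to any URL and get LLM-ready markdown back.
resp = requests.get("https://r.jina.ai/https://example.com")
resp.raise_for_status()
print(resp.text[:500])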

07 · Crawl4AI · 62K+ stars · 83

Best for: Free, open-source self-hosted crawling — Apache-2.0, no vendor dependency, full developer control

62,188 GitHub stars (#2 in category). Apache-2.0 license — best in category for enterprise embedding. Completely free. 372K weekly PyPI downloads. Actively maintained — v0.8.5 released 2026-03-18. Adaptive Intelligence (pattern-learning crawler). Local LLM support (Llama 3, Mistral).

Pre-1.0 maturity (v0.8.5). Zero HN stories above 10 pts despite 62K stars (anomalous — possible star inflation). Three independent comparisons show lower success rate (89.7% vs Firecrawl 95.3%) and higher noise (11.3% vs 6.8%). Python-only — no multi-language SDK.
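
A hedged sketch of the core loop, per the project README; Crawl4AI is pre-1.0, so the names assumed here (AsyncWebCrawler, arun, result.markdown) may shift between releases.

import asyncio
from crawl4ai import AsyncWebCrawler  # pip install crawl4ai

# A local headless browser fetches the page and returns markdown;
# runs entirely on your own infrastructure.
async def main():
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url="https://example.com")
        print(str(result.markdown)[:500])

asyncio.run(main())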

Below the cut line
08 · Parallel AI

Best for: Deep research where quality matters more than speed

Agent Score 14.21 (#4) — top tier on quality. Self-reported BrowseComp: 48%/58% accuracy vs GPT-4 browsing 1%. $740M valuation (Kleiner Perkins, Index Ventures). Founded by Parag Agrawal (ex-Twitter CEO).

Latency is 13,600ms — 20x slower than Brave; viable only for async workflows. Pricing not public (~$300 CPM for Ultra). BrowseComp claims self-reported. Closed source, no MCP server, no public GitHub repo.

09 · You.com

Best for: OpenAI-native search — the provider OpenAI already uses

OpenAI integrated You.com as core search provider — strongest distribution signal in category. Self-reported: 93% SimpleQA, #1 DeepSearchQA. MCP server launched. 1,000 free API queries/month.

No independent benchmark data. Not in AIMultiple. No significant HN traction, no public GitHub presence. If independently verified, jumps to ranked list immediately.

10 · Perplexity Sonar

Best for: Highest raw answer accuracy (87% in HumAI) with citation synthesis

87% accuracy (highest in HumAI). 94% citation quality. Sonar Deep Research for multi-step retrieval. Official MCP server. Citation tokens no longer billed (Feb 2026).

Agent Score 12.96 (#7 of 8) — below SerpAPI. 11,000ms+ latency. BrowseComp 8% (vs Parallel 48%, Gemini 59%). Community reports ~50% Deep Research response truncation. Consumer brand doesn't translate to API performance.
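
Perplexity exposes an OpenAI-compatible chat endpoint, so the stock openai client works with a swapped base_url; the "sonar" model name is taken from Perplexity's published model list and worth verifying before use.

from openai import OpenAI  # pip install openai

# Sonar call: answers come back with citations attached.
client = OpenAI(api_key="YOUR_PPLX_KEY", base_url="https://api.perplexity.ai")
resp = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Summarize this week's MCP news."}],
)
print(resp.choices[0].message.content)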

11 · Linkup (Official) · 28

Best for: AI-native web search API with sub-second speed and strong angel backing

$10M seed (Feb 2026, Gradient). Angels: Olivier Pomel (Datadog CEO), Arthur Mensch (Mistral CEO). Customers include KPMG, Artisan. /fast endpoint for sub-second search. MCP integrated with Claude Desktop.

Seed-stage — too early for ranking. Not in AIMultiple benchmark. 43 GitHub stars on SDK. Bold claims but zero independent verification.

12 · Bright Data MCP (Official) · 2.2K+ stars · 51

Best for: Enterprise scraping behind aggressive anti-bot defenses — perfect accuracy where others fail

#1 Browser MCP Benchmark (AIMultiple 2026): 100% extraction success, 90% automation, 77% scalability. 2,214 stars. 60+ MCP tools — broadest MCP tooling. Industrial-grade anti-bot (CAPTCHA solving, proxy rotation, geo-unblocking). Free MCP tier. The only option for sites that actively block bots.

Not a search API — web access infrastructure. 30s extraction speed (Firecrawl is 7s). Enterprise pricing beyond free tier. Ethical concerns around anti-bot circumvention.

13 · Hyperbrowser MCP (Official) · 749 stars · 30

Best for: AI-native browser automation with stealth capabilities — Claude Computer Use / OpenAI Computer Use

90% browser automation (tied #1 with Bright Data, AIMultiple 2026). Stealth-first: CAPTCHA solving, IP rotation, fingerprint management. Supports Claude + OpenAI Computer Use agents. 63 HN pts on launch. 10,000 concurrent browsers.

GitHub repo stale since November 2025 (4+ months). 63% web extraction — lowest in benchmark. 118s speed — slowest. Only 5 contributors. No releases found. May be active on private repo.

14 · ScrapeGraphAI (Official) · 23K+ stars · 68

Best for: LLM-graph-based extraction — describe what you want, AI builds the extraction pipeline

23,033 GitHub stars. 194 HN pts (strongest in category). Active development v1.74.0 (Mar 15, 2026). arXiv paper Feb 2026. Open-source + hosted API dual model. 'You Only Scrape Once' graph reuse. Apache-2.0 OSS.

Only 14,611 weekly PyPI downloads — roughly 1,580 stars per 1,000 weekly downloads (vs Firecrawl's ~81), a ratio that suggests star inflation. ~$85/mo hosted for 10K structured extractions. No AIMultiple benchmark placement. Lower real adoption than the star count suggests.
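
The describe-what-you-want pattern, sketched per the project README; the config keys and the model string here are illustrative and may differ across versions.

from scrapegraphai.graphs import SmartScraperGraph  # pip install scrapegraphai

# Natural-language prompt in, structured extraction pipeline out.
graph = SmartScraperGraph(
    prompt="List every plan name and monthly price on this page.",
    source="https://example.com/pricing",
    config={"llm": {"api_key": "YOUR_OPENAI_KEY", "model": "openai/gpt-4o-mini"}},
)
print(graph.run())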

15

Best for: High-stakes knowledge work (finance, economics, medical) — if claims hold

Claims 94% SimpleQA and 79% FreshQA (vs Google 39%). 50+ proprietary data sources (SEC, clinical trials). a16z backed. LangChain integration. DeepSearch v2.0 with tool calling.

Almost all evidence is self-reported. Only one independent reviewer found. No HN traction. No AIMultiple entry. ~7 employees. Needs independent verification.

16 · Serper (Official) · 28

Best for: Cheapest Google SERP access ($0.30/1K queries)

3-10x cheaper than alternatives. LangChain integration. 2,500 free searches on signup.

Pure Google SERP wrapper — no semantic understanding, no independent index. Budget pick only.
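
Usage is a single POST; the endpoint and X-API-KEY header follow Serper's public docs.

import requests

# Google SERP results as JSON; "organic" holds the ranked links.
resp = requests.post(
    "https://google.serper.dev/search",
    headers={"X-API-KEY": "YOUR_API_KEY", "Content-Type": "application/json"},
    json={"q": "cheapest Google SERP API"},
)
resp.raise_for_status()
for item in resp.json().get("organic", [])[:5]:
    print(item["title"], item["link"])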

17 · Spider Cloud · 2.3K+ stars · 38

Best for: High-volume scraping performance (Rust-based)

Claims 100K pages/sec, 7x Firecrawl throughput. MIT license. 2,332 stars. Rust-based zero-copy parsing. Cost advantage at scale (~$48/100K pages vs Firecrawl ~$240).

Tiny community (2.3K stars). Benchmark claims entirely self-reported.

18 · Gemini (Google)

Best for: Gemini-native workflows only

Native to Gemini API. 5,000 free prompts/month. Gemini Deep Research preview: 59.2% BrowseComp, 46.4% HLE (highest reported).

Platform lock-in (Gemini API only). Most expensive ($14/1K queries). Not a standalone search API. No MCP server. Deep Research still in preview.
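
For completeness, the Gemini-native grounding pattern, sketched with Google's google-genai Python SDK; the model name is illustrative, and the GoogleSearch tool wiring follows Google's published SDK docs.

from google import genai
from google.genai import types  # pip install google-genai

# Search grounding: the model can issue Google searches during generation.
client = genai.Client(api_key="YOUR_GEMINI_KEY")
resp = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="What changed in the MCP spec this quarter?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)
print(resp.text)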


Skills comparison

GitHub stars and evidence count for top ranked skills (chart omitted; 8 more skills not shown).

Star growth over time

GitHub stars trajectory for top skills in this category (chart omitted).

Head to head

Brave Search vs Exa

Brave wins on speed (669ms vs ~1,200ms), benchmark score (14.89 vs 14.39), MCP breadth (6 tools vs 1), and free tier (~1,000/mo). Exa wins on semantic depth (neural embeddings, people/company/code verticals), MCP adoption (4,031 vs 784 stars), HN traction (412 vs 95 pts), and weekly downloads (954K vs 24K npm). Use Brave as the default; switch to Exa when you need meaning-based search or vertical lookups.

Brave Search vs Tavily

AIMultiple benchmark: ~1pt gap (14.89 vs 13.67) described as 'meaningful, not random.' Brave has independent index; Tavily wraps external sources. Tavily acquired by Nebius — future direction uncertain. Tavily wins on response time (187ms vs 669ms) and uptime (99.94%). Brave is objectively stronger on evidence; Tavily persists on LangChain ecosystem inertia.

Firecrawl vs Crawl4AI

Firecrawl wins on features (search+scrape+agent), enterprise compliance, multi-language SDKs, benchmark score (14.58), and success rate (95.3% vs 89.7%). Crawl4AI wins on license (Apache-2.0 vs AGPL), cost (free), and local LLM support. Crawl4AI remains pre-1.0 (v0.8.5), a maturity yellow flag. Three independent reviews reach the same conclusion.

Exa vs Tavily

Exa wins on quality (14.39 vs 13.67 Agent Score, 81% vs 71% complex retrieval, 96% vs 85% citations) and proprietary neural index. Tavily wins on distribution (1.28M weekly downloads, LangChain default) and response time (187ms). Exa is the better tool; Tavily is the more convenient one. Tavily's Nebius acquisition adds risk.

Firecrawl vs Jina Reader

Firecrawl does everything Jina Reader does, plus search, structured extraction, batch processing, and agent endpoint. Jina Reader's OSS repo is stale (no commits for 10+ months). Firecrawl is the superset choice for new projects. Firecrawl 4-5x cheaper at volume.

SearXNG vs Brave Search

SearXNG: free, self-hosted, private, 70+ aggregated engines. Brave: higher quality (14.89 benchmark), faster (669ms), managed, SOC 2. Privacy vs quality tradeoff. SearXNG is the only option for teams that can't send queries to third-party APIs.

Parallel AI vs Perplexity Sonar

Both serve deep research. Parallel: 48% BrowseComp vs Perplexity 8%. Both slow (13,600ms vs 11,000ms+). Parallel delivers on depth; Perplexity has higher raw accuracy (87% HumAI) but truncation issues. Parallel wins on deep research quality if you can tolerate latency and cost.



What changes this

If Tavily's Nebius acquisition leads to pricing changes or API shifts in Q2 2026, it drops. If they invest heavily (independent index, benchmark improvements), it could reclaim top 3.

If Parallel AI's BrowseComp claims are independently validated, it enters the ranked list as #1 deep research tool (a new lane, not displacing Brave for standard search).

If Exa Instant's sub-200ms latency claim is independently verified, Exa takes the 'fastest' crown from Brave and the #1/#3 gap narrative changes. Could challenge for #1.

If Jina Reader resumes meaningful OSS development (MCP server, structured extraction), it differentiates from Firecrawl. Without activity, on track for delisting by mid-2026.

If Crawl4AI ships v1.0, reconsider its position. If development stalls, with no code commits by mid-2026, delist.

Gemini Deep Research API is now in preview (59.2% BrowseComp, 46.4% HLE). If it exits preview with MCP support or open API access, deep research lane shifts — Parallel and Perplexity Sonar lose their reason to exist as standalone products.

If You.com's benchmark claims (93% SimpleQA, #1 DeepSearchQA) are independently verified, it enters ranked list on OpenAI distribution moat alone. Could land #4-5.

If SearXNG gets independent benchmark inclusion, it either validates as competitive or gets exposed as lower quality. Currently ungradeable on search quality.

If Linkup or Airweave gains >10K stars and benchmark results, they enter the ranked list.