skillpack.co
All solutions

OpenAI Deep Research

active

Agentic research mode powered by o3/o4-mini. 26.6% HLE (highest of any system), 72.57% GAIA, MCP support (Feb 2026). Slower (3-15 min) but deepest reasoning.

Score 42
OpenAI Deep Research in action

Where it wins

26.6% on Humanity's Last Exam — highest of any system (Helicone)

72.57% GAIA benchmark — best reported result

MCP support added Feb 2026 — first major platform to do so

API available via /responses endpoint with deep research models

Domain-restricted searches, real-time progress tracking

Where to be skeptical

3-15 min per query — 10-60x slower than Perplexity

$200/mo for unlimited (Pro); Plus ($20/mo) limited to 10 queries/mo

Closed source, no self-hosting option

DeepResearch Bench: o3 standalone outperformed Deep Research mode in some evals (FutureSearch)

Editorial verdict

#2 in research. Reasoning king — 26.6% HLE and 72.57% GAIA are best reported results from any system. MCP support (Feb 2026) enables enterprise toolchain integration. Slower and pricier than Perplexity but unmatched for PhD-level questions.

Related

Public evidence

Raw GitHub source

GitHub README could not be fetched right now.