skillpack.co

Data & Analytics

AI-powered data analysis tools, reactive notebooks, BI-as-code platforms, conversational data agents, and ML training aids. The category has split into reactive notebooks (Marimo), AI visualization (Data Formulator), BI-as-code (Evidence, Observable), app deployment (Streamlit), conversational data (PandasAI), and prompt-to-ML (Plexe). Marimo is the clear #1 — strongest combined signal across stars, downloads, HN attention, and independent validation.

7 ranked, 4 signals

Current ranking

1. Marimo (93)

Best for: Exploratory analysis with reproducibility guarantees — the Jupyter replacement for agent-assisted data workflows

19.8K stars, 1.9M monthly PyPI downloads, 448-pt HN peak (10 stories over 2 years). Independent switching stories from Towards Data Science and Oracle engineering. PyCon US 2025 talk. Reactive DAG execution, pure .py files, dual-mode (notebook → app). 261 contributors, daily commits.

2. Data Formulator (82)

Best for: AI-powered iterative data visualization — describe what you want, the agent builds and refines the chart

15.1K stars, Microsoft Research backing, MIT license. 212 pts on HN with sustained follow-up (38 pts). Very active development (latest push was the day before this ranking). Fills a niche Marimo doesn't: conversational AI agents for visualization.

3. Evidence (78)

Best for: SQL-first analysts who want version-controlled, code-authored BI reports — no JS/Python required

6K stars, YC S21, 263-pt HN launch. SQL + Markdown → polished reports. Git-versioned. 76 contributors. The only tool in the ranking that doesn't require a general-purpose programming language.

4. Observable Framework (84)

Best for: Developer-built data dashboards with D3-quality visualization and static site deployment

3.4K stars, 16.7K monthly npm downloads, 360 pts on HN (higher than Evidence). D3.js lineage: the strongest viz pedigree in the ranking. Static site generator: build once, deploy anywhere. Full web dev power (HTML, CSS, JS, React).

5. Streamlit (90)

Best for: Deploying data apps and dashboards to stakeholders — the standard for sharing Python analysis as a web app

44K stars, 31.8M monthly downloads, 319 contributors. Snowflake acquisition at $800M. 470-pt HN peak. The ecosystem giant for data app deployment.

6. Plexe (75)

Best for: Teams that need custom ML models but don't have ML engineering resources — describe and get a trained model

2.6K stars, YC X25, two HN stories (130 pts, 85 pts). Uniquely differentiated: natural language → trained ML model. Self-correcting ML engineering agents. Apache-2.0.

7. PandasAI (72)

Best for: Quick, throwaway data exploration where you accept hallucination risk — not for anything where correctness matters

23.4K stars, 263K monthly downloads, 112 contributors. Simple natural-language-to-pandas API. High name recognition.

Head to head

Marimo vs PandasAI

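Marimo wins on every quality signal: active development (daily commits vs 5 months stale), monthly downloads (1.9M vs 263K), HN (448 pts across 10 stories vs 77 pts on 1 story), reproducibility (guaranteed vs hallucination risk). PandasAI has more stars (23.4K vs 19.8K), but its downloads and commit activity lag far behind, so the star count overstates current adoption.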
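One rough way to check the star-count claim is to compare monthly downloads per GitHub star, using only the figures quoted in this ranking. The ratio is an illustrative heuristic assumed here, not a metric the ranking itself reports, and the function name is hypothetical.

```python
# Figures copied from the ranking entries above (stars, monthly PyPI downloads).
tools = {
    "Marimo": {"stars": 19_800, "monthly_downloads": 1_900_000},
    "Streamlit": {"stars": 44_000, "monthly_downloads": 31_800_000},
    "PandasAI": {"stars": 23_400, "monthly_downloads": 263_000},
}


def downloads_per_star(stats: dict) -> float:
    """Hypothetical heuristic: monthly downloads divided by star count."""
    return stats["monthly_downloads"] / stats["stars"]


# Round to one decimal place for readability.
ratios = {name: round(downloads_per_star(s), 1) for name, s in tools.items()}

# Sort from most to least usage per star.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio} downloads per star per month")
```

On these numbers PandasAI gets roughly 11 downloads per star per month against roughly 96 for Marimo, which is consistent with stars running ahead of actual usage.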

Marimo vs Streamlit

Different niches. Marimo for analysis→app (reactive, notebook-first). Streamlit for app-only deployment (full-script rerun). Marimo is the analysis engine; Streamlit is the deployment surface.
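The difference between reactive execution and a full-script rerun can be sketched in plain Python. This is a toy model under stated assumptions, not the implementation of either tool: `ReactiveNotebook`, `cell`, `run_all`, and `update` are hypothetical names, and dependencies are declared by hand rather than inferred from the code.

```python
from typing import Callable, Dict, List, Set, Tuple


class ReactiveNotebook:
    """Toy model of dependency-driven cell execution (hypothetical API)."""

    def __init__(self) -> None:
        # cell name -> (names of cells it reads, function computing its value)
        self.cells: Dict[str, Tuple[Tuple[str, ...], Callable]] = {}
        self.values: Dict[str, object] = {}
        self.run_log: List[str] = []

    def cell(self, name: str, reads: Tuple[str, ...], fn: Callable) -> None:
        self.cells[name] = (tuple(reads), fn)

    def _run(self, name: str) -> None:
        reads, fn = self.cells[name]
        self.values[name] = fn(*(self.values[r] for r in reads))
        self.run_log.append(name)

    def _downstream(self, name: str) -> Set[str]:
        # Collect every cell that depends on `name`, directly or transitively.
        found: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for other, (reads, _) in self.cells.items():
                if current in reads and other not in found:
                    found.add(other)
                    stack.append(other)
        return found

    def _ordered(self, subset: Set[str]) -> List[str]:
        # Depth-first topological order restricted to `subset`.
        order: List[str] = []
        seen: Set[str] = set()

        def visit(n: str) -> None:
            if n in seen:
                return
            seen.add(n)
            for dep in self.cells[n][0]:
                if dep in subset:
                    visit(dep)
            order.append(n)

        for n in sorted(subset):
            visit(n)
        return order

    def run_all(self) -> List[str]:
        """Full rerun: every cell executes, like rerunning a whole script."""
        self.run_log = []
        for n in self._ordered(set(self.cells)):
            self._run(n)
        return list(self.run_log)

    def update(self, name: str, fn: Callable) -> List[str]:
        """Reactive rerun: only the edited cell and its dependents execute."""
        reads, _ = self.cells[name]
        self.cells[name] = (reads, fn)
        self.run_log = []
        for n in self._ordered({name} | self._downstream(name)):
            self._run(n)
        return list(self.run_log)


nb = ReactiveNotebook()
nb.cell("raw", (), lambda: [3, 8, 15, 4])
nb.cell("threshold", (), lambda: 5)
nb.cell("filtered", ("raw", "threshold"), lambda raw, t: [x for x in raw if x > t])
nb.cell("total", ("filtered",), lambda filtered: sum(filtered))
nb.cell("label", (), lambda: "report")

full_rerun = nb.run_all()                            # all five cells run
reactive_rerun = nb.update("threshold", lambda: 10)  # only three cells run
```

After the threshold edit, `raw` and `label` are left alone because nothing they read has changed. A full-script rerun would execute them again on every interaction.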

Evidence vs Observable Framework

Audience split. SQL analysts → Evidence (SQL + Markdown, no JS). Developers → Observable (HTML/CSS/JS, D3-quality viz). Evidence is simpler; Observable is more powerful.

Data Formulator vs Marimo

Complementary, not competing. Data Formulator for AI-powered visualization iteration. Marimo for full data workflow. Different tools for different stages.


What changes this

PandasAI ships a major release with active commits → could jump to #3–4 if hallucination issues are fixed

Marimo adds native AI agent integration (beyond editor assist) → cements #1, potentially absorbs Data Formulator's niche

Streamlit adds reactive execution (cell-level) → direct threat to Marimo's #1 position

Evidence adds AI/LLM-powered SQL generation → becomes obvious choice for SQL-first teams, could reach #2

Hex open-sources core notebook → would immediately enter at #3–4

Plexe gets independent technical reviews confirming accuracy → could jump to #4–5

A major data MCP server (BigQuery, Snowflake) gains standalone traction → new entry at #3–5