PandasAI

stale

Conversational data analysis on pandas DataFrames. The project has 23.4K stars but has gone 5 months without commits (last push Oct 2025), and independent reviews document hallucination and black-box output problems.

Score: 72 (stale)

Where it wins

High name recognition — 23.4K stars, 263K monthly downloads

Simple API: ask questions about your data in natural language

112 contributors built up over the project's history

Where to be skeptical

5 months without a single commit (last push Oct 2025) — biggest red flag in the category

Documented hallucination: 'LLM sometimes generates a fake dataset that looks like yours and analyzes that instead'

Black-box outputs: doesn't show generated code alongside answers

Only uses first 5 rows for structure inference

Custom license (not Apache/MIT): the terms need a legal review before production use

Only 77 pts on HN (single story) despite being the most-starred conversational data tool

Downloads (263K/mo) are roughly one-seventh of Marimo's (1.9M/mo) even though PandasAI has more stars (23.4K vs 19.8K), which signals declining relevance
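
The download comparison above can be checked with a few lines of arithmetic. The sketch below uses only figures quoted on this page (PandasAI: 23.4K stars, 263K monthly downloads; Marimo: 19.8K stars, 1.9M monthly PyPI downloads). The dictionary layout and the downloads-per-star ratio are illustrative choices, not a metric this listing defines.

```python
# Figures as quoted on this page; rounded values, so results are approximate.
pandasai = {"stars": 23_400, "monthly_downloads": 263_000}
marimo = {"stars": 19_800, "monthly_downloads": 1_900_000}


def downloads_per_star(project):
    """Monthly downloads divided by GitHub stars (an illustrative ratio)."""
    return project["monthly_downloads"] / project["stars"]


# How many times more monthly downloads Marimo has than PandasAI.
download_ratio = marimo["monthly_downloads"] / pandasai["monthly_downloads"]

print(f"Marimo / PandasAI downloads: {download_ratio:.1f}x")
print(f"PandasAI downloads per star: {downloads_per_star(pandasai):.1f}")
print(f"Marimo downloads per star:   {downloads_per_star(marimo):.1f}")
```

Downloads per star is a crude proxy for active use relative to name recognition, and it ignores CI traffic and mirrors, so treat it as a rough signal only.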

Editorial verdict

High name recognition but stale and problematic. 5 months without commits, documented hallucination risk, custom license. Good for quick throwaway exploration only — not suitable for anything where correctness matters.

Source

GitHub: sinaptik-ai/pandas-ai

Videos

Reviews, tutorials, and comparisons from the community.

MySQL with PandasAI

Tirendaz Academy·2024

PandasAI Agent

Tirendaz Academy·2024

Introduction to Pandas AI

Tirendaz Academy·2024

Data & Analytics

#07 of 7

Quick, throwaway data exploration where you accept hallucination risk — not for anything where correctness matters

Marimo

Reactive Python notebook that replaces Jupyter. Pure .py files, reactive DAG execution, dual-mode (notebook → app). 19.8K stars, 1.9M monthly PyPI downloads, 261 contributors.

Streamlit

The dominant Python data app framework. 44K stars, 31.8M monthly PyPI downloads, acquired by Snowflake for $800M. Ecosystem giant for deploying data apps — the standard answer for sharing Python analysis as a web app.

Observable Framework

Static site generator for data apps with D3.js lineage. Full web dev power (HTML, CSS, JS, React). 3.4K stars, 16.7K npm monthly downloads, 360 pts on HN.

Data Formulator

AI-powered data visualization tool from Microsoft Research. Interactive AI agents iterate on chart design from raw data. 15.1K stars, MIT license, very active development.

Public evidence

strong · 2024-2025

Testing the Limits of PandasAI — detailed limitations analysis

LLM sometimes generates a fake dataset that looks like yours and analyzes that instead. Hallucination risk confirmed by independent testing.

Independent data science publication · Future Proof DS (independent)
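
One rough way to screen for the fabricated-dataset failure described above is to scan generated code for a DataFrame built from literal values instead of from your data. The sketch below assumes you can obtain the generated code as a string, which this listing says PandasAI does not show alongside answers. The function `builds_inline_dataframe` and both sample snippets (including the `dfs` variable name) are hypothetical, and the check is a heuristic that can miss other ways of inventing data.

```python
import ast


def builds_inline_dataframe(code: str) -> bool:
    """Return True if `code` calls a DataFrame constructor on a literal dict or list."""
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `pd.DataFrame(...)` (Attribute) and `DataFrame(...)` (Name).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "DataFrame":
            continue
        arguments = list(node.args) + [kw.value for kw in node.keywords]
        if any(isinstance(arg, (ast.Dict, ast.List)) for arg in arguments):
            return True
    return False


# Hypothetical generated code that invents its own data.
suspicious = (
    "import pandas as pd\n"
    "df = pd.DataFrame({'country': ['A', 'B'], 'sales': [1, 2]})\n"
    "result = df['sales'].sum()\n"
)
# Hypothetical generated code that works on a frame it was handed.
plausible = "result = dfs[0]['sales'].sum()\n"

print(builds_inline_dataframe(suspicious))
print(builds_inline_dataframe(plausible))
```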

moderate · 2024-2025

Black box problem documented across multiple sources

No code output alongside answers; only uses first 5 rows for structure inference; complex multi-part queries get misinterpreted.

Multiple independent sources · Restack.io, Medium/@daniele.ongari (independent)
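
To see why inferring structure from only the first 5 rows can mislead, the plain-pandas sketch below builds a small frame whose first five rows look clean while later rows do not. The data and column names are made up for illustration, and the code makes no claim about how PandasAI itself samples rows; it only shows what any 5-row preview would hide.

```python
import pandas as pd

# Hypothetical data: the problems only appear after row 5.
df = pd.DataFrame({
    "amount": [10, 20, 30, 40, 50, "N/A", 70],
    "region": ["north"] * 5 + [None, "south"],
})

head = df.head(5)

# The preview parses cleanly as numbers; the full column does not.
head_looks_numeric = bool(pd.to_numeric(head["amount"], errors="coerce").notna().all())
full_is_numeric = bool(pd.to_numeric(df["amount"], errors="coerce").notna().all())

# The preview shows one category and no missing values; the full data differs.
head_regions = set(head["region"].dropna().unique())
full_regions = set(df["region"].dropna().unique())
missing_in_head = int(head["region"].isna().sum())
missing_in_full = int(df["region"].isna().sum())

print(head_looks_numeric, full_is_numeric)
print(head_regions, full_regions)
print(missing_in_head, missing_in_full)
```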

Raw GitHub source

GitHub README peek

Constrained peek so you can sanity-check the source material without leaving the site.

PandasAI is a Python library that makes it easy to ask questions of your data in natural language. It helps non-technical users interact with their data in a more natural way, and it helps technical users save time and effort when working with data.

🔧 Getting started

You can find the full documentation for PandasAI on the project's documentation site.

📚 Using the library

Python Requirements

Python version 3.8 or later, up to and including 3.11
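
A quick way to confirm that an interpreter falls inside this range before installing is sketched below. It assumes the requirement means any 3.8.x through 3.11.x release; `is_supported` is a hypothetical helper, not part of PandasAI.

```python
import sys

MIN_VERSION = (3, 8)   # lowest supported (major, minor)
MAX_VERSION = (3, 11)  # highest supported (major, minor), assumed inclusive


def is_supported(version=None):
    """Check a (major, minor, ...) tuple, defaulting to the running interpreter."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if not is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is outside 3.8 to 3.11")
```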

📦 Installation

You can install the PandasAI library using pip or poetry.

With pip:

pip install pandasai
pip install pandasai-litellm

With poetry:

poetry add pandasai
poetry add pandasai-litellm

💻 Usage

Ask questions

import pandasai as pai
from pandasai_litellm.litellm import LiteLLM

# Initialize LiteLLM with your OpenAI model
llm = LiteLLM(model="gpt-4.1-mini", api_key="YOUR_OPENAI_API_KEY")

# Configure PandasAI to use this LLM
pai.config.set({
    "llm": llm
})

# Load your data
df = pai.read_csv("data/companies.csv")

response = df.chat("What is the average revenue by region?")
print(response)

Or you can ask more complex questions:

df.chat(
    "What is the total sales for the top 3 countries by sales?"
)

The total sales for the top 3 countries by sales is 16500.
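
Because an answer like this arrives without the code that produced it, it is worth recomputing the figure directly. The following is a plain-pandas sketch of the same question. The sample data is an assumption made for illustration (the contents of `data/companies.csv` are not shown here), with values chosen so the total matches the 16500 reported above.

```python
import pandas as pd

# Hypothetical sample data; not the contents of data/companies.csv.
sales = pd.DataFrame({
    "country": [
        "United States", "United Kingdom", "France", "Germany", "Italy",
        "Spain", "Canada", "Australia", "Japan", "China",
    ],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
})

# Sum sales per country, keep the three largest, then total them.
top3 = sales.groupby("country")["sales"].sum().nlargest(3)
total = int(top3.sum())

print(top3)
print(total)
```

If the recomputed total and the chat answer disagree, trust the explicit computation and inspect the data the model actually used.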

Visualize charts

You can also ask PandasAI to generate charts for you:

df.chat(
    "Plot the histogram of countries showing for each one the gdp. Use different colors for each bar",
)

[Figure: chart generated by the prompt above]

Multiple DataFrames

You can also pass multiple dataframes to PandasAI and ask questions that relate them.

import pandasai as pai
from pandasai_litellm.litellm import LiteLLM

# Initialize LiteLLM with your OpenAI model
llm = LiteLLM(model="gpt-4.1-mini", api_key="YOUR_OPENAI_API_KEY")

# Configure PandasAI to use this LLM
pai.config.set({
    "llm": llm
})

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

employees_df = pai.DataFrame(employees_data)
salaries_df = pai.DataFrame(salaries_data)


pai.chat("Who gets paid the most?", employees_df, salaries_df)

Olivia gets paid the most.
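
The same answer can be verified without an LLM by joining the two tables explicitly. This sketch reuses the employee and salary data from the example above with plain pandas; the variable names `merged`, `top`, `top_name`, and `top_salary` are illustrative.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
})

salaries = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
})

# Join on the shared key; validate guards against duplicated IDs on either side.
merged = employees.merge(salaries, on="EmployeeID", how="inner", validate="one_to_one")

# Select the row with the highest salary.
top = merged.loc[merged["Salary"].idxmax()]
top_name = top["Name"]
top_salary = int(top["Salary"])

print(top_name, top_salary)
```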

Docker Sandbox

You can run PandasAI in a Docker sandbox, providing a secure, isolated environment to execute code safely and mitigate the risk of malicious attacks.

Python Requirements

pip install "pandasai-docker"

Usage

import pandasai as pai

View on GitHub →