LLM sometimes generates a fake dataset that looks like yours and analyzes that instead. Hallucination risk confirmed by independent testing.
PandasAI
Stale. Conversational data analysis on pandas DataFrames. 23.4K stars but 5 months without commits (last push Oct 2025). Documented hallucination and black-box output problems.

Where it wins
High name recognition — 23.4K stars, 263K monthly downloads
Simple API: ask questions about your data in natural language
112 contributors built up over the project's history
Where to be skeptical
5 months without a single commit (last push Oct 2025) — biggest red flag in the category
Documented hallucination: 'LLM sometimes generates a fake dataset that looks like yours and analyzes that instead'
Black-box outputs: doesn't show generated code alongside answers
Only uses first 5 rows for structure inference
Custom license (not Apache/MIT) — less trustworthy for production use
Only 77 pts on HN (single story) despite being the most-starred conversational data tool
Downloads (263K/mo) are about 7x lower than Marimo's (1.9M/mo) despite PandasAI having more stars, a signal of declining relevance
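To make the "first 5 rows" concern concrete, here is a minimal sketch in plain pandas (it does not call PandasAI). The DataFrame is hypothetical, invented for illustration: its first five rows look clean, while later rows contain values that a 5-row sample would never reveal.

```python
import pandas as pd

# Hypothetical data: the first 5 rows parse cleanly, the later rows do not.
df = pd.DataFrame({
    "revenue": ["100", "250", "175", "300", "220", "N/A", "1,200"],
    "region": ["EU", "EU", "US", "US", "APAC", None, "EU"],
})

head = df.head(5)  # what a 5-row structure sample would see

# Count values that cannot be parsed as numbers, in the sample and in full.
sample_bad = int(pd.to_numeric(head["revenue"], errors="coerce").isna().sum())
full_bad = int(pd.to_numeric(df["revenue"], errors="coerce").isna().sum())

# Count missing regions, in the sample and in full.
sample_nulls = int(head["region"].isna().sum())
full_nulls = int(df["region"].isna().sum())

print(f"unparseable revenue values: sample={sample_bad}, full={full_bad}")
print(f"missing regions: sample={sample_nulls}, full={full_nulls}")
```

Any tool that infers structure from a short head sample would see a clean numeric column here, so code generated from that picture may fail or silently drop rows on the full data.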
Editorial verdict
High name recognition but stale and problematic. 5 months without commits, documented hallucination risk, custom license. Good for quick throwaway exploration only — not suitable for anything where correctness matters.
Videos
Reviews, tutorials, and comparisons from the community.
MySQL with PandasAI
PandasAI Agent
Introduction to Pandas AI
Related

Marimo
93: Reactive Python notebook that replaces Jupyter. Pure .py files, reactive DAG execution, dual-mode (notebook → app). 19.8K stars, 1.9M monthly PyPI downloads, 261 contributors.

Streamlit
90: The dominant Python data app framework. 44K stars, 31.8M monthly PyPI downloads, acquired by Snowflake for $800M. Ecosystem giant for deploying data apps and the standard answer for sharing Python analysis as a web app.

Observable Framework
84: Static site generator for data apps with D3.js lineage. Full web dev power (HTML, CSS, JS, React). 3.4K stars, 16.7K npm monthly downloads, 360 pts on HN.

Data Formulator
82: AI-powered data visualization tool from Microsoft Research. Interactive AI agents iterate on chart design from raw data. 15.1K stars, MIT license, very active development.
Public evidence
No code output alongside answers; only uses first 5 rows for structure inference; complex multi-part queries get misinterpreted.
Raw GitHub source
GitHub README peek
Constrained peek so you can sanity-check the source material without leaving the site.
PandasAI is a Python library that makes it easy to ask questions of your data in natural language. It helps non-technical users interact with their data in a more natural way, and it helps technical users save time and effort when working with data.
🔧 Getting started
You can find the full documentation for PandasAI here.
📚 Using the library
Python Requirements
Python version 3.8 or later, up to and including 3.11 (3.8+ <=3.11)
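A quick way to check an interpreter against that range before installing is sketched below. The bounds come from the README excerpt; the helper name `is_supported` is hypothetical and not part of PandasAI.

```python
import sys

# Bounds taken from the README excerpt: 3.8 or newer, up to and including 3.11.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) version falls inside the stated range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if not is_supported():
    current = ".".join(str(part) for part in sys.version_info[:2])
    print(f"Python {current} is outside the range stated in the README")
```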
📦 Installation
You can install the PandasAI library using pip or poetry.
With pip:
pip install pandasai
pip install pandasai-litellm
With poetry:
poetry add pandasai
poetry add pandasai-litellm
💻 Usage
Ask questions
import pandasai as pai
from pandasai_litellm.litellm import LiteLLM
# Initialize LiteLLM with your OpenAI model
llm = LiteLLM(model="gpt-4.1-mini", api_key="YOUR_OPENAI_API_KEY")
# Configure PandasAI to use this LLM
pai.config.set({
    "llm": llm
})
# Load your data
df = pai.read_csv("data/companies.csv")
response = df.chat("What is the average revenue by region?")
print(response)
Or you can ask more complex questions:
df.chat(
    "What is the total sales for the top 3 countries by sales?"
)
The total sales for the top 3 countries by sales is 16500.
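Because the answer arrives without the code that produced it, one option is to recompute the same figure directly in pandas and compare. The sketch below uses a small hypothetical table (the README's data is not shown, so the countries and numbers here are invented and will not reproduce 16500); the column names `country` and `sales` are assumptions.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only.
sales = pd.DataFrame({
    "country": ["A", "B", "C", "D", "A", "B"],
    "sales": [5000, 3200, 2900, 4100, 1500, 800],
})

# Total sales per country, then the three largest totals.
per_country = sales.groupby("country")["sales"].sum()
top3 = per_country.nlargest(3)

# The figure a natural-language answer should match.
top3_total = int(top3.sum())
print(top3)
print(f"Total sales for the top 3 countries: {top3_total}")
```

If the conversational answer and the directly computed total disagree, the directly computed one is the figure you can audit line by line.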
Visualize charts
You can also ask PandasAI to generate charts for you:
df.chat(
    "Plot the histogram of countries showing for each one the gdp. Use different colors for each bar",
)

Multiple DataFrames
You can also pass in multiple dataframes to PandasAI and ask questions relating them.
import pandasai as pai
from pandasai_litellm.litellm import LiteLLM
# Initialize LiteLLM with your OpenAI model
llm = LiteLLM(model="gpt-4.1-mini", api_key="YOUR_OPENAI_API_KEY")
# Configure PandasAI to use this LLM
pai.config.set({
    "llm": llm
})
employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}
salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}
employees_df = pai.DataFrame(employees_data)
salaries_df = pai.DataFrame(salaries_data)
pai.chat("Who gets paid the most?", employees_df, salaries_df)
Olivia gets paid the most.
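For comparison, the same question can be answered in plain pandas, which makes the join and the selection explicit. This is a sketch using the two dictionaries from the README example; it does not call PandasAI, and the variable names are illustrative.

```python
import pandas as pd

# Same data as the README example above.
employees = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
})
salaries = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
})

# Join the two tables on their shared key, then pick the highest-salary row.
merged = employees.merge(salaries, on="EmployeeID", how="inner")
top = merged.loc[merged["Salary"].idxmax()]

print(f"{top['Name']} gets paid the most ({top['Salary']}).")
```

Here the direct computation agrees with the README's answer, and the join key and aggregation are visible rather than implied.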
Docker Sandbox
You can run PandasAI in a Docker sandbox, providing a secure, isolated environment to execute code safely and mitigate the risk of malicious attacks.
Python Requirements
pip install "pandasai-docker"
Usage
import pandasai as pai
