Very high engagement on initial launch. Community validated the vision-LLM approach for browser automation.
Skyvern
active
Vision-LLM browser automation for enterprise workflows. Combines computer vision with LLM reasoning to handle websites never seen before. YC S23 backed; supports CAPTCHA solving, 2FA, and proxy networks.
Trust: 80/100
Stars: 21K+
Evidence: 6
Repo size: 492.8 MB
Videos
Reviews, tutorials, and comparisons from the community.
This Browser Agent Automates ANYTHING (N8N + Skyvern)
Repo health
Last push: 14h ago
Open issues: 151
Forks: 1,855
Contributors: 83
Editorial verdict
Best pick for enterprise workflow automation on websites without APIs — form filling, data entry, procurement. Overkill for developer/coding agent browser tasks.
Source
GitHub: Skyvern-AI/skyvern
Docs: docs.skyvern.com
Public evidence
Second major HN appearance with sustained community interest. YC S23 batch backing adds institutional credibility.
Skyvern scored 85.85% on the WebVoyager benchmark. Solid but below Browser Use (89.1%). Validates the vision-LLM approach for enterprise automation.
Now at v1.x (production-ready). AGPL-3.0 license. Active weekly releases. Enterprise-focused with CAPTCHA, 2FA, proxy support.
Balanced review: strengths in vision+LLM approach and natural language automation. Weaknesses in pricing opacity, steep learning curve, and AGPL license.
Skyvern achieved 64.4% success rate vs Browser Use Cloud's 43.9%. Skyvern won on reliability while Browser Use won on speed (2x faster) and cost (2x cheaper per task).
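The reliability-versus-cost trade-off in the last item can be made concrete with a rough expected-cost calculation. This is a back-of-the-envelope sketch, not a figure reported by the comparison itself. It assumes that "2x cheaper per task" means per attempt, that a failed attempt is simply retried, and that attempts are independent, so the expected number of attempts per success is 1 / success rate. The unit cost of 1.0 for Skyvern is arbitrary.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to obtain one successful run, assuming independent retries."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Assumption: Skyvern costs 1.0 unit per attempt, Browser Use Cloud half of that.
skyvern = cost_per_success(1.0, 0.644)
browser_use = cost_per_success(0.5, 0.439)

print(f"Skyvern: {skyvern:.2f}  Browser Use Cloud: {browser_use:.2f}")
# prints "Skyvern: 1.55  Browser Use Cloud: 1.14"
```

Under these assumptions the lower per-attempt price still outweighs the lower success rate. The picture changes if a failed run has downstream costs, such as a half-submitted form that needs manual cleanup.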
Where it wins
Vision-LLM approach — handles websites never seen before, resilient to layout changes
Enterprise features: CAPTCHA solving, 2FA handling, proxy networks, geo-targeting
Multi-step workflow engine for complex business processes
YC S23 backed with $2.7M raised
Where to be skeptical
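AGPL-3.0 license: copyleft terms (source disclosure for modified versions, including ones offered over a network) can complicate commercial adoption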
Enterprise/RPA focus — overkill for coding agent browser tasks
Python-only
Pricing opacity noted by independent reviewers
Similar skills

Chrome DevTools MCP
Score: 86. Google Chrome team's official MCP server for Chrome DevTools. Gives coding agents deep debugging, performance profiling, and Core Web Vitals analysis through 26 tools across 6 categories.
Playwright MCP
Score: 84. Microsoft's official MCP server for Playwright. Uses accessibility snapshots instead of screenshots for structured browser control. Auto-configured in GitHub Copilot's Coding Agent.
Vercel Agent Browser
Score: 80. Token-efficient browser automation CLI for AI agents. Rust core with sub-50ms boot. Claims 93% context reduction vs Playwright MCP through ref-based element selection on accessibility snapshots.
Browser Use
Score: 76. Python library for controlling a real browser with vision and DOM extraction, built for agent workflows.
Raw GitHub source
GitHub README peek
Constrained peek so you can sanity-check the source material without leaving the site.
Skyvern automates browser-based workflows using LLMs and computer vision. It provides a Playwright-compatible SDK that adds AI functionality on top of Playwright, as well as a no-code workflow builder to help both technical and non-technical users automate manual workflows on any website, replacing brittle or unreliable automation solutions.
<p align="center"> <img src="https://raw.githubusercontent.com/Skyvern-AI/skyvern/main/fern/images/geico_shu_recording_cropped.gif"/> </p>
Traditional approaches to browser automation required writing custom scripts for each website, often relying on DOM parsing and XPath-based interactions that would break whenever the website layout changed.
Instead of only relying on code-defined XPath interactions, Skyvern relies on Vision LLMs to learn and interact with the websites.
How it works
Skyvern was inspired by the Task-Driven autonomous agent design popularized by BabyAGI and AutoGPT -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like Playwright.
Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:
<picture> <source media="(prefers-color-scheme: dark)" srcset="fern/images/skyvern_2_0_system_diagram.png" /> <img src="https://raw.githubusercontent.com/Skyvern-AI/skyvern/main/fern/images/skyvern_2_0_system_diagram.png" /> </picture>
This approach has a few advantages:
- Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
- Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
- Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow
A detailed technical report can be found here.
Demo
https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
Quickstart
Skyvern Cloud
Skyvern Cloud is a managed cloud version of Skyvern that allows you to run Skyvern without worrying about the infrastructure. It allows you to run multiple Skyvern instances in parallel and comes bundled with anti-bot detection mechanisms, proxy network, and CAPTCHA solvers.
If you'd like to try it out, navigate to app.skyvern.com and create an account.
Run Locally (UI + Server)
Choose your preferred setup method:
Option A: pip install (Recommended)
Dependencies needed:
- Python 3.11.x (3.12 also works; 3.13 is not yet supported)
- NodeJS & NPM
Additionally, for Windows:
- Rust
- VS Code with C++ dev tools and Windows SDK
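Before installing, the Python requirement above can be checked with a few lines. This is a minimal sketch that only encodes the versions named in the dependency list (3.11 and 3.12 accepted, everything else rejected). The helper name `python_supported` is made up for illustration and is not part of Skyvern.

```python
import sys

# Minor versions taken from the dependency list: 3.11.x is the target,
# 3.12 is reported to work, 3.13 is not ready yet.
SUPPORTED_MINORS = {11, 12}


def python_supported(version=None) -> bool:
    """Return True if the (major, minor, ...) tuple is one the list accepts."""
    major, minor = (version or sys.version_info)[:2]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    status = "ok" if python_supported() else "unsupported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```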
1. Install Skyvern
pip install skyvern
2. Run Skyvern
skyvern quickstart
Option B: Docker Compose
- Install Docker Desktop
- Clone the repository:
git clone https://github.com/skyvern-ai/skyvern.git && cd skyvern
- Run quickstart with Docker Compose:
pip install skyvern && skyvern quickstart
When prompted, choose "Docker Compose" for the full containerized setup.
- Navigate to http://localhost:8080
SDK
Skyvern is a Playwright extension that adds AI-powered browser automation. It gives you the full power of Playwright with additional AI capabilities—use natural language prompts to interact with elements, extract data, and automate complex multi-step workflows.
Installation:
- Python: pip install skyvern, then run skyvern quickstart for local setup
- TypeScript: npm install @skyvern/client
AI-Powered Page Commands
Skyvern adds four core AI commands directly on the page object:
| Command | Description |
|---|---|
| page.act(prompt) | Perform actions using natural language (e.g., "Click the login button") |
| page.extract(prompt, schema) | Extract structured data from the page with optional JSON schema |
| page.validate(prompt) | Validate page state, returns bool (e.g., "Check if user is logged in") |
| page.prompt(prompt, schema) | Send arbitrary prompts to the LLM with optional response schema |
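The commands in the table compose naturally: act, confirm the resulting state, then extract. The sketch below shows that calling pattern only. This excerpt does not show how a real Skyvern page is obtained, so the code uses a hypothetical `FakePage` stand-in that mirrors just the method names and argument order from the table. Its return values are canned, and treating the methods as coroutines is an assumption based on the `await page.click(...)` style used elsewhere in the README.

```python
import asyncio
from typing import Any, Optional


class FakePage:
    """Hypothetical stand-in for a Skyvern page; not the real SDK class."""

    def __init__(self) -> None:
        self.calls: list[str] = []  # records which commands were invoked

    async def act(self, prompt: str) -> None:
        self.calls.append("act")

    async def extract(self, prompt: str, schema: Optional[dict] = None) -> dict[str, Any]:
        self.calls.append("extract")
        return {"order_total": "42.00"}  # canned value, not real output

    async def validate(self, prompt: str) -> bool:
        self.calls.append("validate")
        return True  # canned value

    async def prompt(self, prompt: str, schema: Optional[dict] = None) -> dict[str, Any]:
        self.calls.append("prompt")
        return {"answer": "stub"}  # canned value


async def login_and_extract(page: FakePage) -> dict[str, Any]:
    """Act, confirm the resulting page state, then pull structured data."""
    await page.act("Click the login button")
    if not await page.validate("Check if user is logged in"):
        raise RuntimeError("login did not succeed")
    schema = {"type": "object", "properties": {"order_total": {"type": "string"}}}
    return await page.extract("Get the order total", schema)


page = FakePage()
summary = asyncio.run(login_and_extract(page))
print(page.calls, summary)
```

Gating the extraction on `validate` keeps a failed action from silently producing data scraped from the wrong page.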
Additionally, page.agent provides higher-level workflow commands:
| Command | Description |
|---|---|
| page.agent.run_task(prompt) | Execute complex multi-step tasks |
| page.agent.login(credential_type, credential_id) | Authenticate with stored credentials (Skyvern, Bitwarden, 1Password) |
| page.agent.download_files(prompt) | Navigate and download files |
| page.agent.run_workflow(workflow_id) | Execute pre-built workflows |
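A typical sequence with the agent commands might be: authenticate, run a multi-step task, then download the results. The sketch below illustrates only that ordering, under stated assumptions. `FakeAgent` and `FakeAgentPage` are hypothetical stand-ins rather than SDK classes, the methods are assumed to be coroutines, and the credential values `"bitwarden"` and `"cred_123"` are placeholders, since the accepted `credential_type` values are not listed in this excerpt.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in for page.agent; not the real SDK class."""

    def __init__(self) -> None:
        self.log: list[tuple[str, tuple]] = []  # (command, arguments) in call order

    async def login(self, credential_type: str, credential_id: str) -> None:
        self.log.append(("login", (credential_type, credential_id)))

    async def run_task(self, prompt: str) -> None:
        self.log.append(("run_task", (prompt,)))

    async def download_files(self, prompt: str) -> None:
        self.log.append(("download_files", (prompt,)))

    async def run_workflow(self, workflow_id: str) -> None:
        self.log.append(("run_workflow", (workflow_id,)))


class FakeAgentPage:
    """Hypothetical page object exposing only the agent namespace."""

    def __init__(self) -> None:
        self.agent = FakeAgent()


async def fetch_invoices(page: FakeAgentPage) -> None:
    # Placeholder credential values; the real identifiers are not in this excerpt.
    await page.agent.login("bitwarden", "cred_123")
    await page.agent.run_task("Open the billing section and filter to last month")
    await page.agent.download_files("Download every invoice PDF")


page = FakeAgentPage()
asyncio.run(fetch_invoices(page))
print([name for name, _ in page.agent.log])
```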
AI-Augmented Playwright Actions
All standard Playwright actions support an optional prompt parameter for AI-powered element location:
| Action | Playwright | AI-Augmented |
|---|---|---|
| Click | page.click("#btn") | page.click(prompt="Click login button") |
| Fill | page.fill("#email", "a@b.com") | page.fill(prompt="Email field", value="a@b.com") |
| Select | page.select_option("#country", "US") | page.select_option(prompt="Country dropdown", value="US") |
| Upload | page.upload_file("#file", "doc.pdf") | page.upload_file(prompt="Upload area", files="doc.pdf") |
Three interaction modes:
# 1. Traditional Playwright - CSS/XPath selectors
await page.click("#submit-button")
# 2. AI-powered - natural language