Skyvern

active

Vision-LLM browser automation for enterprise workflows. Combines computer vision with LLM reasoning to handle websites it has never seen before. YC S23-backed; includes CAPTCHA solving, 2FA handling, and proxy networks.

Connector
Composite
Complexity
browser, web

Trust: 80/100
Stars: 21K+
Evidence: 6
Repo size: 492.8 MB

Videos

Reviews, tutorials, and comparisons from the community.

This Browser Agent Automates ANYTHING (N8N + Skyvern)

Ben AI · 2025-02-11

Repo health: 80/100

Last push: 14h ago
Open issues: 151
Forks: 1,855
Contributors: 83

Editorial verdict

Best pick for enterprise workflow automation on websites without APIs — form filling, data entry, procurement. Overkill for developer/coding agent browser tasks.

Public evidence

moderate · 2026-03
WebVoyager benchmark: 85.85% (Steel.dev leaderboard)

Skyvern scored 85.85% on the WebVoyager benchmark: solid, but 3.25 percentage points below Browser Use (89.1%). The result lends support to the vision-LLM approach for enterprise automation.

Independent benchmark leaderboard · Steel.dev (independent)

moderate · 2026-03
Automateed.com independent review: 7/10 rating

Balanced review: strengths in the vision+LLM approach and natural language automation. Weaknesses in pricing opacity, a steep learning curve, and the AGPL license.

Independent review site · Automateed.com (independent)

Where it wins

Vision-LLM approach — handles websites never seen before, resilient to layout changes

Enterprise features: CAPTCHA solving, 2FA handling, proxy networks, geo-targeting

Multi-step workflow engine for complex business processes

YC S23 backed with $2.7M raised

Where to be skeptical

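AGPL-3.0 license: commercial use is permitted, but the copyleft terms (which also cover network-hosted deployments) can complicate proprietary use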

Enterprise/RPA focus — overkill for coding agent browser tasks

Python-first (the README also lists a TypeScript client, @skyvern/client)

Pricing opacity noted by independent reviewers

Raw GitHub source

GitHub README peek

Constrained peek so you can sanity-check the source material without leaving the site.

🐉 Automate Browser-based workflows using LLMs and Computer Vision 🐉

Skyvern automates browser-based workflows using LLMs and computer vision. It provides a Playwright-compatible SDK that adds AI functionality on top of Playwright, as well as a no-code workflow builder to help both technical and non-technical users automate manual workflows on any website, replacing brittle or unreliable automation solutions.

Traditional approaches to browser automation required writing custom scripts for each website, often relying on DOM parsing and XPath-based interactions that would break whenever the website layout changed.

Instead of relying only on code-defined XPath interactions, Skyvern relies on vision LLMs to learn and interact with websites.

How it works

Skyvern was inspired by the Task-Driven autonomous agent design popularized by BabyAGI and AutoGPT -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like Playwright.

Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:

[Figure: Skyvern 2.0 system diagram]

This approach has a few advantages:

  1. Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
  2. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
  3. Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow

A detailed technical report can be found here.

Demo

https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f

Quickstart

Skyvern Cloud

Skyvern Cloud is a managed cloud version of Skyvern that allows you to run Skyvern without worrying about the infrastructure. It allows you to run multiple Skyvern instances in parallel and comes bundled with anti-bot detection mechanisms, proxy network, and CAPTCHA solvers.

If you'd like to try it out, navigate to app.skyvern.com and create an account.

Run Locally (UI + Server)

Choose your preferred setup method:

Option A: pip install (Recommended)

Dependencies needed:

  • Python 3.11.x (3.12 also works; 3.13 is not yet supported)
  • NodeJS & NPM

Additionally, for Windows:

  • Rust
  • VS Code with C++ dev tools and Windows SDK
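The interpreter requirement listed above can be checked before installing. The following is a minimal sketch using only the standard library; the function name `is_supported_python` is hypothetical, and the accepted range simply encodes the versions named in the dependency list (3.11 and 3.12, not 3.13) rather than any official check.

```python
import sys


def is_supported_python(version: tuple[int, int]) -> bool:
    """Return True for the interpreter versions listed above (3.11 or 3.12)."""
    # Hypothetical helper: the bounds mirror the dependency list, not an official check.
    return (3, 11) <= version < (3, 13)


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    status = "supported" if is_supported_python(current) else "not listed as supported"
    print(f"Python {current[0]}.{current[1]}: {status}")
```

Running the script with the interpreter you plan to use for `pip install skyvern` reports whether that interpreter falls inside the listed range.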

1. Install Skyvern

pip install skyvern

2. Run Skyvern

skyvern quickstart

Option B: Docker Compose
  1. Install Docker Desktop
  2. Clone the repository:
    git clone https://github.com/skyvern-ai/skyvern.git && cd skyvern
    
  3. Run quickstart with Docker Compose:
    pip install skyvern && skyvern quickstart
    
    When prompted, choose "Docker Compose" for the full containerized setup.
  4. Navigate to http://localhost:8080

SDK

Skyvern is a Playwright extension that adds AI-powered browser automation. It gives you the full power of Playwright with additional AI capabilities—use natural language prompts to interact with elements, extract data, and automate complex multi-step workflows.

Installation:

  • Python: pip install skyvern then run skyvern quickstart for local setup
  • TypeScript: npm install @skyvern/client

AI-Powered Page Commands

Skyvern adds four core AI commands directly on the page object:

| Command | Description |
| --- | --- |
| page.act(prompt) | Perform actions using natural language (e.g., "Click the login button") |
| page.extract(prompt, schema) | Extract structured data from the page with optional JSON schema |
| page.validate(prompt) | Validate page state, returns bool (e.g., "Check if user is logged in") |
| page.prompt(prompt, schema) | Send arbitrary prompts to the LLM with optional response schema |
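To illustrate the optional schema argument of page.extract, here is a minimal sketch of a JSON-Schema-style dictionary plus a stdlib-only check of a result against it. The invoice fields, the `missing_required` helper, and the sample result are all hypothetical; the call to page.extract appears only as a comment because its exact return shape is an assumption here.

```python
# Hypothetical schema describing the data we want back from the page.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "due_date": {"type": "string"},
    },
    "required": ["invoice_number", "total"],
}


def missing_required(result: dict, schema: dict) -> list[str]:
    """List the schema's required keys that are absent from an extracted result."""
    return [key for key in schema.get("required", []) if key not in result]


# Assumed usage inside an async function with a Skyvern page object:
#   result = await page.extract("Get the invoice details", INVOICE_SCHEMA)
# A stand-in result is used here so the sketch runs without a browser.
sample_result = {"invoice_number": "INV-001", "total": 129.5}
print(missing_required(sample_result, INVOICE_SCHEMA))  # prints []
print(missing_required({"total": 10}, INVOICE_SCHEMA))  # prints ['invoice_number']
```

Checking required keys after extraction is a cheap guard against partially filled results before the data is passed downstream.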

Additionally, page.agent provides higher-level workflow commands:

| Command | Description |
| --- | --- |
| page.agent.run_task(prompt) | Execute complex multi-step tasks |
| page.agent.login(credential_type, credential_id) | Authenticate with stored credentials (Skyvern, Bitwarden, 1Password) |
| page.agent.download_files(prompt) | Navigate and download files |
| page.agent.run_workflow(workflow_id) | Execute pre-built workflows |
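The page.agent commands can be chained into a login-then-task flow. The sketch below uses a recording stand-in (`FakeAgent`, hypothetical) instead of a real Skyvern page so that it runs without a browser. The credential values, the prompts, and the `download_report` helper are illustrative assumptions; only the method and parameter names come from the table above.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, tuple]] = []

    async def login(self, credential_type: str, credential_id: str) -> None:
        self.calls.append(("login", (credential_type, credential_id)))

    async def run_task(self, prompt: str) -> None:
        self.calls.append(("run_task", (prompt,)))

    async def download_files(self, prompt: str) -> None:
        self.calls.append(("download_files", (prompt,)))


async def download_report(agent, credential_id: str) -> None:
    """Log in with a stored credential, navigate, then download a file."""
    # "skyvern" as a credential_type value is an assumption for illustration.
    await agent.login("skyvern", credential_id)
    await agent.run_task("Open the billing page and select last month")
    await agent.download_files("Download the invoice PDF")


agent = FakeAgent()
asyncio.run(download_report(agent, "cred_123"))
print([name for name, _ in agent.calls])
# prints ['login', 'run_task', 'download_files']
```

With a real page, the `agent` argument would be `page.agent`; keeping the flow in a function that takes the agent as a parameter also makes it easy to test with a stand-in like the one above.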

AI-Augmented Playwright Actions

All standard Playwright actions support an optional prompt parameter for AI-powered element location:

| Action | Playwright | AI-Augmented |
| --- | --- | --- |
| Click | page.click("#btn") | page.click(prompt="Click login button") |
| Fill | page.fill("#email", "a@b.com") | page.fill(prompt="Email field", value="a@b.com") |
| Select | page.select_option("#country", "US") | page.select_option(prompt="Country dropdown", value="US") |
| Upload | page.upload_file("#file", "doc.pdf") | page.upload_file(prompt="Upload area", files="doc.pdf") |
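One pattern the two call forms in the table make possible is trying a selector first and using a prompt only when the selector fails. The sketch below is a hand-rolled illustration of that idea, not Skyvern's own mechanism: `FakePage` is a hypothetical stand-in whose click accepts either a selector or a prompt, and `click_with_fallback` is an assumed helper name.

```python
import asyncio
from typing import Optional


class FakePage:
    """Hypothetical stand-in for a page whose click accepts a selector or a prompt."""

    def __init__(self, known_selectors: set[str]) -> None:
        self.known_selectors = known_selectors
        self.log: list[str] = []

    async def click(self, selector: Optional[str] = None, *, prompt: Optional[str] = None) -> None:
        if selector is not None:
            if selector not in self.known_selectors:
                raise LookupError(f"no element matches {selector!r}")
            self.log.append(f"selector:{selector}")
        else:
            self.log.append(f"prompt:{prompt}")


async def click_with_fallback(page, selector: str, prompt: str) -> None:
    """Try the deterministic selector first; fall back to the natural-language prompt."""
    try:
        await page.click(selector)
    except Exception:
        # A real page would raise a library-specific error (for Playwright,
        # typically a timeout); catch that narrower type in production code.
        await page.click(prompt=prompt)


async def main() -> list[str]:
    page = FakePage(known_selectors={"#btn"})
    await click_with_fallback(page, "#btn", "Click login button")
    await click_with_fallback(page, "#renamed-btn", "Click login button")
    return page.log


log = asyncio.run(main())
print(log)
# prints ['selector:#btn', 'prompt:Click login button']
```

The selector path is fast and deterministic while it works, and the prompt path absorbs layout changes such as a renamed element ID.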

Three interaction modes:

# 1. Traditional Playwright - CSS/XPath selectors
await page.click("#submit-button")

# 2. AI-powered - natural language