browser-use

Name: browser-use
Author: vm0-ai/vm0-skills

$npx mdskill add vm0-ai/vm0-skills/browser-use

Execute AI agents in browsers to automate complex web tasks.

Handles natural language commands for multi-step web interactions.
Depends on Browser Use Cloud API for hosted browser execution.
Decides actions by interpreting user intent into browser steps.
Delivers results via task completion status and live session streams.

SKILL.md

.github/skills/browser-useView on GitHub ↗

---
name: browser-use
description: Browser Use Cloud API for AI-powered browser automation. Use when user mentions "Browser Use", "browser automation", "web task", "AI agent browser", "run browser task", or "automated browser session".
---

# Browser Use

Browser Use Cloud runs AI agents in hosted browsers to complete web tasks. You submit a natural-language task prompt; the agent navigates the browser, interacts with pages, and returns a result.

> Official docs: `https://docs.browser-use.com/cloud/api-reference`

---

## When to Use

Use this skill when you need to:

- Run an AI agent to complete a web task (e.g. "search for X and return results")
- Automate multi-step browser interactions via natural language
- Check task status or retrieve results from a completed browser session
- View a live stream of an agent working in a browser
- Manage browser sessions and billing

---

## Prerequisites

Connect the **Browser Use** connector at [app.vm0.ai/connectors](https://app.vm0.ai/connectors).

> **Troubleshooting:** If requests fail, run `zero doctor check-connector --env-name BROWSER_USE_TOKEN` or `zero doctor check-connector --url https://api.browser-use.com/api/v2/tasks --method GET`

---

## How to Use

### 1. Run a Task

Submit a natural-language task. The agent will open a browser and complete it.

Write to `/tmp/browser_use_task.json`:

```json
{
  "task": "Search for the top Hacker News post and return the title and URL."
}
```

```bash
curl -s -X POST "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d @/tmp/browser_use_task.json
```

Returns `{"id": "<task-id>", "sessionId": "<session-id>"}`.

### 2. Get Task Status and Result

Poll until `status` is `"finished"` or `"failed"`. The `output` field contains the agent's result.

```bash
curl -s "https://api.browser-use.com/api/v2/tasks/<task-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

Task `status` values: `created`, `running`, `paused`, `finished`, `failed`, `stopped`.

### 3. Watch a Live Session

Get the live browser stream URL from the session.

```bash
curl -s "https://api.browser-use.com/api/v2/sessions/<session-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

The response includes `"liveUrl"` — open it in a browser to watch the agent in real time.

### 4. Stop a Session

```bash
curl -s -X PATCH "https://api.browser-use.com/api/v2/sessions/<session-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d '{"action": "stop"}'
```

### 5. List Tasks

```bash
curl -s "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

Optional query params: `pageSize`, `pageNumber`, `sessionId`, `status`.

### 6. Run a Task with Advanced Settings

Customize the model, timeouts, and browser behavior.

Write to `/tmp/browser_use_task.json`:

```json
{
  "task": "Go to linkedin.com and find the CEO of Anthropic.",
  "browserSettings": {
    "viewport": {
      "width": 1280,
      "height": 800
    }
  },
  "maxSteps": 30,
  "useVision": true
}
```

```bash
curl -s -X POST "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d @/tmp/browser_use_task.json
```

### 7. Get Account Billing

Check credit balances and plan information.

```bash
curl -s "https://api.browser-use.com/api/v2/billing/account" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

Returns `monthlyCreditsBalanceUsd`, `additionalCreditsBalanceUsd`, `totalCreditsBalanceUsd`, and plan details.

---

## Guidelines

1. **Poll for results**: Tasks are async — poll `GET /api/v2/tasks/<task-id>` until `status` is `finished` or `failed`.
2. **Sessions persist**: After a task finishes, the session stays open. Stop it explicitly with PATCH if you don't need it.
3. **Task writing**: Clear, specific task prompts produce better results. Include the exact URL if applicable.
4. **Credits**: Each task consumes credits. Check balance via the billing endpoint before running large batches.
5. **Session limit**: Sessions are capped at 15 minutes of total runtime.

More from vm0-ai/vm0-skills

Skill	Description
account-reconciliation	Perform account reconciliations comparing general ledger balances against subledgers, bank statements, or external records. Use for bank reconciliation, GL-to-subledger reconciliation, intercompany reconciliation, balance sheet reconciliation, reconciling item analysis, outstanding item aging, or clearing open items.
agentphone	Build AI phone agents with AgentPhone API. Use when the user wants to make phone calls, send/receive SMS, manage phone numbers, create voice agents, set up webhooks, or check usage — anything related to telephony, phone numbers, or voice AI.
ahrefs	Ahrefs SEO API for backlink and keyword analysis. Use when user mentions
amplitude	Amplitude product analytics API. Use when user mentions "Amplitude",
analysis-qa	Quality-check a data analysis before sharing — verify joins, aggregations, denominators, time ranges, and metric definitions. Detect pitfalls like survivorship bias, average-of-averages, join explosion, timezone mismatches, incomplete periods, and selection bias. Includes documentation templates for reproducible analyses.
anthropic-managed-agents	Anthropic Managed Agents API for programmatically creating, running, and streaming AI agents on Anthropic's cloud infrastructure. Use when the user mentions "Managed Agents", "Anthropic agent sessions", or needs to create/run/stream an Anthropic agent with tool use (bash, git, web), attach GitHub repositories, or inject secrets via Vault. Do NOT use for standard Claude Messages API — use the Claude API skill instead.
apify	Apify web scraping platform. Use when user mentions "scrape website",
asana	Asana API for tasks and projects. Use when user mentions "Asana", "asana.com",
atlassian	Atlassian API for Confluence and Jira. Use when user mentions "Confluence
attio	Attio REST API for AI-native CRM operations — manage companies, people, deals, and custom objects, plus notes, tasks, lists, and comments. Use when the user mentions "Attio", "CRM record", "create company", "add person", "list entry", "CRM note", or "CRM task".