replicate

$ npx mdskill add vm0-ai/vm0-skills/replicate

Run open-source ML models via Replicate's HTTP API

  • Generates images or text from hosted open-source models
  • Depends on Replicate API and requires a connector setup
  • Executes async jobs by submitting version IDs and inputs
  • Delivers results through output URLs from completed predictions

SKILL.md

.github/skills/replicate
---
name: replicate
description: Replicate API for running open-source ML models in the cloud. Use when user mentions "Replicate", "run a model on Replicate", "AI image generation", "SDXL", "FLUX", "Llama", or "open-source ML inference".
---

# Replicate

Replicate lets you run open-source machine learning models via a simple HTTP API. Submit a prediction, poll until it completes, and retrieve the output URLs.

> Official docs: `https://replicate.com/docs/reference/http`

---

## When to Use

Use this skill when you need to:

- Generate images using SDXL, FLUX Schnell, or other diffusion models
- Run text generation with Llama or other open-source LLMs
- Execute any model hosted on replicate.com
- Poll the status of an async prediction job

---

## Prerequisites

Connect the **Replicate** connector at [app.vm0.ai/connectors](https://app.vm0.ai/connectors).

> **Troubleshooting:** If requests fail, run `zero doctor check-connector --env-name REPLICATE_TOKEN` or `zero doctor check-connector --url https://api.replicate.com/v1/models --method GET`
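
Before running any of the commands below, it can help to confirm that the token is actually present in the shell. The following is a minimal sketch, assuming the connector exposes the token as the `REPLICATE_TOKEN` environment variable (the name used in the troubleshooting command above); `require_replicate_token` is a hypothetical helper name, not part of the connector.

```bash
# Hypothetical helper (not part of the connector): fail early with a clear
# message when REPLICATE_TOKEN is missing or empty in the environment.
require_replicate_token() {
  if [ -z "${REPLICATE_TOKEN:-}" ]; then
    echo "REPLICATE_TOKEN is not set; connect the Replicate connector first." >&2
    return 1
  fi
}

# Usage: stop a script before it sends unauthenticated requests.
#   require_replicate_token || exit 1
```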

---

## How to Use

All predictions are asynchronous: submit a job, then poll until `status` reaches a terminal value (`succeeded`, `failed`, or `canceled`).

### 1. Run a Model by Version ID

Write to `/tmp/replicate_prediction.json`:

```json
{
  "version": "<model-version-id>",
  "input": {
    "prompt": "A photorealistic cat sitting on a chair"
  }
}
```

```bash
curl -s -X POST "https://api.replicate.com/v1/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_prediction.json | jq '{id, status, urls}'
```

The response includes a prediction `id` and a `urls.get` URL for polling.
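
One way to create the payload file and keep the prediction `id` for later polling is sketched below. It assumes a POSIX-style shell; `write_prediction_payload` is a hypothetical helper name, and the `<model-version-id>` placeholder is left as-is for you to replace.

```bash
# Hypothetical helper: write the version-pinned request body to a file.
# Defaults to the path used in this section.
write_prediction_payload() {
  local payload_path="${1:-/tmp/replicate_prediction.json}"
  # The quoted 'EOF' keeps the JSON literal (no shell expansion inside).
  cat > "$payload_path" <<'EOF'
{
  "version": "<model-version-id>",
  "input": {
    "prompt": "A photorealistic cat sitting on a chair"
  }
}
EOF
}

# Usage (replace <model-version-id> in the file before submitting):
#   write_prediction_payload
#   PREDICTION_ID=$(curl -s -X POST "https://api.replicate.com/v1/predictions" \
#     --header "Authorization: Bearer $REPLICATE_TOKEN" \
#     --header "Content-Type: application/json" \
#     -d @/tmp/replicate_prediction.json | jq -r '.id')
```

Storing the `id` in a variable avoids copying it by hand into the polling request.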

### 2. Run the Latest Version of a Model

Replace `<owner>` and `<model-name>` with the model's owner and name (e.g. `black-forest-labs` / `flux-schnell`).

Write to `/tmp/replicate_prediction.json`:

```json
{
  "input": {
    "prompt": "A photorealistic cat sitting on a chair"
  }
}
```

```bash
curl -s -X POST "https://api.replicate.com/v1/models/<owner>/<model-name>/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_prediction.json | jq '{id, status, urls}'
```
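
If you call this endpoint for several models, the placeholder substitution can be wrapped in small functions. This is only a sketch: `REPLICATE_API_BASE`, `model_predictions_url`, and `create_model_prediction` are illustrative names, not part of any Replicate tooling.

```bash
# Hypothetical helpers; the names are illustrative only.
REPLICATE_API_BASE="https://api.replicate.com/v1"

# Build the "latest version" predictions endpoint for <owner>/<model-name>.
model_predictions_url() {
  local owner="$1" model_name="$2"
  printf '%s/models/%s/%s/predictions\n' "$REPLICATE_API_BASE" "$owner" "$model_name"
}

# Submit a payload file to that endpoint and print the raw JSON response.
create_model_prediction() {
  local owner="$1" model_name="$2" payload_path="$3"
  curl -s -X POST "$(model_predictions_url "$owner" "$model_name")" \
    --header "Authorization: Bearer $REPLICATE_TOKEN" \
    --header "Content-Type: application/json" \
    -d @"$payload_path"
}

# Usage:
#   create_model_prediction black-forest-labs flux-schnell /tmp/replicate_prediction.json \
#     | jq '{id, status, urls}'
```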

### 3. Poll Prediction Status

Replace `<prediction-id>` with the `id` from the create response.

```bash
curl -s "https://api.replicate.com/v1/predictions/<prediction-id>" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '{id, status, output, error}'
```

Keep polling every 2–5 seconds until `status` is `succeeded`, `failed`, or `canceled`.

| `status`     | Meaning                                        |
|--------------|------------------------------------------------|
| `starting`   | Prediction is starting up (possible cold boot) |
| `processing` | Model is running                               |
| `succeeded`  | Output is ready in the `output` field          |
| `failed`     | Check the `error` field for details            |
| `canceled`   | Prediction was canceled                        |
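
The polling step can be automated with a small loop. The sketch below treats `succeeded`, `failed`, and `canceled` from the table as terminal states. The function names, the 3-second default interval, and the 100-attempt cap are assumptions for illustration, not Replicate requirements.

```bash
# Return 0 when a status from the table above is terminal (no more polling needed).
is_terminal_status() {
  case "$1" in
    succeeded|failed|canceled) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: poll a prediction until it reaches a terminal status,
# then print the final JSON. Interval and attempt cap are assumed defaults.
poll_prediction() {
  local prediction_id="$1" interval="${2:-3}" max_attempts="${3:-100}"
  local attempt=0 body status
  while [ "$attempt" -lt "$max_attempts" ]; do
    body=$(curl -s "https://api.replicate.com/v1/predictions/$prediction_id" \
      --header "Authorization: Bearer $REPLICATE_TOKEN")
    status=$(printf '%s' "$body" | jq -r '.status')
    if is_terminal_status "$status"; then
      printf '%s\n' "$body"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "Timed out waiting for prediction $prediction_id" >&2
  return 1
}

# Usage:
#   poll_prediction "<prediction-id>" | jq '{id, status, output, error}'
```

Capping the number of attempts keeps a stuck prediction from blocking a script indefinitely.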

### 4. Generate an Image with FLUX Schnell

Write to `/tmp/replicate_flux.json`:

```json
{
  "input": {
    "prompt": "A serene mountain lake at sunrise, photorealistic",
    "num_outputs": 1
  }
}
```

```bash
curl -s -X POST "https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_flux.json | jq '{id, status, urls}'
```

### 5. Generate an Image with Stability AI SDXL

Write to `/tmp/replicate_sdxl.json`:

```json
{
  "input": {
    "prompt": "A cyberpunk cityscape at night, neon lights, 4k",
    "negative_prompt": "blurry, low quality",
    "num_outputs": 1,
    "width": 1024,
    "height": 1024
  }
}
```

```bash
curl -s -X POST "https://api.replicate.com/v1/models/stability-ai/sdxl/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_sdxl.json | jq '{id, status, urls}'
```

### 6. Run a Text Generation Model (Llama 3 70B)

Write to `/tmp/replicate_llama.json`:

```json
{
  "input": {
    "prompt": "Explain quantum entanglement in simple terms.",
    "max_tokens": 512
  }
}
```

```bash
curl -s -X POST "https://api.replicate.com/v1/models/meta/meta-llama-3-70b-instruct/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_llama.json | jq '{id, status, urls}'
```

Text generation models return `output` as an array of string chunks. Poll until `status` is `succeeded`, then join the elements of `output` to get the full response.
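
Joining the chunks can be done with `jq` once the prediction JSON is available. This is a minimal sketch: `join_text_output` is a hypothetical helper name, and the fallback to an empty array (for a prediction whose `output` is still `null`) is an added assumption about how you want unfinished predictions handled.

```bash
# Hypothetical helper: read a prediction JSON document on stdin and print the
# joined text. Prints an empty line while `output` is still null.
join_text_output() {
  # -r prints the raw string instead of a JSON-quoted one.
  jq -r '(.output // []) | join("")'
}

# Usage:
#   curl -s "https://api.replicate.com/v1/predictions/<prediction-id>" \
#     --header "Authorization: Bearer $REPLICATE_TOKEN" | join_text_output
```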

### 7. List Recent Predictions

```bash
curl -s "https://api.replicate.com/v1/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '.results[] | {id, status, created_at, urls}'
```

### 8. List Public Models

```bash
curl -s "https://api.replicate.com/v1/models" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '.results[] | {url, description}'
```

### 9. Get Model Details

Replace `<owner>/<model-name>` with the model identifier.

```bash
curl -s "https://api.replicate.com/v1/models/<owner>/<model-name>" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '{url, description, latest_version}'
```
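
The `latest_version` object in this response carries the version `id` that a version-pinned request expects in its `version` field. A small sketch of extracting it follows; `latest_version_id` is a hypothetical helper name.

```bash
# Hypothetical helper: read model-details JSON on stdin and print the id of
# the model's latest version.
latest_version_id() {
  jq -r '.latest_version.id'
}

# Usage:
#   VERSION_ID=$(curl -s "https://api.replicate.com/v1/models/<owner>/<model-name>" \
#     --header "Authorization: Bearer $REPLICATE_TOKEN" | latest_version_id)
```

Recording this value lets you keep running the same version even after the model owner publishes a newer one.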

### 10. Run via a Deployment

Replace `<deployment-owner>` and `<deployment-name>` with the deployment's owner and name.

Write to `/tmp/replicate_deploy.json`:

```json
{
  "input": {
    "prompt": "A futuristic robot in a garden"
  }
}
```

```bash
curl -s -X POST "https://api.replicate.com/v1/deployments/<deployment-owner>/<deployment-name>/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_deploy.json | jq '{id, status, urls}'
```

---

## Guidelines

1. **Always poll after submit**: Predictions are async. Never assume instant completion; poll `GET /v1/predictions/<id>` until `status` is `succeeded`, `failed`, or `canceled`.
2. **Poll interval**: 2–5 seconds is reasonable. Cold-starting models may take 30–60 seconds on the first prediction.
3. **Image output**: `output` will be an array of URLs (e.g. `["https://replicate.delivery/..."]`). Download with `curl -L`.
4. **Text output**: `output` is an array of token strings. Join them into plain text: `| jq -r '.output | join("")'`.
5. **Popular models**:
   - Image: `black-forest-labs/flux-schnell`, `stability-ai/sdxl`
   - Text: `meta/meta-llama-3-70b-instruct`
6. **Version vs. latest**: Use `/v1/models/<owner>/<name>/predictions` to always run the latest version. Use `/v1/predictions` with a `version` ID to pin a specific version.
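
The image-output guideline above can be sketched as a download helper. It assumes `output` is an array of URLs, as described in guideline 3. Some models may return a different shape, so treat this as illustrative. `output_filename` and `download_outputs` are hypothetical names.

```bash
# Hypothetical helper: derive a local filename from an output URL by taking
# the last path segment and dropping any query string.
output_filename() {
  local url="$1" name
  name="${url##*/}"
  printf '%s\n' "${name%%\?*}"
}

# Download every URL in a succeeded prediction's `output` array.
# Expects the prediction JSON on stdin; assumes `output` is an array of URLs.
download_outputs() {
  local dest_dir="${1:-.}" url
  jq -r '.output[]' | while IFS= read -r url; do
    # -L follows redirects, as recommended above.
    curl -sL "$url" -o "$dest_dir/$(output_filename "$url")"
  done
}

# Usage:
#   curl -s "https://api.replicate.com/v1/predictions/<prediction-id>" \
#     --header "Authorization: Bearer $REPLICATE_TOKEN" | download_outputs /tmp
```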
