compress

Name: compress
Author: cwinvestments/memstack

$ npx mdskill add cwinvestments/memstack/compress

Monitors and manages the Headroom proxy for Claude Code context compression

Checks proxy status and reports compression stats on demand
Relies on Headroom proxy integration with Claude Code and Anthropic API
Activates based on user queries about token usage or context window limits
Delivers actionable insights to optimize session performance and cost

SKILL.md

.github/skills/compress

---
name: compress
description: "Use when the user says 'headroom', 'compression', 'token savings', 'proxy status', or asks about context window usage."
version: 1.0.0
---


# ⚙️ Compress — Headroom Proxy Manager
*Monitor and manage Headroom context compression for Claude Code (CC) sessions.*

## Activation

When this skill activates, output:

`⚙️ Compress — Checking Headroom status...`

Then execute the protocol below.

- **Keywords:** headroom, compression stats, token savings, proxy status, check headroom
- **Contextual:** When user asks about token usage, context window limits, or session cost optimization
- **Level:** 2 (explicit trigger only)

## Context Guard

| Context | Status |
|---------|--------|
| **User says "headroom", "compression stats", "check proxy"** | ACTIVE — run status check |
| **User asks about token savings or context window** | ACTIVE — run session report |
| **Proxy errors or API connection failures appear** | ACTIVE — run health diagnostics |
| **General discussion about CC features** | DORMANT — do not activate |
| **User is actively coding (no proxy issues)** | DORMANT — do not activate |

## What It Does

Headroom is a transparent proxy between Claude Code and the Anthropic API that compresses tool outputs by removing redundant boilerplate. It extends the effective context window by 30–40%.

This skill checks proxy health, reports compression stats, and troubleshoots connection issues.

## Prerequisites

- **Headroom installed:** `pip install "headroom-ai[code]"` (the quotes stop shells such as zsh from treating the brackets as a glob pattern)
  - The `[code]` extra installs tree-sitter for AST-based code compression. Without it, Code-Aware compression is disabled and CC sessions get 0% compression.
- **Proxy running:** `headroom proxy --llmlingua-device cpu` (defaults to `localhost:8787`)
- **CC configured:** `ANTHROPIC_BASE_URL=http://127.0.0.1:8787`

### Recommended Startup

```bash
headroom proxy --llmlingua-device cpu
```

- `--llmlingua-device cpu` — Forces LLMLingua to use CPU (avoids silent CUDA failures on machines without GPU)
- The default port is 8787; no other flags are needed
- Code-Aware and LLMLingua both load lazily when relevant content is detected

## Workflow

### 1. Status Check

Run:
```bash
curl -s http://127.0.0.1:8787/stats | python -m json.tool
```

Report: proxy up/down, requests processed, compression ratio, tokens saved, estimated cost savings.
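
Where `curl` is not available, the same check can be sketched in Python using only the standard library. This is a minimal sketch, not part of Headroom: `parse_stats` and `fetch_stats` are hypothetical helper names, and the code assumes only that `/stats` returns a JSON object. It makes no assumption about the field names inside that object and simply prints whatever the proxy returns.

```python
import json
import urllib.error
import urllib.request

STATS_URL = "http://127.0.0.1:8787/stats"  # endpoint used in the curl command above


def parse_stats(body: bytes):
    """Decode a /stats response body; return None if it is not a JSON object."""
    try:
        data = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None
    return data if isinstance(data, dict) else None


def fetch_stats(url: str = STATS_URL, timeout: float = 2.0):
    """Return the parsed stats dict, or None when the proxy is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_stats(resp.read())
    except (urllib.error.URLError, OSError):
        return None


if __name__ == "__main__":
    stats = fetch_stats()
    if stats is None:
        print("Proxy: down or unreachable on :8787")
    else:
        print("Proxy: running on :8787")
        print(json.dumps(stats, indent=2))
```

A `None` result maps to the "proxy down" branch of the report; any dictionary means the proxy answered and its stats can be summarized.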

### 2. Health Diagnostics

If the proxy is unreachable:

1. Check if process is running:
   ```bash
   # Windows
   tasklist | findstr headroom
   # Linux/macOS
   ps aux | grep headroom
   ```
2. Check port binding:
   ```bash
   netstat -ano | findstr 8787   # Windows
   netstat -an | grep 8787       # Linux/macOS
   ```
3. Verify `ANTHROPIC_BASE_URL` is set:
   ```bash
   echo $ANTHROPIC_BASE_URL      # Linux/macOS, Git Bash
   echo %ANTHROPIC_BASE_URL%     # Windows cmd
   ```
4. Restart: `headroom proxy --llmlingua-device cpu` in a separate terminal
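
The port and environment checks above differ by platform, so a cross-platform version may be useful. The following is a rough sketch under stated assumptions: `port_open` and `base_url_points_at_proxy` are hypothetical helper names, and treating both `127.0.0.1` and `localhost` as valid proxy hosts is an assumption made here, not something Headroom specifies.

```python
import os
import socket
from urllib.parse import urlsplit


def port_open(host: str = "127.0.0.1", port: int = 8787, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def base_url_points_at_proxy(env=None, port: int = 8787) -> bool:
    """Return True if ANTHROPIC_BASE_URL targets the local proxy port."""
    env = os.environ if env is None else env
    parts = urlsplit(env.get("ANTHROPIC_BASE_URL", ""))
    try:
        # Assumption: either loopback spelling counts as "the proxy".
        return parts.hostname in {"127.0.0.1", "localhost"} and parts.port == port
    except ValueError:  # raised by .port when the port is not a valid number
        return False


if __name__ == "__main__":
    print("Port 8787 open:", port_open())
    print("ANTHROPIC_BASE_URL points at proxy:", base_url_points_at_proxy())
```

An open port only shows that some process is listening on 8787; confirming that it is Headroom still requires the process check in step 1 or a successful `/stats` request.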

### 3. Session Report

When triggered at session end or on request, report:

- Requests this session
- Tokens before/after compression
- Compression ratio (target: 30–40%)
- Estimated dollar savings (at $15/MTok input, $75/MTok output for Opus)
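
The arithmetic behind these figures can be sketched as follows. This is an illustration, not Headroom's own calculation: `session_report` and `SessionReport` are hypothetical names, the token counts are made-up inputs, and the sketch assumes that all saved tokens are input tokens priced at the $15/MTok Opus input rate quoted above.

```python
from dataclasses import dataclass

OPUS_INPUT_USD_PER_MTOK = 15.0  # input price quoted in the list above


@dataclass
class SessionReport:
    tokens_saved: int
    reduction_pct: float
    savings_usd: float


def session_report(tokens_before: int, tokens_after: int,
                   usd_per_mtok: float = OPUS_INPUT_USD_PER_MTOK) -> SessionReport:
    """Compute savings from token counts measured before and after compression."""
    saved = tokens_before - tokens_after
    # Guard against an empty session to avoid dividing by zero.
    reduction = 100.0 * saved / tokens_before if tokens_before else 0.0
    # Assumption: every saved token is an input token at the given rate.
    return SessionReport(saved, reduction, saved * usd_per_mtok / 1_000_000)


# Illustrative inputs only; real values come from the proxy's stats.
report = session_report(tokens_before=40_000, tokens_after=21_500)
print(f"{report.tokens_saved:,} tokens saved, "
      f"{report.reduction_pct:.2f}% reduction, "
      f"${report.savings_usd:.4f} saved")
```

With these illustrative inputs, 18,500 tokens are saved, which is a 46.25% reduction and $0.2775 at the assumed rate.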

### 4. Configuration Reference

| Setting | Value | Notes |
|---------|-------|-------|
| Proxy URL | `http://127.0.0.1:8787` | Default port |
| Dashboard | AdminStack Infrastructure tab | Headroom monitoring panel |
| Repo | `github.com/chopratejas/headroom` | Apache 2.0 |
| Python | 3.14 compatible | Tested Feb 2026 |

## Troubleshooting

| Symptom | Fix |
|---------|-----|
| **0% compression / 0.00x ratio** | `headroom-ai[code]` is not installed. Run: `pip install headroom-ai[code]`. Restart proxy. |
| **"Code-Aware: NOT INSTALLED" in startup banner** | Same fix — install the `[code]` extra and restart. |
| **Cost figures don't match Anthropic Console** | Headroom estimates costs at list token prices without accounting for Anthropic's server-side prompt caching discounts. For actual costs, check console.anthropic.com. |

## Output Format

```
⚙️ Headroom Status
├── Proxy: ✅ Running on :8787
├── Requests: 47 processed
├── Compression: 46.2% reduction
├── Tokens saved: ~18,500 tokens
└── Cost savings: ~$0.28 this session
```
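
If the report is generated programmatically, the tree above could be produced by a small formatter. This is a sketch only: `render_status` is a hypothetical helper, and its parameters are assumed to be values the caller has already extracted from the proxy's stats.

```python
def render_status(port: int, requests: int, reduction_pct: float,
                  tokens_saved: int, savings_usd: float) -> str:
    """Format already-extracted proxy stats as the status tree shown above."""
    lines = [
        "⚙️ Headroom Status",
        f"├── Proxy: ✅ Running on :{port}",
        f"├── Requests: {requests} processed",
        f"├── Compression: {reduction_pct:.1f}% reduction",
        f"├── Tokens saved: ~{tokens_saved:,} tokens",
        f"└── Cost savings: ~${savings_usd:.2f} this session",
    ]
    return "\n".join(lines)


# Values taken from the example output above.
print(render_status(port=8787, requests=47, reduction_pct=46.2,
                    tokens_saved=18_500, savings_usd=0.28))
```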

## Integration

- **AdminStack:** Infrastructure page has Headroom tab with live dashboard
- **CC Sessions:** Auto-routed when `ANTHROPIC_BASE_URL` is set
- **Monitoring:** Stats endpoint polled every 30s with visibility-aware polling

## Level History

- **Lv.1** — Base: Health check and stats reporting for Headroom proxy. (Origin: MemStack v3.0, Feb 2026)
- **Lv.2** — Fixed: Added `[code]` extra for tree-sitter AST compression, updated startup flags (`--llmlingua-device cpu`), added troubleshooting. Compression 0% → 46%. (Feb 24, 2026)
