product-metrics

Name: product-metrics
Author: vm0-ai/vm0-skills

$npx mdskill add vm0-ai/vm0-skills/product-metrics

Define, instrument, and analyze product metrics across key lifecycle stages.

Helps set OKRs, design dashboards, and diagnose metric movements.
Depends on data pipelines and analytics platforms for real-time insights.
Selects KPIs using value-reflective, forward-looking, and influenceable criteria.
Delivers structured frameworks linking North Star metrics to L1 indicators.

SKILL.md

.github/skills/product-metricsView on GitHub ↗

---
name: product-metrics
description: Define, instrument, and analyze product metrics across acquisition, activation, engagement, retention, and monetization. Activate when setting OKRs, designing a metrics dashboard, running a weekly or monthly metrics review, diagnosing metric movements, choosing KPIs for a product area, building a metrics framework, or evaluating product health.
---

## Metrics Architecture

Structure product measurement into three layers, each serving a distinct purpose.

### North Star Metric

A single indicator that captures the fundamental value the product delivers. Selection criteria:

- **Value-reflective**: Increases when users extract more benefit from the product
- **Forward-looking**: Reliably predicts sustained business outcomes like revenue and retention
- **Influenceable**: The product team's work can demonstrably move it
- **Broadly understood**: Anyone in the organization can grasp its meaning and significance

Illustrative North Star choices by product category:
- Team collaboration platform: Weekly active teams where three or more members contribute
- Two-sided marketplace: Weekly completed transactions
- Enterprise SaaS: Weekly active users who execute the core workflow
- Media or content product: Weekly minutes of engaged consumption
- Developer tooling: Weekly production deployments facilitated by the tool

### L1 Indicators (Product Health)

Five to seven metrics that collectively represent the full user lifecycle. Organized by lifecycle phase:

**Acquisition** -- Are new users discovering the product?
- Volume of new registrations or trial initiations and their trajectory
- Visitor-to-registration conversion rate
- Distribution across acquisition channels
- Per-channel acquisition cost (for paid efforts)

**Activation** -- Are newcomers reaching the value threshold?
- Activation rate: fraction of new users who perform the action most predictive of retention
- Time-to-activation: elapsed duration from registration to activation
- Onboarding completion rate: fraction who finish the guided setup sequence
- First value moment: the point at which users first experience the product's core promise

**Engagement** -- Are active users deriving ongoing value?
- Active user counts at daily, weekly, and monthly granularity (DAU, WAU, MAU)
- Stickiness ratio (DAU divided by MAU): how habitual the product is
- Core action frequency: how often users perform the most meaningful operation
- Depth per session: volume of activity within a single visit
- Feature penetration: share of users who adopt specific capabilities

**Retention** -- Are users returning over time?
- Cohort retention at standard intervals: day 1, day 7, day 30, day 90
- Retention curves by signup cohort showing decay and stabilization
- Churn rate: fraction of users or revenue lost per period
- Reactivation rate: fraction of previously lapsed users who return

**Monetization** -- Is user value converting to revenue?
- Free-to-paid conversion rate (for freemium models)
- Monthly and annual recurring revenue (MRR / ARR)
- Average revenue per user or account (ARPU / ARPA)
- Expansion revenue: growth generated by existing customers
- Net revenue retention: combined effect of expansion, contraction, and churn

**Satisfaction** -- How do users perceive the experience?
- Net Promoter Score (NPS)
- Customer Satisfaction Score (CSAT)
- Support ticket volume and mean resolution time
- App store ratings and review sentiment analysis

### L2 Indicators (Diagnostic Detail)

Granular metrics used to investigate why L1 indicators move:

- Step-by-step funnel conversion rates
- Per-feature usage and adoption measurements
- Segment-level breakdowns: by plan tier, company size, geography, user role
- Technical performance: page load latency, error rates, API response times
- Content or feature-level engagement analysis: which surfaces drive the most activity

## Key Metric Deep Dives

### Active Users (DAU / WAU / MAU)

**Definition**: Unique users who perform a qualifying action within a day, week, or month.

**Critical design choices**:
- Define "active" precisely. Logging in, loading a page, and executing a core action tell fundamentally different stories.
- Match the timeframe to natural usage cadence. DAU for daily-use products (chat, email). WAU for weekly-use products (project tracking). MAU for episodic products (tax filing, travel booking).

**Interpretation guidance**:
- Stickiness (DAU/MAU) above 0.5 signals daily-habit status. Below 0.2 suggests sporadic engagement.
- Trajectory matters more than absolute level. Watch for growth, plateau, or decline.
- Segment by user archetype. Power users and occasional visitors exhibit vastly different patterns.

### Retention

**Definition**: Of users who arrived in cohort X, what percentage remain active in period Y?

**Standard measurement windows**:
- Day 1: Was the initial experience compelling enough to prompt a return?
- Day 7: Has the user begun forming a usage habit?
- Day 30: Is the user retained at a meaningful horizon?
- Day 90: Has the user become durably embedded?

**Analytical approaches**:
- Chart retention curves by cohort. Steep initial falloff signals an activation gap. Steady ongoing decline points to an engagement deficit. A flattening curve indicates a healthy stable base.
- Compare cohorts chronologically. Improving retention in newer cohorts confirms product improvements are landing.
- Segment by onboarding completion or feature adoption to isolate what behaviors predict lasting retention.

### Funnel Conversion

**Definition**: The percentage of users who advance from one lifecycle stage to the next.

**Typical funnels to instrument**:
- Visitor to registration
- Registration to activation (first value moment)
- Free user to paying customer
- Trial to subscription
- Monthly plan to annual plan

**Analytical approaches**:
- Map the entire funnel and measure conversion at every transition
- Locate the steepest drop-offs -- these represent the highest-leverage optimization targets
- Segment conversion by traffic source, plan type, and user profile; different populations convert at very different rates
- Monitor conversion trends over time to gauge whether iterative improvements are working

### Activation Rate

**Definition**: The fraction of new users who reach the experience where they first realize the product's core value.

**Identifying the activation event**:
- Compare behavioral data for retained users versus churned users. What actions distinguish the two groups?
- The activation event should strongly predict long-term retention
- It should be reachable within the first session or first few days
- Examples: created a first project, invited a collaborator, completed the primary workflow, connected an external integration

**Operational use**:
- Track activation rate for every registration cohort
- Measure time-to-activation; shorter intervals almost always correlate with better outcomes
- Design onboarding sequences that steer users toward the activation moment
- When testing onboarding changes, evaluate impact on downstream retention, not just activation rate in isolation

## Goal-Setting Methodology

### OKR Framework (Objectives and Key Results)

**Objectives**: Qualitative, motivating statements of what the team aims to accomplish.
- Memorable and directionally inspiring
- Bounded to a time period (quarter or half)
- Focused on outcomes, not feature lists

**Key Results**: Quantitative evidence that the objective has been met.
- Specific, measurable, and time-bound
- Framed as outcomes rather than outputs
- Two to four Key Results per Objective

**Worked example**:
```
Objective: Become an essential part of our users' daily routine

Key Results:
- Raise DAU/MAU stickiness from 0.35 to 0.50
- Improve 30-day retention for new cohorts from 40% to 55%
- Achieve >80% task completion rate across three primary workflows
```

### OKR Operating Principles

- Aim for ambitious-but-plausible targets. Achieving roughly 70% of a stretch OKR signals proper calibration.
- Key Results measure user and business outcomes, not team output like features shipped or story points completed.
- Constrain scope: two to three Objectives with two to four Key Results each prevents dilution.
- If the team is confident of hitting every Key Result, ambition is too low.
- Conduct a mid-period checkpoint. Reallocate effort toward off-track Key Results if warranted.
- Score honestly at period's end: 0.0-0.3 = missed, 0.4-0.6 = partial progress, 0.7-1.0 = delivered.

### Calibrating Metric Targets

- **Baseline**: Establish the current value with reliable measurement before committing to a target.
- **External benchmarks**: Reference what comparable products or industry reports indicate is achievable.
- **Existing trajectory**: If the metric already trends upward at 5% monthly, targeting 6% is not ambitious.
- **Planned investment**: Larger bets justify bolder targets.
- **Confidence bands**: Set a "commit" level (high confidence) and a "stretch" level (aspirational).

## Review Cadences

### Weekly Health Check

**Objective**: Detect anomalies early, monitor active experiments, maintain situational awareness.
**Duration**: 15-30 minutes.
**Participants**: Product manager, optionally the engineering lead.

**Agenda**:
- North Star metric: current value and week-over-week delta
- L1 indicators: flag any notable movements
- Active experiments: interim results and statistical power
- Anomaly scan: unexpected spikes, drops, or pattern breaks
- Triggered alerts: anything that crossed a monitoring threshold

**Outcome**: If something is off, open an investigation. Otherwise, log observations and proceed.

### Monthly Deep Dive

**Objective**: Assess trends in context, measure progress toward quarterly targets, identify strategic implications.
**Duration**: 30-60 minutes.
**Participants**: Product team and key stakeholders.

**Agenda**:
- Full L1 scorecard with month-over-month trends
- OKR progress: are Key Results on trajectory?
- Cohort health: are more recent cohorts outperforming earlier ones?
- Launch performance: how are recently shipped features tracking?
- Segment divergence: are any user segments behaving differently than expected?

**Outcome**: Identify one to three areas warranting deeper investigation or adjusted investment. Update priorities if metrics surface new insights.

### Quarterly Strategic Review

**Objective**: Evaluate the quarter holistically, set direction for the next period.
**Duration**: 60-90 minutes.
**Participants**: Product, engineering, design, and leadership.

**Agenda**:
- OKR final scoring for the quarter
- L1 trend analysis spanning the full quarter
- Year-over-year comparisons for context
- Competitive and market backdrop: relevant shifts and competitor moves
- Retrospective: what delivered expected results and what did not

**Outcome**: Set OKRs for the upcoming quarter. Recalibrate product strategy based on accumulated evidence.

## Dashboard Design

### Guiding Principles

A well-constructed dashboard answers "how is the product performing?" at a glance.

1. **Design from the decision backward**. Identify which decisions the dashboard informs before selecting metrics.

2. **Enforce visual hierarchy**. The highest-stakes metric gets the most prominent placement. North Star at the top, L1 indicators below, L2 detail accessible through drill-down.

3. **Always provide context**. A raw number in isolation conveys nothing. Pair every metric with: prior-period comparison, target value, and trend direction.

4. **Favor signal density over metric count**. Five to ten carefully chosen indicators outperform fifty superficial ones. Relegate the rest to a supplementary report.

5. **Standardize time windows**. Display all metrics over the same period. Mixing daily and monthly granularity on one screen breeds confusion.

6. **Use color for instant status**:
- Green: on track or trending favorably
- Yellow: warrants attention or trending flat
- Red: off track or declining

7. **Every metric must be actionable**. If the team cannot influence a measurement, it does not earn a place on the product dashboard.

### Recommended Layout

**Row 1**: North Star metric with trend line and target overlay.

**Row 2**: L1 health scorecard -- current value, period change, target, and status indicator for each metric.

**Row 3**: Key funnels -- visual conversion funnel with drop-off rates at each stage.

**Row 4**: Experiment and launch tracker -- active tests with preliminary results, recent releases with early performance data.

**Drill-down layer**: L2 diagnostic metrics, segment breakdowns, and extended time-series charts for investigation.

### Dashboard Pitfalls

- **Vanity metrics**: Cumulative totals that only climb (all-time signups, lifetime page views) without indicating health
- **Metric overload**: Dashboards that require scrolling. If it does not fit on a single screen, trim the metric set.
- **Missing baselines**: Numbers shown without prior-period comparison or target reference
- **Abandoned dashboards**: Metrics that have not been reviewed or refreshed in months
- **Activity metrics masquerading as outcomes**: Measuring internal throughput (tickets closed, pull requests merged) instead of user and business results
- **One-size-fits-all views**: Executives, product managers, and engineers need different dashboards. A single view serves none of them well.

### Alerting Strategy

Configure automated alerts for metrics that demand prompt response:

- **Threshold alerts**: A metric breaches a predefined boundary (error rate exceeds 1%, conversion falls below 5%)
- **Trend alerts**: A metric shows sustained decline across multiple consecutive periods
- **Anomaly alerts**: A metric deviates significantly from its expected range

**Alert hygiene practices**:
- Every alert must have a corresponding action plan. If nothing can be done, remove the alert.
- Review and recalibrate alerts periodically. Excessive false positives train teams to ignore all signals.
- Assign a designated responder for each alert category.
- Differentiate severity tiers. Not every alert warrants an emergency response.

More from vm0-ai/vm0-skills

Skill	Description
account-reconciliation	Perform account reconciliations comparing general ledger balances against subledgers, bank statements, or external records. Use for bank reconciliation, GL-to-subledger reconciliation, intercompany reconciliation, balance sheet reconciliation, reconciling item analysis, outstanding item aging, or clearing open items.
agentphone	Build AI phone agents with AgentPhone API. Use when the user wants to make phone calls, send/receive SMS, manage phone numbers, create voice agents, set up webhooks, or check usage — anything related to telephony, phone numbers, or voice AI.
ahrefs	Ahrefs SEO API for backlink and keyword analysis. Use when user mentions
amplitude	Amplitude product analytics API. Use when user mentions "Amplitude",
analysis-qa	Quality-check a data analysis before sharing — verify joins, aggregations, denominators, time ranges, and metric definitions. Detect pitfalls like survivorship bias, average-of-averages, join explosion, timezone mismatches, incomplete periods, and selection bias. Includes documentation templates for reproducible analyses.
anthropic-managed-agents	Anthropic Managed Agents API for programmatically creating, running, and streaming AI agents on Anthropic's cloud infrastructure. Use when the user mentions "Managed Agents", "Anthropic agent sessions", or needs to create/run/stream an Anthropic agent with tool use (bash, git, web), attach GitHub repositories, or inject secrets via Vault. Do NOT use for standard Claude Messages API — use the Claude API skill instead.
apify	Apify web scraping platform. Use when user mentions "scrape website",
asana	Asana API for tasks and projects. Use when user mentions "Asana", "asana.com",
atlassian	Atlassian API for Confluence and Jira. Use when user mentions "Confluence
attio	Attio REST API for AI-native CRM operations — manage companies, people, deals, and custom objects, plus notes, tasks, lists, and comments. Use when the user mentions "Attio", "CRM record", "create company", "add person", "list entry", "CRM note", or "CRM task".