product-metrics
$ npx mdskill add vm0-ai/vm0-skills/product-metrics

Define, instrument, and analyze product metrics across key lifecycle stages.
- Helps set OKRs, design dashboards, and diagnose metric movements.
- Depends on data pipelines and analytics platforms for real-time insights.
- Selects KPIs using value-reflective, forward-looking, and influenceable criteria.
- Delivers structured frameworks linking North Star metrics to L1 indicators.
SKILL.md
.github/skills/product-metrics
---
name: product-metrics
description: Define, instrument, and analyze product metrics across acquisition, activation, engagement, retention, and monetization. Activate when setting OKRs, designing a metrics dashboard, running a weekly or monthly metrics review, diagnosing metric movements, choosing KPIs for a product area, building a metrics framework, or evaluating product health.
---

## Metrics Architecture

Structure product measurement into three layers, each serving a distinct purpose.

### North Star Metric

A single indicator that captures the fundamental value the product delivers.

Selection criteria:

- **Value-reflective**: Increases when users extract more benefit from the product
- **Forward-looking**: Reliably predicts sustained business outcomes like revenue and retention
- **Influenceable**: The product team's work can demonstrably move it
- **Broadly understood**: Anyone in the organization can grasp its meaning and significance

Illustrative North Star choices by product category:

- Team collaboration platform: Weekly active teams where three or more members contribute
- Two-sided marketplace: Weekly completed transactions
- Enterprise SaaS: Weekly active users who execute the core workflow
- Media or content product: Weekly minutes of engaged consumption
- Developer tooling: Weekly production deployments facilitated by the tool

### L1 Indicators (Product Health)

Five to seven metrics that collectively represent the full user lifecycle. Organized by lifecycle phase:

**Acquisition** -- Are new users discovering the product?

- Volume of new registrations or trial initiations and their trajectory
- Visitor-to-registration conversion rate
- Distribution across acquisition channels
- Per-channel acquisition cost (for paid efforts)

**Activation** -- Are newcomers reaching the value threshold?
- Activation rate: fraction of new users who perform the action most predictive of retention
- Time-to-activation: elapsed duration from registration to activation
- Onboarding completion rate: fraction who finish the guided setup sequence
- First value moment: the point at which users first experience the product's core promise

**Engagement** -- Are active users deriving ongoing value?

- Active user counts at daily, weekly, and monthly granularity (DAU, WAU, MAU)
- Stickiness ratio (DAU divided by MAU): how habitual the product is
- Core action frequency: how often users perform the most meaningful operation
- Depth per session: volume of activity within a single visit
- Feature penetration: share of users who adopt specific capabilities

**Retention** -- Are users returning over time?

- Cohort retention at standard intervals: day 1, day 7, day 30, day 90
- Retention curves by signup cohort showing decay and stabilization
- Churn rate: fraction of users or revenue lost per period
- Reactivation rate: fraction of previously lapsed users who return

**Monetization** -- Is user value converting to revenue?

- Free-to-paid conversion rate (for freemium models)
- Monthly and annual recurring revenue (MRR / ARR)
- Average revenue per user or account (ARPU / ARPA)
- Expansion revenue: growth generated by existing customers
- Net revenue retention: combined effect of expansion, contraction, and churn

**Satisfaction** -- How do users perceive the experience?
- Net Promoter Score (NPS)
- Customer Satisfaction Score (CSAT)
- Support ticket volume and mean resolution time
- App store ratings and review sentiment analysis

### L2 Indicators (Diagnostic Detail)

Granular metrics used to investigate why L1 indicators move:

- Step-by-step funnel conversion rates
- Per-feature usage and adoption measurements
- Segment-level breakdowns: by plan tier, company size, geography, user role
- Technical performance: page load latency, error rates, API response times
- Content or feature-level engagement analysis: which surfaces drive the most activity

## Key Metric Deep Dives

### Active Users (DAU / WAU / MAU)

**Definition**: Unique users who perform a qualifying action within a day, week, or month.

**Critical design choices**:

- Define "active" precisely. Logging in, loading a page, and executing a core action tell fundamentally different stories.
- Match the timeframe to natural usage cadence. DAU for daily-use products (chat, email). WAU for weekly-use products (project tracking). MAU for episodic products (tax filing, travel booking).

**Interpretation guidance**:

- Stickiness (DAU/MAU) above 0.5 signals daily-habit status. Below 0.2 suggests sporadic engagement.
- Trajectory matters more than absolute level. Watch for growth, plateau, or decline.
- Segment by user archetype. Power users and occasional visitors exhibit vastly different patterns.

### Retention

**Definition**: Of users who arrived in cohort X, what percentage remain active in period Y?

**Standard measurement windows**:

- Day 1: Was the initial experience compelling enough to prompt a return?
- Day 7: Has the user begun forming a usage habit?
- Day 30: Is the user retained at a meaningful horizon?
- Day 90: Has the user become durably embedded?

**Analytical approaches**:

- Chart retention curves by cohort. Steep initial falloff signals an activation gap. Steady ongoing decline points to an engagement deficit. A flattening curve indicates a healthy stable base.
- Compare cohorts chronologically. Improving retention in newer cohorts confirms product improvements are landing.
- Segment by onboarding completion or feature adoption to isolate what behaviors predict lasting retention.

### Funnel Conversion

**Definition**: The percentage of users who advance from one lifecycle stage to the next.

**Typical funnels to instrument**:

- Visitor to registration
- Registration to activation (first value moment)
- Free user to paying customer
- Trial to subscription
- Monthly plan to annual plan

**Analytical approaches**:

- Map the entire funnel and measure conversion at every transition
- Locate the steepest drop-offs -- these represent the highest-leverage optimization targets
- Segment conversion by traffic source, plan type, and user profile; different populations convert at very different rates
- Monitor conversion trends over time to gauge whether iterative improvements are working

### Activation Rate

**Definition**: The fraction of new users who reach the experience where they first realize the product's core value.

**Identifying the activation event**:

- Compare behavioral data for retained users versus churned users. What actions distinguish the two groups?
- The activation event should strongly predict long-term retention
- It should be reachable within the first session or first few days
- Examples: created a first project, invited a collaborator, completed the primary workflow, connected an external integration

**Operational use**:

- Track activation rate for every registration cohort
- Measure time-to-activation; shorter intervals almost always correlate with better outcomes
- Design onboarding sequences that steer users toward the activation moment
- When testing onboarding changes, evaluate impact on downstream retention, not just activation rate in isolation

## Goal-Setting Methodology

### OKR Framework (Objectives and Key Results)

**Objectives**: Qualitative, motivating statements of what the team aims to accomplish.
- Memorable and directionally inspiring
- Bounded to a time period (quarter or half)
- Focused on outcomes, not feature lists

**Key Results**: Quantitative evidence that the objective has been met.

- Specific, measurable, and time-bound
- Framed as outcomes rather than outputs
- Two to four Key Results per Objective

**Worked example**:

```
Objective: Become an essential part of our users' daily routine

Key Results:
- Raise DAU/MAU stickiness from 0.35 to 0.50
- Improve 30-day retention for new cohorts from 40% to 55%
- Achieve >80% task completion rate across three primary workflows
```

### OKR Operating Principles

- Aim for ambitious-but-plausible targets. Achieving roughly 70% of a stretch OKR signals proper calibration.
- Key Results measure user and business outcomes, not team output like features shipped or story points completed.
- Constrain scope: two to three Objectives with two to four Key Results each prevents dilution.
- If the team is confident of hitting every Key Result, ambition is too low.
- Conduct a mid-period checkpoint. Reallocate effort toward off-track Key Results if warranted.
- Score honestly at period's end: 0.0-0.3 = missed, 0.4-0.6 = partial progress, 0.7-1.0 = delivered.

### Calibrating Metric Targets

- **Baseline**: Establish the current value with reliable measurement before committing to a target.
- **External benchmarks**: Reference what comparable products or industry reports indicate is achievable.
- **Existing trajectory**: If the metric already trends upward at 5% monthly, targeting 6% is not ambitious.
- **Planned investment**: Larger bets justify bolder targets.
- **Confidence bands**: Set a "commit" level (high confidence) and a "stretch" level (aspirational).

## Review Cadences

### Weekly Health Check

**Objective**: Detect anomalies early, monitor active experiments, maintain situational awareness.

**Duration**: 15-30 minutes.

**Participants**: Product manager, optionally the engineering lead.
**Agenda**:

- North Star metric: current value and week-over-week delta
- L1 indicators: flag any notable movements
- Active experiments: interim results and statistical power
- Anomaly scan: unexpected spikes, drops, or pattern breaks
- Triggered alerts: anything that crossed a monitoring threshold

**Outcome**: If something is off, open an investigation. Otherwise, log observations and proceed.

### Monthly Deep Dive

**Objective**: Assess trends in context, measure progress toward quarterly targets, identify strategic implications.

**Duration**: 30-60 minutes.

**Participants**: Product team and key stakeholders.

**Agenda**:

- Full L1 scorecard with month-over-month trends
- OKR progress: are Key Results on trajectory?
- Cohort health: are more recent cohorts outperforming earlier ones?
- Launch performance: how are recently shipped features tracking?
- Segment divergence: are any user segments behaving differently than expected?

**Outcome**: Identify one to three areas warranting deeper investigation or adjusted investment. Update priorities if metrics surface new insights.

### Quarterly Strategic Review

**Objective**: Evaluate the quarter holistically, set direction for the next period.

**Duration**: 60-90 minutes.

**Participants**: Product, engineering, design, and leadership.

**Agenda**:

- OKR final scoring for the quarter
- L1 trend analysis spanning the full quarter
- Year-over-year comparisons for context
- Competitive and market backdrop: relevant shifts and competitor moves
- Retrospective: what delivered expected results and what did not

**Outcome**: Set OKRs for the upcoming quarter. Recalibrate product strategy based on accumulated evidence.

## Dashboard Design

### Guiding Principles

A well-constructed dashboard answers "how is the product performing?" at a glance.

1. **Design from the decision backward**. Identify which decisions the dashboard informs before selecting metrics.
2. **Enforce visual hierarchy**.
The highest-stakes metric gets the most prominent placement. North Star at the top, L1 indicators below, L2 detail accessible through drill-down.
3. **Always provide context**. A raw number in isolation conveys nothing. Pair every metric with: prior-period comparison, target value, and trend direction.
4. **Favor signal density over metric count**. Five to ten carefully chosen indicators outperform fifty superficial ones. Relegate the rest to a supplementary report.
5. **Standardize time windows**. Display all metrics over the same period. Mixing daily and monthly granularity on one screen breeds confusion.
6. **Use color for instant status**:
   - Green: on track or trending favorably
   - Yellow: warrants attention or trending flat
   - Red: off track or declining
7. **Every metric must be actionable**. If the team cannot influence a measurement, it does not earn a place on the product dashboard.

### Recommended Layout

**Row 1**: North Star metric with trend line and target overlay.

**Row 2**: L1 health scorecard -- current value, period change, target, and status indicator for each metric.

**Row 3**: Key funnels -- visual conversion funnel with drop-off rates at each stage.

**Row 4**: Experiment and launch tracker -- active tests with preliminary results, recent releases with early performance data.

**Drill-down layer**: L2 diagnostic metrics, segment breakdowns, and extended time-series charts for investigation.

### Dashboard Pitfalls

- **Vanity metrics**: Cumulative totals that only climb (all-time signups, lifetime page views) without indicating health
- **Metric overload**: Dashboards that require scrolling. If it does not fit on a single screen, trim the metric set.
- **Missing baselines**: Numbers shown without prior-period comparison or target reference
- **Abandoned dashboards**: Metrics that have not been reviewed or refreshed in months
- **Activity metrics masquerading as outcomes**: Measuring internal throughput (tickets closed, pull requests merged) instead of user and business results
- **One-size-fits-all views**: Executives, product managers, and engineers need different dashboards. A single view serves none of them well.

### Alerting Strategy

Configure automated alerts for metrics that demand prompt response:

- **Threshold alerts**: A metric breaches a predefined boundary (error rate exceeds 1%, conversion falls below 5%)
- **Trend alerts**: A metric shows sustained decline across multiple consecutive periods
- **Anomaly alerts**: A metric deviates significantly from its expected range

**Alert hygiene practices**:

- Every alert must have a corresponding action plan. If nothing can be done, remove the alert.
- Review and recalibrate alerts periodically. Excessive false positives train teams to ignore all signals.
- Assign a designated responder for each alert category.
- Differentiate severity tiers. Not every alert warrants an emergency response.
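
The three alert types under Alerting Strategy could be sketched roughly as below. This is a minimal illustration, not a prescribed implementation: the function names, the three-period window for "sustained decline", and the z-score cutoff used to define "deviates significantly" are all assumptions chosen for the example. A real setup would tune them per metric.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def threshold_alert(value: float,
                    lower: Optional[float] = None,
                    upper: Optional[float] = None) -> bool:
    """Fire when the value falls below `lower` or rises above `upper`."""
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False


def trend_alert(series: Sequence[float], periods: int = 3) -> bool:
    """Fire when the metric declined in each of the last `periods` periods.

    The default of 3 is an illustrative assumption.
    """
    if len(series) < periods + 1:
        return False  # not enough history to judge a trend
    recent = series[-(periods + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


def anomaly_alert(history: Sequence[float], value: float,
                  z_cutoff: float = 3.0) -> bool:
    """Fire when `value` is more than `z_cutoff` sample standard deviations
    from the mean of `history`. The z-score rule is one simple assumption
    for what counts as "outside the expected range".
    """
    if len(history) < 2:
        return False  # stdev needs at least two points
    sigma = stdev(history)
    if sigma == 0:
        return value != mean(history)
    return abs(value - mean(history)) / sigma > z_cutoff
```

For instance, `threshold_alert(0.012, upper=0.01)` corresponds to the "error rate exceeds 1%" case, and `threshold_alert(0.04, lower=0.05)` to "conversion falls below 5%". Keeping each rule as a small pure function makes it easier to pair every alert with an owner and an action plan, and to recalibrate cutoffs when false positives accumulate.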