condition-normalization

Name: condition-normalization
Author: yogsoth-ai/de-anthropocentric-research-engine

$npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/condition-normalization

Standardizes experimental conditions for fair comparison across research papers

Solves the problem of inconsistent evaluation settings in academic papers
Uses structured extraction and comparison of training, hardware, and hyperparameter data
Builds difference matrices and normalization rules based on variance and impact analysis
Delivers condition-normalized scores and structured comparison reports

SKILL.md

.github/skills/condition-normalizationView on GitHub ↗

---
name: condition-normalization
description: Compare and standardize experimental conditions across papers
execution: tactic
used-by: baseline-establishment
---

# Condition Normalization


## Purpose

Build a structured understanding of how experimental conditions vary across papers, then define normalization schemes that enable fair method-to-method comparison. Addresses the fundamental problem that papers evaluate under different settings, making raw score comparison misleading.

## Stages

### Stage 1: Condition Extraction

For each method-paper pair, extract all evaluation conditions:
- Training data: size, version, preprocessing, augmentation
- Hardware: GPU type, memory, training time
- Hyperparameters: learning rate schedule, batch size, epochs
- Evaluation protocol: data split version, ensembling, post-processing
- Random seeds: number of runs, seed selection, variance reported

**Yield**: Condition vectors per method-paper pair.

### Stage 2: Difference Matrix

Build a matrix showing which conditions differ across methods:
- Identify dimensions with high variance across papers
- Identify dimensions that are controlled (same across all)
- Quantify the impact of each dimension on reported scores (if literature exists)

**Yield**: Condition difference matrix with impact annotations.

### Stage 3: Normalization Scheme

Define rules for adjusting scores to common conditions:
- Compute-normalized comparison (score per FLOP)
- Data-normalized comparison (score per training example)
- Time-normalized comparison (score per GPU-hour)
- Identify which normalizations are valid vs. speculative

**Yield**: Normalization rule set with validity bounds.

### Stage 4: Fair Comparison Baseline

Apply normalization to produce fair comparison subsets:
- Controlled comparison: methods evaluated under identical conditions
- Adjusted comparison: methods with score adjustments applied
- Pareto frontier: compute-vs-performance optimal set

**Yield**: Fair comparison tables with methodology notes.

## Minimum Yield

| Metric | Floor |
|--------|-------|
| Condition dimensions cataloged | 5 |
| Methods with full condition vectors | 10 |
| Normalization rules defined | 3 |
| Fair comparison sets produced | 2 |

## SOPs Used

- condition-cataloging (for Stage 1)
- compute-normalization (for Stage 3)
- performance-table-assembly (for Stage 4)

More from yogsoth-ai/de-anthropocentric-research-engine

Skill	Description
abductive-hypothesis-generation	Strategy: 面对异常的最佳解释推理
ablation-brainstorm	Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
ablation-component-mapping	Map system architecture to ablatable units for ablation studies
ablation-design	Design ablation studies to isolate component contributions in ML systems
ablation-execution	Remove components one by one from a system, record the response/impact of each removal.
abp-vulnerability-classification	Classify assumptions on 2 axes — load-bearing (how much conclusion depends on it) × vulnerable (how likely to be false). Focuses attention on High-Load × High-Vulnerable quadrant.
abstraction-extraction	Extract abstract principles from concrete domain cases. Strips domain-specific details to reveal transferable mechanisms.
abstraction-ladder	Perform bisociation at multiple abstraction levels
abstraction-laddering	Move between concrete and abstract framings — 3 levels up (Why?) and 3 levels down (How?) to find the most productive research level.
abstraction-to-design	Abstract biological principle to design principle. Bridge from biology to engineering.