discrepancy-identification

Name: discrepancy-identification
Author: yogsoth-ai/de-anthropocentric-research-engine

$ npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/discrepancy-identification

Detects statistically significant discrepancies between scores reported for the same method across different sources, and flags potential score inflation, implementation bugs, evaluation-protocol differences, and unreliable baselines.

SKILL.md

.github/skills/discrepancy-identification

---
name: discrepancy-identification
description: Compare same-method scores across sources, flag significant deviations
execution: subagent
prompt: ./prompt.md
input: score_pairs (source_a, source_b, method, dataset)
used-by: baseline-establishment
---

# Discrepancy Identification


## Purpose

Detects statistically significant discrepancies between scores reported for the same method across different sources, and flags potential score inflation, implementation bugs, evaluation-protocol differences, and unreliable baselines.

## Input Schema

| Field | Type | Description |
|-------|------|-------------|
| score_pairs | object[] | Array of objects, each with the fields `source_a`, `source_b`, `method`, `dataset`, `metric`, `score_a`, `score_b`, `conditions_a`, `conditions_b` |
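
The table names the fields of each `score_pairs` entry but not their types. As a rough sketch, one entry and a completeness check might look like the following. The helper `missing_fields`, the value types (strings for sources, floats for scores, dicts for conditions), and every value shown are illustrative assumptions, not part of the skill.

```python
# Field names are taken from the Input Schema table.
# Types and values below are assumptions for illustration only.
REQUIRED_FIELDS = {
    "source_a", "source_b", "method", "dataset", "metric",
    "score_a", "score_b", "conditions_a", "conditions_b",
}


def missing_fields(pair: dict) -> set:
    """Return the schema fields that are absent from one score_pairs entry."""
    return REQUIRED_FIELDS - set(pair)


# Hypothetical entry: every value here is a placeholder.
example_pair = {
    "source_a": "original-paper",
    "source_b": "third-party-reproduction",
    "method": "example-method",
    "dataset": "example-dataset",
    "metric": "accuracy",
    "score_a": 80.0,
    "score_b": 72.0,
    "conditions_a": {"seeds": 5},
    "conditions_b": {"seeds": 1},
}

print(missing_fields(example_pair))        # empty set: entry is complete
print(missing_fields({"method": "only"}))  # the other eight field names
```

Checking completeness before comparison keeps a malformed entry from silently producing a meaningless delta.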

## Output Schema

```json
{
  "comparisons": [
    {
      "method": "string",
      "dataset": "string",
      "metric": "string",
      "score_a": 0.0,
      "source_a": "string",
      "score_b": 0.0,
      "source_b": "string",
      "absolute_delta": 0.0,
      "relative_delta_pct": 0.0,
      "is_significant": true,
      "likely_cause": "string",
      "confidence": "high|medium|low"
    }
  ],
  "flagged_methods": [
    {
      "method": "string",
      "num_discrepancies": 0,
      "max_delta": 0.0,
      "reliability_assessment": "string"
    }
  ],
  "systematic_patterns": ["string"]
}
```
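
The numeric output fields can be derived directly from each input pair. The sketch below is one possible reading, under stated assumptions: `relative_delta_pct` is measured against `score_a`, and `is_significant` uses a fixed 5% relative cutoff as a stand-in, because the skill does not say which statistical test it applies. A threshold rule is not a statistical test. The function names `compare` and `flag_methods` are hypothetical, and `likely_cause`, `confidence`, and `reliability_assessment` are left out because they require judgment.

```python
from collections import defaultdict

# Assumed cutoff; the skill itself does not specify a threshold or a test.
REL_THRESHOLD_PCT = 5.0


def compare(pair: dict, threshold_pct: float = REL_THRESHOLD_PCT) -> dict:
    """Build the numeric part of one `comparisons` entry."""
    delta = abs(pair["score_a"] - pair["score_b"])
    base = abs(pair["score_a"])  # assumption: score_a is the reference
    if base:
        rel = delta * 100.0 / base
    else:
        # Avoid division by zero when the reference score is 0.
        rel = float("inf") if delta else 0.0
    return {
        "method": pair["method"],
        "dataset": pair["dataset"],
        "metric": pair["metric"],
        "score_a": pair["score_a"],
        "source_a": pair["source_a"],
        "score_b": pair["score_b"],
        "source_b": pair["source_b"],
        "absolute_delta": delta,
        "relative_delta_pct": rel,
        "is_significant": rel > threshold_pct,
    }


def flag_methods(comparisons: list) -> list:
    """Group significant comparisons by method into `flagged_methods` entries."""
    deltas = defaultdict(list)
    for c in comparisons:
        if c["is_significant"]:
            deltas[c["method"]].append(c["absolute_delta"])
    return [
        {"method": m, "num_discrepancies": len(d), "max_delta": max(d)}
        for m, d in deltas.items()
    ]


# Placeholder data, not real reported scores.
pairs = [
    {"source_a": "s1", "source_b": "s2", "method": "m1", "dataset": "d",
     "metric": "accuracy", "score_a": 80.0, "score_b": 72.0},
    {"source_a": "s1", "source_b": "s3", "method": "m2", "dataset": "d",
     "metric": "accuracy", "score_a": 50.0, "score_b": 49.0},
]
comparisons = [compare(p) for p in pairs]
flagged = flag_methods(comparisons)
print(flagged)
```

With these placeholder values, only `m1` exceeds the assumed cutoff and is flagged. If sources report variance or multiple seeds, a proper significance test could replace the threshold.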
