confidence-calibration

Name: confidence-calibration
Author: yogsoth-ai/de-anthropocentric-research-engine

$ npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/confidence-calibration

Calibrates confidence scores during debate progression to guide next steps

Solves the problem of determining when to escalate, continue, or terminate a debate
Depends on judge verdicts, confidence history, and remaining budget inputs
Analyzes cumulative evidence and trajectory to make objective decisions
Returns calibrated confidence, decision, reasoning, and saturation status

SKILL.md

.github/skills/confidence-calibration

---
name: confidence-calibration
description: Calibrates confidence scores based on debate progression. Determines whether to escalate, continue, or terminate based on cumulative evidence.
execution: subagent
prompt: ./prompt.md
input: round_verdicts (string), confidence_history (string), budget_remaining (string)
used-by: [multiagent-debate]
---

# Confidence Calibration

Calibrates confidence in an artifact's viability as a debate progresses, then recommends whether to escalate, continue, or terminate.

## Execution

Subagent — spawned via subagent-spawning/spawn-agent.

## Why Subagent

Calibration requires meta-analysis of debate trajectory without being anchored to any single round's outcome. Isolated context enables objective trend assessment.
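
To illustrate why the whole trajectory matters more than any single round, here is a minimal sketch that compares the latest round-to-round change with a least-squares slope over the full confidence history. The helper `trend_slope` and the sample scores are hypothetical; the skill itself delegates this judgement to a subagent and does not prescribe a formula.

```python
def trend_slope(history: list[float]) -> float:
    """Least-squares slope of confidence versus round index (0, 1, 2, ...)."""
    n = len(history)
    if n < 2:
        return 0.0  # a single round has no trend
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Illustrative scores only: the last round dips, but the overall trend still rises.
history = [0.40, 0.55, 0.62, 0.50]
last_step = history[-1] - history[-2]  # negative: judged alone, the last round looks like a setback
overall = trend_slope(history)         # positive: across all rounds, confidence is still climbing
```

An assessment anchored to the final round would read this debate as deteriorating, while the slope over all four rounds points the other way.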

## Input

- **round_verdicts**: All judge verdicts so far
- **confidence_history**: Confidence scores from each round
- **budget_remaining**: Rounds/searches remaining in budget
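
The frontmatter declares all three inputs as strings but does not say how they are encoded. The sketch below assumes JSON arrays for `round_verdicts` and `confidence_history` and a plain integer for `budget_remaining`. It also assumes one confidence score per verdict and scores in the 0.0–1.0 range. The names `CalibrationInput` and `parse_input` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class CalibrationInput:
    round_verdicts: list[str]        # judge verdicts, oldest round first
    confidence_history: list[float]  # one confidence score per round
    budget_remaining: int            # rounds/searches left in the budget


def parse_input(round_verdicts: str, confidence_history: str,
                budget_remaining: str) -> CalibrationInput:
    """Decode the three string inputs (JSON encoding is an assumption of this sketch)."""
    verdicts = json.loads(round_verdicts)
    history = [float(c) for c in json.loads(confidence_history)]
    budget = int(budget_remaining)
    if len(verdicts) != len(history):
        raise ValueError("expected one confidence score per verdict")
    if any(not 0.0 <= c <= 1.0 for c in history):
        raise ValueError("confidence scores must lie in [0.0, 1.0]")
    if budget < 0:
        raise ValueError("budget_remaining must be non-negative")
    return CalibrationInput(verdicts, history, budget)


parsed = parse_input('["pro", "con"]', '[0.4, 0.55]', '3')
```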

## Output

- **calibrated_confidence**: Updated confidence in artifact viability (0.0–1.0)
- **decision**: escalate / continue / terminate
- **reasoning**: Why this decision given the trajectory
- **saturation_flag**: Whether debate is producing diminishing returns
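
A caller may want to check the subagent's response against the four fields above before acting on it. The following is a minimal sketch of such a check. The class name `CalibrationOutput`, the non-empty `reasoning` rule, and the sample values are assumptions for illustration, not part of the skill's definition.

```python
from dataclasses import dataclass

DECISIONS = {"escalate", "continue", "terminate"}


@dataclass(frozen=True)
class CalibrationOutput:
    calibrated_confidence: float  # confidence in artifact viability, 0.0-1.0
    decision: str                 # one of DECISIONS
    reasoning: str                # why this decision fits the trajectory
    saturation_flag: bool         # True when the debate shows diminishing returns

    def __post_init__(self) -> None:
        if not 0.0 <= self.calibrated_confidence <= 1.0:
            raise ValueError("calibrated_confidence must be within 0.0-1.0")
        if self.decision not in DECISIONS:
            raise ValueError(f"decision must be one of {sorted(DECISIONS)}")
        if not self.reasoning.strip():
            raise ValueError("reasoning must not be empty")


# Illustrative values only.
result = CalibrationOutput(
    calibrated_confidence=0.58,
    decision="continue",
    reasoning="Confidence is rising and budget remains for further rounds.",
    saturation_flag=False,
)
```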

## Budget

One unit = one calibration assessment per round.
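
The accounting rule can be sketched as a small counter that spends one unit per calibrated round and refuses a second assessment of the same round. `CalibrationBudget` is a hypothetical helper; how the engine actually enforces the budget is not described here.

```python
class CalibrationBudget:
    """Tracks calibration units; one unit is spent per debate round (hypothetical helper)."""

    def __init__(self, units: int) -> None:
        if units < 0:
            raise ValueError("units must be non-negative")
        self.units = units
        self.calibrated_rounds: set[int] = set()

    def charge(self, round_index: int) -> int:
        """Spend one unit for this round and return the units left."""
        if round_index in self.calibrated_rounds:
            raise RuntimeError(f"round {round_index} was already calibrated")
        if self.units == 0:
            raise RuntimeError("calibration budget exhausted")
        self.units -= 1
        self.calibrated_rounds.add(round_index)
        return self.units


budget = CalibrationBudget(units=2)
left_after_first = budget.charge(round_index=1)
left_after_second = budget.charge(round_index=2)
```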

More from yogsoth-ai/de-anthropocentric-research-engine

Skill	Description
abductive-hypothesis-generation	Strategy: inference to the best explanation when facing anomalies
ablation-brainstorm	Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
ablation-component-mapping	Map system architecture to ablatable units for ablation studies
ablation-design	Design ablation studies to isolate component contributions in ML systems
ablation-execution	Remove components one by one from a system, record the response/impact of each removal.
abp-vulnerability-classification	Classify assumptions on 2 axes — load-bearing (how much conclusion depends on it) × vulnerable (how likely to be false). Focuses attention on High-Load × High-Vulnerable quadrant.
abstraction-extraction	Extract abstract principles from concrete domain cases. Strips domain-specific details to reveal transferable mechanisms.
abstraction-ladder	Perform bisociation at multiple abstraction levels
abstraction-laddering	Move between concrete and abstract framings — 3 levels up (Why?) and 3 levels down (How?) to find the most productive research level.
abstraction-to-design	Abstract biological principle to design principle. Bridge from biology to engineering.