diffdock-molecular-docking

$ npx mdskill add aipoch/medical-research-skills/diffdock-molecular-docking

Generate ranked 3D ligand binding poses for drug discovery.

  • Predicts binding conformations when no known site exists.
  • Uses PyTorch diffusion models for generative sampling.
  • Ranks poses using a confidence scoring mechanism.
  • Delivers structured SDF files and score reports.

---
name: diffdock-molecular-docking
description: Diffusion-based molecular docking to predict 3D ligand–protein binding poses (blind docking) with confidence scoring; use when you need pose prediction for drug discovery or virtual screening.
license: MIT
author: aipoch
---
> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)

# DiffDock Molecular Docking

## When to Use

- **Blind docking** when you have a protein structure (PDB) and a ligand (SMILES) but no known binding site.
- **Pose prediction** to generate multiple plausible 3D binding conformations and rank them.
- **Virtual screening support** to quickly evaluate candidate ligands by predicted binding poses and confidence.
- **Drug discovery workflows** where you need automated docking outputs (SDF poses + scores) for downstream analysis.
- **Batch/advanced docking** when running many ligand–protein pairs or using alternative inputs (e.g., sequence-based workflows; see `references/workflows_examples.md`).

## Key Features

- **Diffusion generative sampling** to produce diverse ligand binding poses.
- **Confidence model scoring** to rank predicted poses.
- **Simple CLI inference** for single protein–ligand docking.
- **Batch/advanced workflows** documented in `references/workflows_examples.md`.
- **Structured outputs** including ranked SDF pose files and a confidence score report.

## Dependencies

No versions are specified for any of the following:

- Python
- PyTorch
- PyTorch Geometric (PyG)
- RDKit
- ESM

## Example Usage

### 1) Verify the Environment

```bash
python scripts/setup_check.py
```
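
If `scripts/setup_check.py` is unavailable, a rough stand-in is to confirm that the packages listed under Dependencies can be located by the interpreter. The sketch below is not the skill's own check. The import names (`torch`, `torch_geometric`, `rdkit`, `esm`) are assumptions based on how these packages are commonly installed, and `check_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the dependencies listed above; adjust if your
# environment installs them under different module names.
REQUIRED_MODULES = ["torch", "torch_geometric", "rdkit", "esm"]


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: bool} telling whether each module can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for malformed or half-initialised modules.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_modules().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

This only shows that a module is discoverable. It does not verify versions, GPU availability, or model weights.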

### 2) Run Standard Inference (Single Docking)

Dock a single ligand (SMILES) to a protein structure (PDB) and write results to an output directory:

```bash
python scripts/inference_runner.py \
  --protein ./data/protein.pdb \
  --ligand "CC(=O)Oc1ccccc1C(=O)O" \
  --out_dir ./results
```

**Arguments**
- `--protein`: Path to the protein PDB file.
- `--ligand`: Ligand SMILES string.
- `--out_dir`: Output directory (default: `results/`).
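
When calling the runner from Python rather than a shell, passing the arguments as a list avoids quoting problems with SMILES characters such as parentheses and `=`. The following is a minimal sketch that uses only the three flags documented above. `build_command` and `run_docking` are hypothetical helper names, and the script path assumes you run from the skill's root directory.

```python
import subprocess
import sys


def build_command(protein, ligand, out_dir="results/"):
    """Assemble the argument list for scripts/inference_runner.py."""
    return [
        sys.executable,
        "scripts/inference_runner.py",  # path relative to the skill root (assumed)
        "--protein", str(protein),
        "--ligand", ligand,
        "--out_dir", str(out_dir),
    ]


def run_docking(protein, ligand, out_dir="results/"):
    """Run one docking job; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(protein, ligand, out_dir), check=True)
```

For example, `run_docking("./data/protein.pdb", "CC(=O)Oc1ccccc1C(=O)O", "./results")` corresponds to the shell command shown earlier.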

### 3) Outputs

After inference, the tool produces:

- **Ranked SDF pose files** (e.g., `rank1.sdf`, `rank2.sdf`, ...), each containing a predicted 3D binding pose.
- **Confidence score report**: `confidence_scores.txt`, listing the score for each ranked pose.
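
For downstream analysis it helps to collect the pose files in rank order. A plain alphabetical sort would place `rank10.sdf` before `rank2.sdf`, so the sketch below sorts on the numeric part of the name. It assumes the `rank<N>.sdf` naming shown above and does not parse `confidence_scores.txt`, whose exact layout is not described here. `ranked_pose_files` is a hypothetical helper.

```python
import re
from pathlib import Path

# Matches the file names described above, e.g. rank1.sdf, rank2.sdf, ...
RANK_PATTERN = re.compile(r"^rank(\d+)\.sdf$")


def ranked_pose_files(out_dir):
    """Return the rank*.sdf paths in out_dir, ordered by numeric rank."""
    found = []
    for path in Path(out_dir).iterdir():
        match = RANK_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]
```

Each returned path can then be loaded with a chemistry toolkit of your choice for visualization or rescoring.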

## Implementation Details

- **Pose generation**: Uses a diffusion-based generative model to sample multiple candidate ligand poses relative to the protein target.
- **Ranking**: A separate confidence model assigns a score to each sampled pose; poses are sorted by this score and saved as `rank*.sdf`.
- **Parameterization**:
  - For the complete CLI argument list and defaults, see `references/parameters_reference.md`.
  - For confidence interpretation, known limitations, and expected accuracy/scope, see `references/confidence_and_limitations.md`.
- **Advanced workflows**: Batch processing and alternative input configurations are documented in `references/workflows_examples.md`.
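
The ranking step described above can be pictured as a sort followed by file naming. The sketch below is an illustration and not the skill's actual code. It assumes that a higher confidence score means a more trusted pose, the pose identifiers and scores in the usage note are made-up placeholders, and `rank_poses` is a hypothetical helper.

```python
def rank_poses(scored_poses):
    """Order (pose_id, confidence) pairs and assign rank<N>.sdf file names.

    Assumes higher confidence is better; see
    references/confidence_and_limitations.md for how to interpret scores.
    """
    ordered = sorted(scored_poses, key=lambda item: item[1], reverse=True)
    return [
        (f"rank{position}.sdf", pose_id, score)
        for position, (pose_id, score) in enumerate(ordered, start=1)
    ]
```

Given the placeholder input `[("sample_a", -1.2), ("sample_b", 0.4), ("sample_c", -0.3)]`, `sample_b` would be assigned `rank1.sdf`, `sample_c` `rank2.sdf`, and `sample_a` `rank3.sdf`.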
