gap-analysis

$ npx mdskill add openai/plugins/gap-analysis

Identifies missing or incomplete content in Datasite deal rooms before launch

  • Audits data room completeness to ensure nothing is missing or sparse
  • Leverages Datasite API to scan fileroom structure and content
  • Analyzes folder structure and document coverage against expected standards
  • Generates HTML dashboard and Excel register for team review and action

SKILL.md

.github/skills/gap-analysis
---
name: gap-analysis
description: >
  Data Room Gap Analysis skill for Datasite deal rooms. Use this skill whenever a
  sell-side deal team wants to audit what is missing, sparse, or incomplete in their
  data room before going live to buyers. Triggers include: "run a gap analysis",
  "what's missing from the data room", "check the data room coverage", "flag empty
  folders", "what haven't we uploaded yet", "data room readiness check", "find gaps
  before we go live", "are all the contracts in there", "check we have everything",
  or any request to assess completeness of the data room by section. Use this skill
  proactively whenever a deal team is preparing to launch a data room and wants to
  know what still needs to be uploaded or organised.
  Do not use for document quality issues such as PII or redaction (use document-quality-check),
  or for drafting Q&A responses (use bulk-qa-answers).
metadata:
  author: Blueflame AI
  version: 1.0.0
  mcp-server: datasite
  category: deal-management
  tags: [datasite, vdr, m&a, gap-analysis, completeness, blueflame]
---

# Data Room Gap Analysis

You are helping a sell-side deal team identify what is missing, incomplete, or sparse in their Datasite data room before buyers get access. You produce two outputs: an HTML gap dashboard for team meetings and an Excel gap register for tracking remediation.

---

## Terminology — fileroom vs. folder

Use these terms precisely when communicating with the user:

- **Fileroom** — the single top-level container inside a Datasite project. A project typically has one buyer-facing fileroom. It is not a subject area — it is the container that holds all subject areas.
- **Folder** — everything inside the fileroom: the subject areas (Financial, Legal, HR, Tax, IP, etc.) and all sub-levels beneath them. Always call these folders, never filerooms.

When in doubt: if it is not the single top-level container for the whole project, it is a folder.


## Feature Requirements

| Capability | Free | Requires Blueflame |
|---|:---:|:---:|
| Structural gap analysis (missing/empty/sparse folders) | ✅ | — |
| Year completeness checks from filenames | ✅ | — |
| Contract cross-referencing (customer, employee, supplier lists) | — | ✅ |
| IRL matching (if provided) | — | ✅ |

**Without Blueflame:** Produces a structural gap report — missing sections, empty folders, sparse time-series coverage — based on folder structure and filenames. Contract cross-referencing and IRL matching are skipped.

**With Blueflame:** `searchDocuments` finds customer, employee, and supplier lists inside documents and cross-references them against the contracts folders to identify missing agreements.



> ⚠️ **Blueflame content guard — two-tier behaviour**
> `searchDocuments` is the only permitted source of document content.
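> - Do **not** use Claude's training knowledge, general M&A knowledge, or inference from file names for any findings about document content. Filename-based year coverage (Step 4) is a structural check and remains permitted.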
> - **Steps 1–4** (structural gap analysis) use `listFolderContents` only — always free. Complete these regardless of Blueflame status.
> - **Step 5** (contract cross-referencing) requires `searchDocuments`. When you reach it, attempt one call. If it returns an **activation link** instead of results, **do not discard the structural findings already computed**. Present Steps 1–4 results first, then say:
>
>   > "I've completed the structural gap analysis. Summary: [list top findings per section in plain text — e.g. 'Finance: FY2023 audited accounts missing', 'Legal: litigation schedule absent']. To also cross-reference your customer, employee, and vendor lists against contracts, Blueflame AI search needs to be activated:
>   > 🔗 **Activate Blueflame:** [activation link]
>   > **With Blueflame:** I'll read your lists, extract each name, and check whether a signed contract exists — identifying missing or partial coverage.
>   > Would you like to activate now, or shall I produce the gap report dashboard with structural findings only?"
>
> **Do not generate the HTML dashboard or Excel output until after the user responds to this question.**
> - All content findings **must** be sourced exclusively from tool results.

> **`listFolderContents` — efficient traversal**
> - `depth: 1` (default) — immediate children only. Use for targeted lookups.
> - `depth: 5, foldersOnly: true` (default when depth > 1) — full folder tree in one call, no documents. Use for structural checks.
> - `depth: 5, foldersOnly: false` — full folder tree including all document metadata in one call. Use when building a document inventory.
> - When `depth > 1`, the response is a **flat list** with `depth` and `path` columns — not a nested tree.

## Step 1 — Orient yourself

Call `getProjectOverview` to understand the deal: company name, sector, transaction type, deal size, and the fileroom structure. This context shapes what "complete" looks like — a SaaS M&A deal needs different coverage than a manufacturing PE deal.

Note the `transactionValue` and `useCase` — these determine:
- How many years of financials to expect (3 for standard M&A, 5 for large-cap, 2 for early-stage VC)
- Which sections are mandatory vs. deal-specific
- How deep the expected folder coverage should be

---

## Step 2 — Check for an Information Request List (IRL)

Ask the user (briefly, in one line): "Do you have an Information Request List you'd like me to cross-reference? If so, share it and I'll flag what's been delivered vs. outstanding."

If they provide one, read it and extract each requested item. Track these separately — you'll use them in Step 6 to produce a delivered/outstanding view alongside the structural gap analysis.

If they don't have one, proceed with the structural analysis only.

---

## Step 3 — Full structural crawl of the data room

Use `listFolderContents` to walk the entire data room, top to bottom. For each fileroom and folder, note:

- **Folder path** (the full VDR index and name, e.g. `3.2 Audited Financial Statements`)
- **Document count** — how many files are inside
- **Status**:
  - ✓ **Populated** — contains at least the expected number of documents
  - ⚠ **Sparse** — folder exists but has fewer documents than expected for its purpose (e.g. a "Board Minutes" folder with only 1 document when 3 years of minutes are expected)
  - ✗ **Empty** — folder exists but contains no documents at all
  - ✗ **Missing** — an expected section is entirely absent from the data room structure

Use `listFolderContents` to drill into specific folders where you need a precise document count or list of filenames.
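
As a rough illustration of the crawl, the sketch below tallies documents per folder from a flat listing. The row fields `path` and `is_folder` are assumptions made for this example; the real `listFolderContents` response may name its columns differently.

```python
from posixpath import dirname


def count_documents(rows):
    """Return {folder_path: direct_document_count} from a flat listing.

    Each row is assumed (hypothetically) to be a dict with a 'path' string
    and an 'is_folder' flag. Adjust to the actual response columns.
    """
    counts = {}
    for row in rows:
        if row["is_folder"]:
            # Register the folder even if it turns out to hold no documents.
            counts.setdefault(row["path"], 0)
        else:
            parent = dirname(row["path"])
            counts[parent] = counts.get(parent, 0) + 1
    return counts
```

Folders that end with a count of 0 are candidates for the Empty status. The count covers direct children only, so a parent folder whose documents all sit in sub-folders also reports 0 and needs a second look.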

### What counts as "sparse"

Apply judgment based on what the folder is for:
- **Audited financials** — expect one document per financial year in scope. Two files where three years are expected = Sparse.
- **Management accounts** — expect monthly or quarterly files for at least the last 12–24 months. A single file = Sparse.
- **Tax returns** — expect one return per jurisdiction per year. Missing a year or jurisdiction = Sparse/Missing.
- **Board minutes** — expect multiple entries per year for at least the last 3 years. One document = Sparse.
- **Contracts folders** — see Step 5 for cross-referencing logic.
- **Single-document folders** (e.g. "Certificate of Incorporation") — one document is fine.
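
The four statuses can be expressed as a small decision function. This is a minimal sketch: `expected_min` is a hypothetical parameter standing in for the judgment described above (for example 3 for audited financials on a three-year deal, 1 for a single-document folder).

```python
def classify_folder(doc_count, expected_min=1, exists=True):
    """Map a folder's document count to one of the four statuses."""
    if not exists:
        return "Missing"      # expected section absent from the structure
    if doc_count == 0:
        return "Empty"        # folder exists but holds no documents
    if doc_count < expected_min:
        return "Sparse"       # fewer documents than the folder's purpose implies
    return "Populated"
```

For example, `classify_folder(2, expected_min=3)` gives `"Sparse"` for an audited financials folder holding two files where three years are expected.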

### Expected sections by deal type

Compare the actual data room structure against what a deal of this type should contain. Flag any top-level sections that are entirely absent:

**Always expected (any M&A/PE deal):**
- General Information / Corporate (org charts, articles, board minutes, cap table)
- Finance (audited accounts, management accounts, financial model)
- Tax (filed returns, correspondence)
- Legal (litigation schedule, material contracts)
- HR / Employment (employee list, key employment agreements)
- IP (ownership documentation, if relevant to the business)

**Expected based on sector:**
- Technology / SaaS → IP & Software section (open-source inventory, IP assignments, software licence list)
- Healthcare → Regulatory & Clinical section (licences, CQC/FDA filings)
- Manufacturing → Plant & Equipment, Environmental sections
- Financial Services → Regulatory Capital, Client Money sections

**Expected based on transaction type:**
- M&A sell-side → Closing Documents section
- Carve-out → Transition Services Agreement section
- Capital raise → Investor Presentations, Cap Table History
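
One way to sketch the top-level comparison is a case-insensitive set check. The section names are taken from the lists above; the index-prefix stripping and exact-name matching are simplifying assumptions, since real folder names often differ (for example "Financial" vs. "Finance") and need judgment.

```python
import re

ALWAYS_EXPECTED = [
    "General Information / Corporate", "Finance", "Tax",
    "Legal", "HR / Employment", "IP",
]
SECTOR_SECTIONS = {
    "Technology / SaaS": ["IP & Software"],
    "Healthcare": ["Regulatory & Clinical"],
    "Manufacturing": ["Plant & Equipment", "Environmental"],
    "Financial Services": ["Regulatory Capital", "Client Money"],
}


def strip_index(name):
    """Drop a leading VDR index such as '3' or '3.2' from a folder name."""
    return re.sub(r"^\d+(\.\d+)*\.?\s+", "", name).strip()


def missing_sections(actual_top_level, sector=None):
    """Return expected sections that have no same-named top-level folder."""
    expected = ALWAYS_EXPECTED + SECTOR_SECTIONS.get(sector, [])
    present = {strip_index(n).casefold() for n in actual_top_level}
    return [s for s in expected if s.casefold() not in present]
```

Anything this returns is only a candidate Missing finding. Confirm that the content does not live under a differently named folder before reporting it.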

---

## Step 4 — Year completeness checks

For any folder containing time-series documents (financials, tax returns, management accounts, board minutes), verify year coverage explicitly.

Based on the deal profile:
- **Last closed financial year = today’s year − 1.** The current calendar year is never closed. In 2026 the last closed year is FY2025; in 2027 it will be FY2026.
- Standard M&A (mid-market and below) → expect **3 years**: FY[last_closed − 2], FY[last_closed − 1], FY[last_closed] — e.g. in 2026: FY2023, FY2024, FY2025
- Large-cap (>$500M) → expect **5 years**: FY[last_closed − 4] through FY[last_closed] — e.g. in 2026: FY2021–FY2025
- Early-stage VC → expect 2 years or inception-to-date
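
The year arithmetic above can be sketched as follows. The profile keys (`"standard"`, `"large_cap"`, `"early_stage"`) are hypothetical labels chosen for this example, and the inception-to-date alternative for early-stage deals is not modelled.

```python
from datetime import date

YEARS_IN_SCOPE = {"standard": 3, "large_cap": 5, "early_stage": 2}


def expected_years(deal_profile, today=None):
    """Return the financial years expected, oldest first."""
    today = today or date.today()
    last_closed = today.year - 1          # the current calendar year is never closed
    span = YEARS_IN_SCOPE[deal_profile]
    return [last_closed - offset for offset in range(span - 1, -1, -1)]
```

Called with a date in 2026, `expected_years("standard", date(2026, 3, 1))` returns `[2023, 2024, 2025]`, matching the worked example above.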

For each time-series folder, list which years are present and which are missing. Example:

> "Audited Financial Statements — FY2023 ✓, FY2024 ✓, FY2025 ✗ Missing"
> "Tax Returns — FY2023 ✓, FY2024 ✗ Missing, FY2025 ✗ Missing"

Use document filenames (visible via `listFolderContents`) to infer which year each document covers. If filenames are unclear, note it as "year unclear — review needed."
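
A minimal sketch of the filename check, assuming years appear as four-digit numbers such as `FY2024` or `2024`. Two-digit forms like `FY24` are deliberately left as unclear in this example instead of being guessed.

```python
import re

# Four-digit year starting with 20, not embedded in a longer digit run.
YEAR_RE = re.compile(r"(?<!\d)(20\d{2})(?!\d)")


def year_coverage(filenames, expected):
    """Return ({year: present?}, [filenames with no recognisable year])."""
    found = set()
    unclear = []
    for name in filenames:
        years = {int(y) for y in YEAR_RE.findall(name)}
        if years:
            found |= years
        else:
            unclear.append(name)   # report as "year unclear - review needed"
    return {year: year in found for year in expected}, unclear
```

The returned mapping feeds the present/missing line for each folder, and the second list holds the files to report as unclear.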

---

## Step 5 — Contract completeness cross-referencing

If the data room contains any of the following lists, cross-reference them against the corresponding contracts folder. Use `searchDocuments` to locate the list documents, then read their contents to extract names.

### Customer / client list → Customer contracts

1. Find the customer list using `searchDocuments` with query "customer list" or "client list"
2. Extract customer/client names from the document
3. Search the contracts section for each customer name using `searchDocuments`
4. Flag any customer where no corresponding contract is found

Report as: "Contract missing for: [Customer Name]" — sorted by likely revenue importance if discernible from the list.

### Employee list → Employment agreements

1. Find the employee list using `searchDocuments` with query "employee list" or "staff list"
2. Extract names, particularly senior employees (directors, C-suite, managers)
3. Search the HR/Employment agreements folder for each name using `searchDocuments`
4. Flag any senior employee where no employment agreement is found

Focus on senior staff — it is not always expected that every employee has an individual agreement (e.g. employees on standard terms), but directors, C-suite, and named key staff should each have one.

Report as: "Employment agreement not found for: [Name], [Title]"

### Supplier / vendor list → Supplier agreements

1. Find the supplier/vendor list using `searchDocuments` with query "supplier list" or "vendor list"
2. Extract key supplier names (focus on material suppliers, not every minor vendor)
3. Search the contracts/supplier agreements folder for each name
4. Flag material suppliers where no agreement is found

Report as: "Supplier agreement missing for: [Supplier Name]"
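
The three cross-references share one pattern, sketched below. The `search` argument is a hypothetical stand-in for a `searchDocuments` call scoped to the relevant contracts folder, assumed here to return a list of document titles; the actual tool's parameters and response shape may differ. The suffix list used for name normalisation is likewise illustrative.

```python
import re

COMPANY_SUFFIXES = {"ltd", "limited", "inc", "llc", "plc"}


def normalise(name):
    """Lower-case, drop punctuation and common company suffixes."""
    words = re.sub(r"[^\w\s]", " ", name.casefold()).split()
    return " ".join(w for w in words if w not in COMPANY_SUFFIXES)


def missing_contracts(names, search):
    """Return the names for which no search hit mentions the name."""
    missing = []
    for name in names:
        key = normalise(name)
        hits = search(name)  # hypothetical wrapper around searchDocuments
        if not key or not any(key in normalise(title) for title in hits):
            missing.append(name)
    return missing
```

Substring matching is best-effort: it can miss trading names and abbreviations, so treat each flagged name as a prompt for review, not as proof that a contract is absent.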

---

## Step 6 — IRL cross-reference (if provided)

If the user provided an Information Request List:

For each IRL item, determine its status:
- **Delivered** — a document matching the request exists in the data room (use `searchDocuments` to find it); include the VDR path
- **Partially delivered** — some but not all of what was requested is present (e.g. 2 of 3 requested years)
- **Outstanding** — nothing matching the request found in the data room

Present this as a separate table: IRL Item | Status | VDR Location (if delivered) | Gap Description (if outstanding)
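
The three statuses reduce to a set comparison once each IRL item has been broken into the parts it asks for (for example, the requested years). A minimal sketch under that assumption:

```python
def irl_status(requested_parts, found_parts):
    """Classify one IRL item from the parts requested and the parts located."""
    requested = set(requested_parts)
    found = requested & set(found_parts)
    if not found:
        return "Outstanding"
    if found == requested:
        return "Delivered"
    return "Partially delivered"
```

An item requesting FY2023 to FY2025 with only two years located, `irl_status([2023, 2024, 2025], [2023, 2024])`, comes back as `"Partially delivered"`.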

---

## Step 7 — Compile all findings

Compile findings into three categories:

**Category 1 — Structural gaps** (missing or empty sections)
```
{ area, folder_path, status: "Missing"|"Empty", severity, note }
```

**Category 2 — Sparse or incomplete sections**
```
{ area, folder_path, status: "Sparse", detail, severity }
```
For example: "Board Minutes — only 1 document found; expect 3 years of minutes"

**Category 3 — Contract gaps** (from cross-referencing)
```
{ type: "Customer"|"Employee"|"Supplier", name, gap_detail, severity }
```

**Severity:**
- **High** — a buyer will immediately notice and flag this (missing financials, empty legal section, no employment agreements for directors)
- **Medium** — material gap that will be raised in diligence but may be explainable (missing one year of management accounts, a minor supplier contract absent)
- **Low** — minor gap unlikely to be deal-critical (a supporting document absent from an otherwise well-populated folder)
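
For illustration, the Category 1 and 2 records and the severity counts could be held like this. The field names follow the shapes above; the dataclass and the sort order are assumptions of this sketch and not part of the skill.

```python
from collections import Counter
from dataclasses import dataclass

SEVERITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


@dataclass
class Finding:
    area: str
    folder_path: str
    status: str        # "Missing", "Empty", or "Sparse"
    severity: str      # "High", "Medium", or "Low"
    note: str = ""


def summarise(findings):
    """Return findings sorted High-first plus a count per severity."""
    ordered = sorted(findings, key=lambda f: (SEVERITY_ORDER[f.severity], f.area))
    counts = Counter(f.severity for f in findings)
    return ordered, {level: counts.get(level, 0) for level in SEVERITY_ORDER}
```

The counts map directly onto the "[X] High, [Y] Medium, [Z] Low" summary used when reporting results.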

---

## Step 8 — Offer outputs

Before generating any output, ask:

> "I've completed the gap analysis. What would you like me to produce?
> - **HTML dashboard** — interactive gap report with section cards, financial year grid, and Excel export button (uses additional credits to render)
> - **Plain text summary** — gap findings listed in this conversation, no additional cost
> - **Both**"

Only generate the HTML dashboard and/or Excel tracker if the user explicitly requests them. If they choose plain text, go directly to Step 9.

## Step 8b — Produce the HTML dashboard (only if requested)

Generate a self-contained HTML artifact with the following structure. Include a **"Download as Excel"** button in the header that exports all gap data client-side using SheetJS (`https://cdnjs.cloudflare.com/ajax/libs/xlsx/0.18.5/xlsx.full.min.js`). The exported file should match this structure:

**Excel columns (exported on button click):**
1. **Area / Workstream** — e.g. Finance, Tax, Legal, HR, IP
2. **Folder / Item** — VDR path or contract name
3. **Gap Type** — Missing Section / Empty Folder / Sparse / Year Gap / Contract Missing
4. **Severity** — High / Medium / Low
5. **Detail** — specific description of the gap
6. **Recommended Action** — what needs to be uploaded or resolved
7. **Status** — Open (default)

Excel formatting to apply via SheetJS, where the loaded build supports writing cell styles (the community build may not): header row dark blue (`#1a2332`) with white bold text, severity colour coding (High = red, Medium = amber, Low = grey), section separator rows per workstream.

The dashboard itself:

**Header bar:**
- Deal name, date of analysis, summary counts: [X] High gaps, [Y] Medium gaps, [Z] Low gaps

**Section scorecard:**
- One tile per workstream (Finance, Tax, Legal, HR, Commercial, IP, ESG, etc.)
- Each tile shows: workstream name, gap counts by severity, RAG status:
  - Red border: any High gap
  - Amber border: Medium gaps only
  - Green border: no gaps found
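
The tile colouring rule can be sketched as a small function. The rules above do not say how a workstream with only Low gaps should be coloured; treating it as Amber here is an assumption of this example.

```python
def rag_status(gap_counts):
    """Return the tile border colour for one workstream.

    gap_counts maps severity ("High", "Medium", "Low") to a count.
    """
    if gap_counts.get("High", 0) > 0:
        return "Red"
    # Assumption: Low-only workstreams are shown as Amber, since gaps exist.
    if gap_counts.get("Medium", 0) > 0 or gap_counts.get("Low", 0) > 0:
        return "Amber"
    return "Green"
```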

**Year coverage matrix:**
- A grid showing financial years (columns) vs. document types (rows): Audited Accounts, Management Accounts, Tax Returns, Board Minutes
- Each cell: ✓ (green) present, ✗ (red) missing, ? (grey) unclear

**Contract completeness summary:**
- Customer contracts: X of Y found (progress bar)
- Employment agreements: X of Y found (progress bar)
- Supplier agreements: X of Y found (progress bar)
- Below each bar: list of names where no contract was found

**IRL tracker (if IRL was provided):**
- Delivered / Partial / Outstanding counts as stat cards
- Table: IRL Item | Status chip | VDR Location or Gap Note

**Detailed gap list:**
- Filterable by severity and workstream
- Each row: severity badge, folder path, gap type, detail

**Design:** white background, dark headings, navy/amber/red/green palette, 12px border-radius cards, no external dependencies other than the SheetJS script used for the Excel export.

---

## Step 9 — Deliver to the user

Present the requested output (the dashboard, the plain text summary, or both) and give a brief summary:

> "I've analysed [N] folders across [M] sections and found [X] High, [Y] Medium, and [Z] Low gaps. The most critical areas are [list top 3]. [If contract cross-referencing ran:] I also cross-referenced [P] customers, [Q] employees, and [R] suppliers — [S] contracts are missing. [If the dashboard was generated:] Use the Download button in the dashboard to export the full gap register as Excel."

Then offer:
> "Want me to prioritise the remediation list so the team knows what to tackle first before going live?"

---

## Operating principles

**Be specific, not vague.** "The Legal section is sparse" is not useful. "Legal / Litigation Schedule — folder is empty; no pending claims schedule found" tells the team exactly what to upload.

**Use filenames to infer content.** Document names in the data room usually reveal what's inside (e.g. "FY2024 Audited Accounts.pdf"). Use them to determine year coverage and document type without needing to open every file.

**Calibrate to deal type.** Missing board minutes matter far more in a PE deal with a complex governance story than in a simple asset sale. Adjust severity accordingly.

**Don't penalise intentional omissions.** Some folders may be empty by design (e.g. a "Closing Documents" folder at the start of a process). If the folder name suggests it's a placeholder for future content, note it as "pending — expected later in process" rather than flagging it as a critical gap.

**Cross-referencing is best-effort.** Customer and employee lists may not always be present or clearly named. If you can't find a list to cross-reference against, say so rather than skipping the check silently.

## Performance Notes

- Work through every section systematically. A missed gap is worse than a false positive — the deal team is relying on this to prepare before buyers get access.
- Use filenames to infer year coverage rather than opening every document.
- Be specific: "Legal / Litigation Schedule — folder is empty" is useful; "the Legal section looks thin" is not.

---

## Common Issues

**`getProjectOverview` fails or returns the wrong project**
Check that the Datasite MCP connector is connected (Settings → Extensions → Datasite should show "Connected"). If you have multiple projects open, confirm with the user which project to use.

**`listFolderContents` returns no results**
The fileroom may be empty or unpublished. Re-run `listFolderContents` without a `metadataId` to list all filerooms from the root. If a fileroom exists but shows 0 documents, the content may not yet be published — note this to the user and proceed with what is available.

**`searchDocuments` returns an activation link instead of results**
Blueflame AI search is not yet active on this project. Follow the Blueflame prompt in the skill instructions above. Do not attempt to answer using Claude's training knowledge.

**MCP disconnects mid-workflow**
Reconnect via Settings → Extensions → Datasite. Resume from the last completed step — results already gathered do not need to be re-fetched.

**`updateContent` or `createContent` returns a permissions error**
The user's Datasite account may not have Editor permissions on this project. Ask them to check their role in Datasite project settings.
