detecting-insider-data-exfiltration-via-dlp

Name: detecting-insider-data-exfiltration-via-dlp
Author: mukul975/Anthropic-Cybersecurity-Skills

$ npx mdskill add mukul975/Anthropic-Cybersecurity-Skills/detecting-insider-data-exfiltration-via-dlp

Detect anomalous data exfiltration using behavioral baselines.

Identifies insider threats through endpoint and cloud log analysis.
Integrates with pandas for statistical anomaly detection.
Executes behavioral analytics on file access and upload patterns.
Generates structured alerts for SOC analysts and threat hunters.

SKILL.md

.github/skills/detecting-insider-data-exfiltration-via-dlp

---
name: detecting-insider-data-exfiltration-via-dlp
description: >
  Detects insider data exfiltration by analyzing DLP policy violations, file access
  patterns, upload volume anomalies, and off-hours activity in endpoint and cloud logs.
  Uses pandas for behavioral analytics and statistical baselines. Use when investigating
  insider threats or building user behavior analytics for data loss prevention.
domain: cybersecurity
subdomain: security-operations
tags: [detecting, insider, data, exfiltration]
version: "1.0"
author: mahipal
license: Apache-2.0
---

# Detecting Insider Data Exfiltration via DLP


## When to Use

- When investigating security incidents that involve suspected insider data exfiltration surfaced by DLP events
- When building detection rules or threat hunting queries for insider data exfiltration
- When SOC analysts need a structured procedure for analyzing DLP violations, file access, and upload activity
- When validating security monitoring coverage for data exfiltration techniques

## Prerequisites

- Familiarity with security operations concepts and tools
- Access to a test or lab environment for safe execution
- Python 3.8+ with pandas installed
- Appropriate authorization for any testing activities

## Instructions

Analyze endpoint activity logs, cloud storage access, and email DLP events to detect
data exfiltration patterns using behavioral baselines and statistical anomaly detection.

```python
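import pandas as pd

df = pd.read_csv("file_activity.csv", parse_dates=["timestamp"])
day = df["timestamp"].dt.date.rename("day")
is_today = day == pd.Timestamp.today().date()

# Baseline: average daily upload volume per user, computed from prior days
# only so that today's activity does not inflate its own baseline
history = df[~is_today]
baseline = history.groupby(["user", day[~is_today]])["bytes_transferred"].sum()
user_avg = baseline.groupby(level="user").mean()

# Alert on users exceeding 3x their baseline
today_totals = df[is_today].groupby("user")["bytes_transferred"].sum()
# Align the baseline to today's users before comparing; Series with different
# indexes cannot be compared directly. Users with no history are not flagged.
threshold = 3 * user_avg.reindex(today_totals.index)
anomalies = today_totals[today_totals > threshold]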
```
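
A fixed 3x multiplier treats every user the same, regardless of how variable their normal activity is. As a hedged alternative, the sketch below scores each user's total for one day against the mean and standard deviation of their own prior daily totals. The helper name `zscore_anomalies`, the 3.0 cutoff, and the inline sample data are illustrative assumptions rather than part of this skill; the column names follow the snippet above, and timestamps are assumed to be timezone-naive.

```python
import pandas as pd


def zscore_anomalies(df: pd.DataFrame, as_of: pd.Timestamp, cutoff: float = 3.0) -> pd.DataFrame:
    """Score each user's total on the `as_of` date against their prior daily totals.

    Hypothetical helper: expects columns user, timestamp, bytes_transferred.
    """
    day = df["timestamp"].dt.normalize().rename("day")
    daily = df.groupby(["user", day])["bytes_transferred"].sum().reset_index()
    target = as_of.normalize()

    # Per-user mean and sample standard deviation over days before the target
    history = daily[daily["day"] < target]
    stats = history.groupby("user")["bytes_transferred"].agg(["mean", "std"])

    current = daily[daily["day"] == target].set_index("user")["bytes_transferred"]
    scored = stats.join(current.rename("today"), how="inner")
    # std is NaN with a single day of history and 0 for constant history;
    # those users get a NaN score and are never flagged.
    usable_std = scored["std"].where(scored["std"] > 0)
    scored["zscore"] = (scored["today"] - scored["mean"]) / usable_std
    return scored[scored["zscore"] > cutoff]


# Made-up sample data: five ordinary days per user, then the day under test
days = pd.date_range("2024-01-01", periods=6, freq="D")
sample = pd.DataFrame({
    "user": ["alice"] * 6 + ["bob"] * 6,
    "timestamp": list(days) * 2,
    "bytes_transferred": [100, 110, 90, 105, 95, 900, 200, 210, 190, 205, 195, 215],
})
flagged = zscore_anomalies(sample, as_of=days[-1])
print(flagged)
```

A z-score adapts the threshold to each user's own variability, but it needs several days of history per user to be meaningful, so the cutoff and minimum history length are worth tuning against real data.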

Key indicators:
1. Upload volume exceeding 3x the user's daily baseline
2. Access to files outside the user's normal scope
3. Bulk downloads shortly before a resignation
4. Off-hours file access patterns
5. Spikes in USB or external device usage
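
The second indicator, access outside a user's normal scope, can be approximated by comparing recent activity with the directories each user has touched before. The sketch below is a minimal illustration under stated assumptions: the `file_path` column, the POSIX-style paths, the helper name `out_of_scope_access`, and the sample rows are all hypothetical, and "scope" is reduced to the top-level directory.

```python
import pandas as pd


def out_of_scope_access(history: pd.DataFrame, recent: pd.DataFrame) -> pd.DataFrame:
    """Return recent events whose (user, top-level directory) pair never appears in history.

    Hypothetical helper: expects columns user and file_path.
    """
    def _top_dir(paths: pd.Series) -> pd.Series:
        # "/finance/q3/plan.xlsx" -> "finance"
        return paths.str.strip("/").str.split("/").str[0]

    # Every (user, directory) pair observed during the baseline period
    seen = set(zip(history["user"], _top_dir(history["file_path"])))
    recent = recent.assign(top_dir=_top_dir(recent["file_path"]))
    is_new = [pair not in seen for pair in zip(recent["user"], recent["top_dir"])]
    return recent[is_new]


# Made-up sample events for illustration only
history = pd.DataFrame({
    "user": ["alice", "alice", "bob"],
    "file_path": ["/eng/app/main.py", "/eng/docs/readme.md", "/finance/q3/plan.xlsx"],
})
recent = pd.DataFrame({
    "user": ["alice", "alice", "bob"],
    "file_path": ["/eng/app/util.py", "/finance/q3/plan.xlsx", "/finance/q4/forecast.xlsx"],
})
flagged_scope = out_of_scope_access(history, recent)
print(flagged_scope)
```

Here only alice's access to the finance directory is returned, because she has no prior activity there. A first-seen check like this is noisy for new hires and role changes, so it works best as one signal alongside the volume and timing indicators.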

## Examples

```python
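import pandas as pd

df = pd.read_csv("file_activity.csv", parse_dates=["timestamp"])

# Detect off-hours activity: events before 06:00 or from 23:00 onward
df["hour"] = df["timestamp"].dt.hour
off_hours = df[(df["hour"] < 6) | (df["hour"] > 22)]
# Rank users by number of off-hours events, highest first
suspicious = off_hours.groupby("user").size().sort_values(ascending=False)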
```
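
Per-user counts like the `suspicious` Series above are easier to triage when emitted as structured records. The sketch below shows one possible shape; the helper name `build_alerts`, the field names (`rule`, `user`, `event_count`, `severity`), and the severity threshold are illustrative assumptions, not a schema defined by this skill.

```python
import json

import pandas as pd


def build_alerts(counts: pd.Series, rule: str, high_threshold: int = 10) -> list:
    """Convert per-user event counts into alert dicts (hypothetical schema)."""
    alerts = []
    for user, count in counts.items():
        alerts.append({
            "rule": rule,
            "user": user,
            "event_count": int(count),
            # Assumed severity mapping: at or above the threshold is "high"
            "severity": "high" if count >= high_threshold else "low",
        })
    return alerts


# Made-up counts standing in for the `suspicious` Series from the example above
sample_counts = pd.Series({"alice": 14, "bob": 2}, name="off_hours_events")
alerts = build_alerts(sample_counts, rule="off_hours_file_access")
print(json.dumps(alerts, indent=2))
```

Casting counts to `int` keeps the records JSON-serializable, since `json.dumps` does not accept NumPy integer types.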
