aws-cloud-monitoring

Name: aws-cloud-monitoring
Author: automateyournetwork/netclaw

$ npx mdskill add automateyournetwork/netclaw/aws-cloud-monitoring

Monitors AWS CloudWatch metrics, alarms, logs, and network performance

Analyzes network latency, VPC flow logs, and CloudWatch alarms
Uses AWS CloudWatch, CloudWatch Logs, and VPC flow log APIs
Checks metrics for EC2, ELB, NAT Gateway, and Transit Gateway
Delivers dashboards and alerts for network health and performance

SKILL.md

.github/skills/aws-cloud-monitoring

---
name: aws-cloud-monitoring
description: "AWS CloudWatch monitoring — metrics, alarms, log queries, VPC flow log analysis, network performance. Use when checking AWS alarms, analyzing VPC flow logs, investigating network latency, or monitoring VPN and NAT Gateway metrics."
version: 1.0.0
license: Apache-2.0
tags: [aws, cloudwatch, monitoring, metrics, alarms, logs, flow-logs]
---

# AWS Cloud Monitoring

## MCP Server

- **Command**: `uvx awslabs.cloudwatch-mcp-server@latest` (stdio transport)
- **Requires**: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION` (or `AWS_PROFILE`)
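
Before launching the server it can help to confirm that the settings listed above are present. The following is a minimal preflight sketch, not part of the skill itself. It assumes one reading of "(or `AWS_PROFILE`)": that a named profile stands in for the access key pair while `AWS_REGION` is still set explicitly. The helper name `missing_aws_settings` is hypothetical.

```python
import os

# Assumption: AWS_PROFILE replaces the key pair; AWS_REGION is still expected.
KEY_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_settings(env=None):
    """Return the names of expected settings that are absent or empty."""
    env = os.environ if env is None else env
    missing = []
    if not env.get("AWS_PROFILE"):
        # No profile configured, so both static credential variables are needed.
        missing.extend(name for name in KEY_VARS if not env.get(name))
    if not env.get("AWS_REGION"):
        missing.append("AWS_REGION")
    return missing


if __name__ == "__main__":
    problems = missing_aws_settings()
    if problems:
        print("Set before starting the MCP server:", ", ".join(problems))
    else:
        print("Ready: uvx awslabs.cloudwatch-mcp-server@latest")
```

If your profile already defines a region, the `AWS_REGION` check may be stricter than you need.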

## Key Capabilities

- **Metrics**: Query CloudWatch metrics for any AWS service (EC2, ELB, TGW, NAT GW, VPN)
- **Alarms**: List and inspect CloudWatch alarms and their states
- **Logs**: Run CloudWatch Logs Insights queries across any log group
- **Flow Logs**: Analyze VPC and TGW flow logs for traffic patterns and dropped connections

## Workflow: Network Monitoring Dashboard

When a user asks "how is our AWS network performing?":

1. **Check alarms**: List CloudWatch alarms in ALARM state
2. **VPN metrics**: Tunnel state, bytes in/out for site-to-site VPNs
3. **NAT Gateway metrics**: Active connections, packets dropped, bytes processed
4. **Transit Gateway metrics**: Bytes in/out, packets dropped per attachment
5. **ELB metrics**: Healthy/unhealthy targets, latency, 5xx errors
6. **Report**: Network health dashboard with any issues flagged
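
Steps 1 and 6 above can be sketched as a small post-processing helper. This is an illustration only: it assumes alarm records shaped like the `MetricAlarms` entries returned by boto3's `describe_alarms` (keys `AlarmName`, `StateValue`, `Namespace`); the MCP server's tool output may be shaped differently. The function names and sample alarm names are hypothetical.

```python
from collections import defaultdict


def alarms_in_alarm_state(metric_alarms):
    """Group alarm names by namespace, keeping only alarms in ALARM state."""
    by_namespace = defaultdict(list)
    for alarm in metric_alarms:
        if alarm.get("StateValue") == "ALARM":
            namespace = alarm.get("Namespace", "unknown")
            by_namespace[namespace].append(alarm["AlarmName"])
    # Sort namespaces and names so the report is stable between runs.
    return {ns: sorted(names) for ns, names in sorted(by_namespace.items())}


def format_report(grouped):
    """Render the grouped alarms as a short plain-text health summary."""
    if not grouped:
        return "No alarms in ALARM state."
    lines = []
    for namespace, names in grouped.items():
        lines.append(f"{namespace}: {len(names)} in ALARM")
        lines.extend(f"  - {name}" for name in names)
    return "\n".join(lines)


# Hypothetical sample data for illustration.
sample_alarms = [
    {"AlarmName": "vpn-tunnel-down", "StateValue": "ALARM", "Namespace": "AWS/VPN"},
    {"AlarmName": "nat-drops", "StateValue": "OK", "Namespace": "AWS/NATGateway"},
    {"AlarmName": "alb-5xx", "StateValue": "ALARM", "Namespace": "AWS/ApplicationELB"},
]
print(format_report(alarms_in_alarm_state(sample_alarms)))
```

Grouping by namespace keeps VPN, NAT Gateway, Transit Gateway, and ELB issues visually separate in the final report.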

## Workflow: Flow Log Analysis

When investigating traffic patterns or security events:

1. **Query VPC flow logs**: Filter by source IP, destination IP, port, action (ACCEPT/REJECT)
2. **Identify rejected traffic**: Find REJECT entries to see blocked connections
3. **Top talkers**: Aggregate by source/destination to find heaviest traffic flows
4. **Time correlation**: Narrow to specific time windows around incidents
5. **Report**: Traffic analysis with recommendations
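
The filtering in step 1 can be sketched as a query builder that turns optional filters into a Logs Insights query string. This is a minimal sketch, assuming the flow log group exposes the default discovered fields (`srcAddr`, `dstAddr`, `dstPort`, `bytes`, `action`). The function name `build_flow_log_query` is hypothetical.

```python
import ipaddress


def build_flow_log_query(src=None, dst=None, dst_port=None, action=None, limit=100):
    """Build a Logs Insights query string from optional flow log filters."""
    filters = []
    if src is not None:
        # ip_address() rejects malformed input before it reaches the query.
        filters.append(f'srcAddr = "{ipaddress.ip_address(src)}"')
    if dst is not None:
        filters.append(f'dstAddr = "{ipaddress.ip_address(dst)}"')
    if dst_port is not None:
        port = int(dst_port)
        if not 0 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        filters.append(f"dstPort = {port}")
    if action is not None:
        if action not in ("ACCEPT", "REJECT"):
            raise ValueError("action must be ACCEPT or REJECT")
        filters.append(f'action = "{action}"')

    lines = ["fields @timestamp, srcAddr, dstAddr, dstPort, bytes, action"]
    if filters:
        lines.append("| filter " + " and ".join(filters))
    lines.append("| sort @timestamp desc")
    lines.append(f"| limit {int(limit)}")
    return "\n".join(lines)


print(build_flow_log_query(src="10.0.1.50", action="REJECT", limit=20))
```

Validating addresses, ports, and actions up front avoids spending a paid query on a filter that can never match.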

## Common CloudWatch Network Metrics

| Service | Metric | What It Tells You |
|---------|--------|-------------------|
| VPN | `TunnelState` | 0=down, 1=up for each tunnel |
| VPN | `TunnelDataIn`, `TunnelDataOut` | Bytes through each VPN tunnel |
| NAT GW | `ActiveConnectionCount` | Active NAT connections |
| NAT GW | `PacketsDropCount` | Packets dropped (capacity issue) |
| NAT GW | `BytesOutToDestination`, `BytesInFromDestination` | Traffic volume through NAT |
| TGW | `BytesIn/BytesOut` | Traffic per TGW attachment |
| TGW | `PacketDropCountBlackhole` | Blackhole route drops |
| ELB | `HealthyHostCount` | Healthy targets behind ALB/NLB |
| ELB | `TargetResponseTime` | Backend latency (ALB only) |
| EC2 | `NetworkIn/NetworkOut` | Instance network throughput |
| EC2 | `NetworkPacketsIn`, `NetworkPacketsOut` | Instance packet rate |

## Flow Log Query Examples

```
# Top rejected connections in last hour
fields @timestamp, srcAddr, dstAddr, dstPort, action
| filter action = "REJECT"
| stats count() as rejections by srcAddr, dstAddr, dstPort
| sort rejections desc
| limit 20

# Traffic from specific source
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, action
| filter srcAddr = "10.0.1.50"
| sort @timestamp desc

# Top talkers by bytes
fields srcAddr, dstAddr, bytes
| stats sum(bytes) as totalBytes by srcAddr, dstAddr
| sort totalBytes desc
| limit 10
```
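
Outside the MCP server, queries like these run asynchronously: you start the query, then poll for results. The sketch below shows that pattern, assuming a client object that exposes `start_query` and `get_query_results` with the same keyword arguments as boto3's CloudWatch Logs client. The function name `run_insights_query`, the log group name, and the polling defaults are hypothetical.

```python
import time


def run_insights_query(logs_client, log_group, query, start, end,
                       poll_seconds=1.0, max_polls=60):
    """Run one Logs Insights query and return its rows as plain dicts.

    `start` and `end` are epoch seconds. `logs_client` is assumed to behave
    like boto3.client("logs").
    """
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=int(start),
        endTime=int(end),
        queryString=query,
    )["queryId"]

    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} cells.
            return [
                {cell["field"]: cell["value"] for cell in row}
                for row in response["results"]
            ]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not complete after {max_polls} polls")
```

Pass one query at a time, for example the top-rejections query with a one-hour `start`/`end` range. All values in the returned rows are strings, so convert counts and byte totals before doing arithmetic on them.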

## Important Rules

- **CloudWatch Logs Insights queries have a cost**: they are billed by the amount of log data scanned, so keep the time range and the set of log groups as narrow as possible
- **Region-specific**: metrics and logs are scoped to the configured region
- **Record in GAIT**: log monitoring investigations for the audit trail
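
One simple way to keep the time range small is to derive the query window from the incident time instead of using a broad default. This is a minimal sketch: the helper name `incident_window` and the 15-minute defaults are hypothetical, and naive datetimes are treated as UTC by assumption.

```python
from datetime import datetime, timedelta, timezone


def incident_window(incident_time, before_minutes=15, after_minutes=15):
    """Return (start, end) in epoch seconds bracketing an incident time."""
    if incident_time.tzinfo is None:
        # Assumption: naive datetimes are UTC, not local time.
        incident_time = incident_time.replace(tzinfo=timezone.utc)
    start = incident_time - timedelta(minutes=before_minutes)
    end = incident_time + timedelta(minutes=after_minutes)
    return int(start.timestamp()), int(end.timestamp())


# Example: a 30-minute window around a hypothetical incident at 12:00 UTC.
start, end = incident_window(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
print(start, end, end - start)
```

Epoch seconds match what the CloudWatch Logs `StartQuery` API expects for its start and end times.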

## Environment Variables

- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION` (or `AWS_PROFILE`)
