slack-voice-interface

Name: slack-voice-interface
Author: automateyournetwork/netclaw

$ npx mdskill add automateyournetwork/netclaw/slack-voice-interface

Respond to Slack voice clips with text and MP3 voice replies using edge-tts

Enables voice interaction in Slack by replying to voice messages with audio
Uses OpenClaw transcription and edge-tts for text-to-speech conversion
Processes transcribed text with NetClaw tools like pyATS, NetBox, and ServiceNow
Posts text response and uploads generated MP3 file to Slack thread

SKILL.md

.github/skills/slack-voice-interface

---
name: slack-voice-interface
description: "Respond to Slack voice clips with both text and an MP3 voice reply using edge-tts. Voice IN is already handled by OpenClaw transcription. Use when a user sends a voice message in Slack, you need to reply with audio, or you want to generate a spoken MP3 response."
license: Apache-2.0
user-invocable: true
metadata:
  { "openclaw": { "requires": { "bins": ["python3"], "env": ["TTS_MCP_SCRIPT", "MCP_CALL"] } } }
---

# Slack Voice Interface

## How It Works

```
User sends voice clip in Slack
    |
    v
OpenClaw transcribes automatically (built-in)
    |
    v
NetClaw processes with full skill set
(pyATS, NetBox, ServiceNow, all 40 MCP servers)
    |
    v
python3 $MCP_CALL "python3 -u $TTS_MCP_SCRIPT" text_to_speech → MP3 file
    |
    v
Upload MP3 to Slack thread + post text response
```

## Voice Response Workflow

### Step 1: Process the question

Treat the transcribed voice message identically to a typed text message.
Use the full NetClaw skill set — pyATS, NetBox, ServiceNow, etc.

### Step 2: Generate voice response

After composing your text response, call `text_to_speech`:

```bash
python3 $MCP_CALL "python3 -u $TTS_MCP_SCRIPT" text_to_speech '{"text":"R1 has 3 OSPF neighbors, all in FULL state on Area 0...","voice":"en-US-GuyNeural"}'
```

This returns JSON with an `output_path` to the generated MP3 file.

To list available voices:

```bash
python3 $MCP_CALL "python3 -u $TTS_MCP_SCRIPT" list_voices '{"language":"en"}'
```
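The two calls above can also be driven from Python, which avoids hand-escaping quotes in the JSON argument. This is a minimal sketch, not part of the skill itself: the helper names `build_tts_command`, `extract_output_path`, and `synthesize` are hypothetical, the 30-second timeout is an arbitrary choice, and it assumes the tool prints a single JSON object containing `output_path` on stdout.

```python
import json
import os
import subprocess


def build_tts_command(text, voice="en-US-GuyNeural"):
    """Build the argv list for the text_to_speech call shown above."""
    mcp_call = os.environ["MCP_CALL"]
    tts_script = os.environ["TTS_MCP_SCRIPT"]
    # json.dumps escapes quotes and newlines in the response text.
    payload = json.dumps({"text": text, "voice": voice})
    # The server command is passed as one argument, as in the shell example.
    return ["python3", mcp_call, f"python3 -u {tts_script}", "text_to_speech", payload]


def extract_output_path(stdout):
    """Return output_path from the tool's JSON reply (assumed shape)."""
    return json.loads(stdout)["output_path"]


def synthesize(text, voice="en-US-GuyNeural", timeout=30):
    """Run the TTS call and return the path of the generated MP3."""
    result = subprocess.run(
        build_tts_command(text, voice),
        capture_output=True,
        text=True,
        check=True,       # raise CalledProcessError on a non-zero exit
        timeout=timeout,  # raise TimeoutExpired rather than hang
    )
    return extract_output_path(result.stdout)
```

Passing an argv list instead of a shell string means the response text never goes through shell quoting, so apostrophes in the reply cannot break the command.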

### Step 3: Deliver both text and voice

Post the text response in the Slack thread AND upload the MP3 file:

> :loud_speaker: **Voice Response**
> [MP3 audio file attached]
>
> R1 has 3 OSPF neighbors, all in FULL state on Area 0:
> - 2.2.2.2 (R2) via Gi1 — FULL/DR
> - 3.3.3.3 (R3) via Gi2 — FULL/BDR

**Always deliver text AND voice.** Text is primary (searchable, accessible).
Voice is supplementary.

## Voice Selection

| Voice | Description |
|-------|-------------|
| en-US-GuyNeural | Professional male — **default** |
| en-US-JennyNeural | Professional female |
| en-US-AriaNeural | Conversational female |
| en-GB-RyanNeural | British male |

Users can request a voice change:
- "Switch to a female voice" → use en-US-JennyNeural
- "Use a British accent" → use en-GB-RyanNeural

Call `list_voices` to see all 300+ available voices.
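The mapping from a user request to a voice can be sketched as a small lookup. The voice names come from the table above; the keyword matching and the `pick_voice` helper are illustrative assumptions, not how the skill itself interprets requests.

```python
DEFAULT_VOICE = "en-US-GuyNeural"

# (keyword, voice) pairs based on the table above. Order matters:
# "conversational female" should resolve to Aria before the generic "female".
VOICE_KEYWORDS = [
    ("british", "en-GB-RyanNeural"),
    ("conversational", "en-US-AriaNeural"),
    ("female", "en-US-JennyNeural"),
]


def pick_voice(request, current=DEFAULT_VOICE):
    """Return the voice matching the request, or keep the current one."""
    lowered = request.lower()
    for keyword, voice in VOICE_KEYWORDS:
        if keyword in lowered:
            return voice
    return current
```

A request that mentions no known keyword leaves the current voice unchanged, so an ordinary network question never switches voices by accident.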

## Performance

| Phase | Latency |
|-------|---------|
| edge-tts synthesis | 1-2 seconds |
| Slack MP3 upload | < 1 second |

Together, synthesis and upload add roughly 1 to 3 seconds to the response time.

## Fallback

If TTS fails, deliver the text response immediately. Do not block on voice.
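One way to honor this rule is to post the text before attempting synthesis, so a TTS failure can only cost the audio attachment. The sketch below assumes three caller-supplied callables (`synthesize`, `post_text`, `upload_audio`); these names are hypothetical stand-ins for whatever Slack and TTS plumbing is in use.

```python
import sys


def deliver(text, synthesize, post_text, upload_audio):
    """Post text first, then attempt voice; TTS never blocks the reply.

    Returns True if the MP3 was uploaded, False if only text was delivered.
    """
    post_text(text)  # the primary response goes out unconditionally
    try:
        mp3_path = synthesize(text)
        upload_audio(mp3_path)
    except Exception as exc:
        # Any synthesis or upload error is logged and swallowed.
        print(f"voice reply skipped: {exc}", file=sys.stderr)
        return False
    return True
```

Because the text is already in the thread when synthesis starts, the user sees an answer even if the voice step times out or raises.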

## Tips for Voice Responses

- **Keep it concise** — under 100 words works best for spoken delivery
- **Avoid tables** — describe data conversationally for voice
- **Write abbreviations normally**: use "OSPF", not "O-S-P-F" (edge-tts handles this)
- **Use natural phrasing** — the text will be read aloud, so write for the ear
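These tips can be turned into a quick pre-flight check on the text before it is sent to TTS. This is a rough sketch: the `spoken_text_warnings` helper and its heuristics (pipe-prefixed lines as table rows, hyphen-separated capitals as letter-by-letter spelling) are assumptions for illustration only.

```python
import re

MAX_WORDS = 100  # from the "under 100 words" tip above


def spoken_text_warnings(text):
    """Return a list of warnings for text that may not read well aloud."""
    warnings = []
    word_count = len(text.split())
    if word_count >= MAX_WORDS:
        warnings.append(f"{word_count} words; aim for under {MAX_WORDS}")
    # A line starting with "|" is treated as a Markdown table row.
    if any(line.lstrip().startswith("|") for line in text.splitlines()):
        warnings.append("contains a table row; describe the data in a sentence")
    # Matches forms such as "O-S-P-F" (three or more hyphen-separated capitals).
    if re.search(r"\b(?:[A-Z]-){2,}[A-Z]\b", text):
        warnings.append("abbreviation spelled letter by letter; write it normally")
    return warnings
```

An empty list means none of the heuristics fired; it does not guarantee the text sounds natural.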

## GAIT Integration

Record voice interactions in the GAIT audit trail:

```
Input: Voice clip from @user (transcript: "What are your interfaces?")
Action: Queried R1 interfaces via pyATS
Output: 4 interfaces found — text + voice response delivered to Slack
```
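A record in the shape shown above can be assembled with a small formatter. This sketch only builds the string; how the record is submitted to GAIT is not covered here, and `format_gait_entry` is a hypothetical helper name.

```python
def format_gait_entry(user, transcript, action, output):
    """Build a three-line audit record in the Input/Action/Output shape."""
    return "\n".join([
        f'Input: Voice clip from @{user} (transcript: "{transcript}")',
        f"Action: {action}",
        f"Output: {output}",
    ])
```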
