voice-setup
$
npx mdskill add vellum-ai/vellum-assistant/voice-setupYou are helping the user set up and troubleshoot voice features (push-to-talk, text-to-speech) entirely within this conversation. Do NOT direct the user to the Settings page for initial setup - handle everything in-chat using the tools below.
SKILL.md
.github/skills/voice-setupView on GitHub ↗
---
name: voice-setup
description: Complete voice configuration in chat - PTT key, microphone permissions, ElevenLabs TTS, and troubleshooting
compatibility: "Designed for Vellum personal assistants"
metadata:
icon: assets/icon.svg
emoji: "🎙️"
vellum:
category: "voice"
display-name: "Voice Setup"
includes: ["elevenlabs-voice"]
activation-hints:
- "Guided setup or troubleshooting (walkthrough, PTT not working, mic issues, ElevenLabs/TTS)"
- "Simple voice setting changes (PTT key, wake word) -> use voice_config_update directly"
avoid-when:
- 'If "voice" is in a Twilio/phone context, load phone-calls instead'
---
You are helping the user set up and troubleshoot voice features (push-to-talk, text-to-speech) entirely within this conversation. Do NOT direct the user to the Settings page for initial setup - handle everything in-chat using the tools below.
## Available Tools
- `voice_config_update` - Change any voice setting (PTT key, conversation timeout, TTS voice ID)
- `open_system_settings` - Open macOS System Settings to a specific privacy pane
- `navigate_settings_tab` - Open the Vellum settings panel to the Voice tab
- `credential_store` - Collect API keys securely (for ElevenLabs TTS)
## Setup Flow
Walk the user through each section in order. Skip sections they don't need. Ask before proceeding to the next section.
### 1. Microphone Permission
Check `<channel_capabilities>` for `microphone_permission_granted`.
**If `false` or missing:**
1. Explain that macOS requires microphone permission for voice features.
2. Use `open_system_settings` with `pane: "microphone"` to open the right System Settings pane.
3. Tell the user: "I've opened System Settings to the Microphone section. Please toggle **Vellum Assistant** on, then come back here."
4. After they confirm, verify by checking capabilities on the next turn.
**If `true`:** Tell them microphone is already granted and move on.
### 2. Push-to-Talk Activation Key
Present common PTT key options:
- **Right Option** - Default, good general choice
- **Fn** - Dedicated key on most Mac keyboards
- **Right Command** - Easy to reach
- **Right Control** - Familiar from gaming
Ask which key they prefer, then use `voice_config_update` with `setting: "activation_key"` and the chosen value.
**Common issues to mention:**
- If they pick a key that conflicts with their emoji picker (Fn or Globe on newer Macs), warn them and suggest an alternative.
- If they use a terminal app heavily, warn that some keys may be captured by the terminal.
### 3. Text-to-Speech / ElevenLabs (Optional)
Ask if they want high-quality text-to-speech voices via ElevenLabs (optional - standard TTS works without it).
If yes, the included **ElevenLabs Voice** skill (automatically appended below via `includes`) provides the full setup flow: curated voice list, API key collection, advanced voice selection, and tuning parameters. Follow the instructions there.
Note: The config key `services.tts.providers.elevenlabs.voiceId` controls the voice for both in-app TTS and phone calls. If the user sets up phone calls later, they will automatically use the same voice for a consistent experience.
### 4. Verification
After setup is complete:
1. Summarize what was configured.
2. Suggest they test by pressing their PTT key and speaking.
3. Offer to open the Voice settings tab if they want to review: use `navigate_settings_tab` with `tab: "Voice"`.
## Troubleshooting Decision Trees
When the user reports a problem, follow the appropriate decision tree:
### "PTT isn't working" / "Can't record"
1. **Microphone permission** - Check `microphone_permission_granted` in capabilities. If false, guide through granting it.
2. **Key check** - Ask what key they're using. Confirm it matches their configured PTT key.
3. **Emoji picker conflict** - On newer Macs, Fn/Globe opens the emoji picker. If they're using Fn, suggest switching to Right Option or Right Command.
4. **Speech Recognition permission** - Some voice features need this. Use `open_system_settings` with `pane: "speech_recognition"`.
5. **App focus** - PTT may not work when Vellum is not the frontmost app or if another app has captured the key.
### "Recording but no text" / "Transcription not working"
1. **Speech Recognition permission** - Must be granted for transcription.
2. **Microphone input** - Ask if they see the recording indicator. If yes, the mic works but transcription is failing.
3. **Locale/language** - Speech recognition works best with the system language. Ask if they're speaking in a different language.
4. **Background noise** - Excessive noise can prevent transcription. Suggest a quieter environment or a closer microphone.
### "Changed a setting but it didn't work"
1. **Event broadcast** - The setting should take effect immediately. If it didn't, suggest restarting the assistant.
2. **Verify** - Open the Voice settings tab with `navigate_settings_tab` to confirm the setting was persisted.
## Deep Debugging
For persistent issues, suggest checking system logs:
```bash
log stream --predicate 'subsystem == "com.vellum.assistant"' --level debug
```
Key log categories:
- `voice` - PTT activation, recording state
- `speech` - Speech recognition results
## Rules
- Always handle setup conversationally in-chat. Do NOT tell the user to go to Settings for initial configuration.
- Use `navigate_settings_tab` only for review/verification after in-chat setup, not as the primary setup method.
- Be concise. Don't explain every option exhaustively - present the most common choices and let the user ask for more.
- If a permission is denied, acknowledge it gracefully and explain what features won't work without it.