ClawSkills logoClawSkills

ElevenLabs Voices

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

Introduction

# ElevenLabs Voice Personas v2.1

Comprehensive voice synthesis toolkit using ElevenLabs API.

## πŸš€ First Run - Setup Wizard

When you first use this skill (no `config.json` exists), run the interactive setup wizard:

```bash python3 scripts/setup.py ```

The wizard will guide you through: 1. **API Key** - Enter your ElevenLabs API key (required) 2. **Default Voice** - Choose from popular voices (Rachel, Adam, Bella, etc.) 3. **Language** - Set your preferred language (32 supported) 4. **Audio Quality** - Standard or high quality output 5. **Cost Tracking** - Enable usage and cost monitoring 6. **Budget Limit** - Optional monthly spending cap

**πŸ”’ Privacy:** Your API key is stored locally in `config.json` only. It never leaves your machine and is automatically excluded from git via `.gitignore`.

To reconfigure at any time, simply run the setup wizard again.

---

## ✨ Features

- **18 Voice Personas** - Carefully curated voices for different use cases - **32 Languages** - Multi-language synthesis with the multilingual v2 model - **Streaming Mode** - Real-time audio output as it generates - **Sound Effects (SFX)** - AI-generated sound effects from text prompts - **Batch Processing** - Process multiple texts in one go - **Cost Tracking** - Monitor character usage and estimated costs - **Voice Design** - Create custom voices from descriptions - **Pronunciation Dictionary** - Custom word pronunciation rules - **OpenClaw Integration** - Works with OpenClaw's built-in TTS

---

## πŸŽ™οΈ Available Voices

| Voice | Accent | Gender | Persona | Best For | |-------|--------|--------|---------|----------| | rachel | πŸ‡ΊπŸ‡Έ US | female | warm | Conversations, tutorials | | adam | πŸ‡ΊπŸ‡Έ US | male | narrator | Documentaries, audiobooks | | bella | πŸ‡ΊπŸ‡Έ US | female | professional | Business, presentations | | brian | πŸ‡ΊπŸ‡Έ US | male | comforting | Meditation, calm content | | george | πŸ‡¬πŸ‡§ UK | male | storyteller | Audiobooks, storytelling | | alice | πŸ‡¬πŸ‡§ UK | female | educator | Tutorials, explanations | | callum | πŸ‡ΊπŸ‡Έ US | male | trickster | Playful, gaming | | charlie | πŸ‡¦πŸ‡Ί AU | male | energetic | Sports, motivation | | jessica | πŸ‡ΊπŸ‡Έ US | female | playful | Social media, casual | | lily | πŸ‡¬πŸ‡§ UK | female | actress | Drama, elegant content | | matilda | πŸ‡ΊπŸ‡Έ US | female | professional | Corporate, news | | river | πŸ‡ΊπŸ‡Έ US | neutral | neutral | Inclusive, informative | | roger | πŸ‡ΊπŸ‡Έ US | male | casual | Podcasts, relaxed | | daniel | πŸ‡¬πŸ‡§ UK | male | broadcaster | News, announcements | | eric | πŸ‡ΊπŸ‡Έ US | male | trustworthy | Business, corporate | | chris | πŸ‡ΊπŸ‡Έ US | male | friendly | Tutorials, approachable | | will | πŸ‡ΊπŸ‡Έ US | male | optimist | Motivation, uplifting | | liam | πŸ‡ΊπŸ‡Έ US | male | social | YouTube, social media |

## 🎯 Quick Presets

- `default` β†’ rachel (warm, friendly) - `narrator` β†’ adam (documentaries) - `professional` β†’ matilda (corporate) - `storyteller` β†’ george (audiobooks) - `educator` β†’ alice (tutorials) - `calm` β†’ brian (meditation) - `energetic` β†’ liam (social media) - `trustworthy` β†’ eric (business) - `neutral` β†’ river (inclusive) - `british` β†’ george - `australian` β†’ charlie - `broadcaster` β†’ daniel (news)

---

## 🌍 Supported Languages (32)

The multilingual v2 model supports these languages:

| Code | Language | Code | Language | |------|----------|------|----------| | en | English | pl | Polish | | de | German | nl | Dutch | | es | Spanish | sv | Swedish | | fr | French | da | Danish | | it | Italian | fi | Finnish | | pt | Portuguese | no | Norwegian | | ru | Russian | tr | Turkish | | uk | Ukrainian | cs | Czech | | ja | Japanese | sk | Slovak | | ko | Korean | hu | Hungarian | | zh | Chinese | ro | Romanian | | ar | Arabic | bg | Bulgarian | | hi | Hindi | hr | Croatian | | ta | Tamil | el | Greek | | id | Indonesian | ms | Malay | | vi | Vietnamese | th | Thai |

```bash # Synthesize in German python3 tts.py --text "Guten Tag!" --voice rachel --lang de

# Synthesize in French python3 tts.py --text "Bonjour le monde!" --voice adam --lang fr

# List all languages python3 tts.py --languages ```

---

## πŸ’» CLI Usage

### Basic Text-to-Speech

```bash # List all voices python3 scripts/tts.py --list

# Generate speech python3 scripts/tts.py --text "Hello world" --voice rachel --output hello.mp3

# Use a preset python3 scripts/tts.py --text "Breaking news..." --voice broadcaster --output news.mp3

# Multi-language python3 scripts/tts.py --text "Bonjour!" --voice rachel --lang fr --output french.mp3 ```

### Streaming Mode

Generate audio with real-time streaming (good for long texts):

```bash # Stream audio as it generates python3 scripts/tts.py --text "This is a long story..." --voice adam --stream

# Streaming with custom output python3 scripts/tts.py --text "Chapter one..." --voice george --stream --output chapter1.mp3 ```

### Batch Processing

Process multiple texts from a file:

```bash # From newline-separated text file python3 scripts/tts.py --batch texts.txt --voice rachel --output-dir ./audio

# From JSON file python3 scripts/tts.py --batch batch.json --output-dir ./output ```

**JSON batch format:** ```json [ {"text": "First line", "voice": "rachel", "output": "line1.mp3"}, {"text": "Second line", "voice": "adam", "output": "line2.mp3"}, {"text": "Third line"} ] ```

**Simple text format (one per line):** ``` Hello, this is the first sentence. This is the second sentence. And this is the third. ```

### Usage Statistics

```bash # Show usage stats and cost estimates python3 scripts/tts.py --stats

# Reset statistics python3 scripts/tts.py --reset-stats ```

---

## 🎡 Sound Effects (SFX)

Generate AI-powered sound effects from text descriptions:

```bash # Generate a sound effect python3 scripts/sfx.py --prompt "Thunder rumbling in the distance"

# With specific duration (0.5-22 seconds) python3 scripts/sfx.py --prompt "Cat meowing" --duration 3 --output cat.mp3

# Adjust prompt influence (0.0-1.0) python3 scripts/sfx.py --prompt "Footsteps on gravel" --influence 0.5

# Batch SFX generation python3 scripts/sfx.py --batch sounds.json --output-dir ./sfx

# Show prompt examples python3 scripts/sfx.py --examples ```

**Example prompts:** - "Thunder rumbling in the distance" - "Cat purring contentedly" - "Typing on a mechanical keyboard" - "Spaceship engine humming" - "Coffee shop background chatter"

---

## 🎨 Voice Design

Create custom voices from text descriptions:

```bash # Basic voice design python3 scripts/voice-design.py --gender female --age middle_aged --accent american \ --description "A warm, motherly voice"

# With custom preview text python3 scripts/voice-design.py --gender male --age young --accent british \ --text "Welcome to the adventure!" --output preview.mp3

# Save to your ElevenLabs library python3 scripts/voice-design.py --gender female --age young --accent american \ --description "Energetic podcast host" --save "MyHost"

# List all design options python3 scripts/voice-design.py --options ```

**Voice Design Options:**

| Option | Values | |--------|--------| | Gender | male, female, neutral | | Age | young, middle_aged, old | | Accent | american, british, african, australian, indian, latin, middle_eastern, scandinavian, eastern_european | | Accent Strength | 0.3-2.0 (subtle to strong) |

---

## πŸ“– Pronunciation Dictionary

Customize how words are pronounced:

Edit `pronunciations.json`: ```json { "rules": [ { "word": "OpenClaw", "replacement": "Open Claw", "comment": "Pronounce as two words" }, { "word": "API", "replacement": "A P I", "comment": "Spell out acronym" } ] } ```

Usage: ```bash # Pronunciations are applied automatically python3 scripts/tts.py --text "The OpenClaw API is great" --voice rachel

# Disable pronunciations python3 scripts/tts.py --text "The API is great" --voice rachel --no-pronunciations ```

---

## πŸ’° Cost Tracking

The skill tracks your character usage and estimates costs:

```bash python3 scripts/tts.py --stats ```

**Output:** ``` πŸ“Š ElevenLabs Usage Statistics

Total Characters: 15,230 Total Requests: 42 Since: 2024-01-15

πŸ’° Estimated Costs: Starter $4.57 ($0.30/1k chars) Creator $3.66 ($0.24/1k chars) Pro $2.74 ($0.18/1k chars) Scale $1.68 ($0.11/1k chars) ```

---

## πŸ€– OpenClaw TTS Integration

### Using with OpenClaw's Built-in TTS

OpenClaw has built-in TTS support that can use ElevenLabs. Configure in `~/.openclaw/openclaw.json`:

```json { "tts": { "enabled": true, "provider": "elevenlabs", "elevenlabs": { "apiKey": "your-api-key-here", "voice": "rachel", "model": "eleven_multilingual_v2" } } } ```

### Triggering TTS in Chat

In OpenClaw conversations: - Use `/tts on` to enable automatic TTS - Use the `tts` tool directly for one-off speech - Request "read this aloud" or "speak this"

### Using Skill Scripts from OpenClaw

```bash # OpenClaw can run these scripts directly exec python3 /path/to/skills/elevenlabs-voices/scripts/tts.py --text "Hello" --voice rachel ```

---

## βš™οΈ Configuration

The scripts look for API key in this order:

1. `ELEVEN_API_KEY` or `ELEVENLABS_API_KEY` environment variable 2. OpenClaw config (`~/.openclaw/openclaw.json` β†’ tts.elevenlabs.apiKey) 3. Skill-local `.env` file

**Create .env file:** ```bash echo 'ELEVEN_API_KEY=your-key-here' > .env ```

---

## πŸŽ›οΈ Voice Settings

Each voice has tuned settings for optimal output:

| Setting | Range | Description | |---------|-------|-------------| | stability | 0.0-1.0 | Higher = consistent, lower = expressive | | similarity_boost | 0.0-1.0 | How closely to match original voice | | style | 0.0-1.0 | Exaggeration of speaking style |

---

## πŸ“ Triggers

- "use {voice_name} voice" - "speak as {persona}" - "list voices" - "voice settings" - "generate sound effect" - "design a voice"

---

## πŸ“ Files

``` elevenlabs-voices/ β”œβ”€β”€ SKILL.md # This documentation β”œβ”€β”€ README.md # Quick start guide β”œβ”€β”€ config.json # Your local config (created by setup, in .gitignore) β”œβ”€β”€ voices.json # Voice definitions & settings β”œβ”€β”€ pronunciations.json # Custom pronunciation rules β”œβ”€β”€ examples.md # Detailed usage examples β”œβ”€β”€ scripts/ β”‚ β”œβ”€β”€ setup.py # Interactive setup wizard β”‚ β”œβ”€β”€ tts.py # Main TTS script β”‚ β”œβ”€β”€ sfx.py # Sound effects generator β”‚ └── voice-design.py # Voice design tool └── references/ └── voice-guide.md # Voice selection guide ```

---

## πŸ”— Links

- [ElevenLabs](https://elevenlabs.io) - [API Documentation](https://docs.elevenlabs.io) - [Voice Library](https://elevenlabs.io/voice-library) - [Sound Effects API](https://elevenlabs.io/docs/api-reference/sound-generation) - [Voice Design API](https://elevenlabs.io/docs/api-reference/voice-generation)

---

## πŸ“‹ Changelog

### v2.1.0 - Added interactive setup wizard (`scripts/setup.py`) - Onboarding guides through API key, voice, language, quality, and budget settings - Config stored locally in `config.json` (added to `.gitignore`) - Professional, privacy-focused setup experience

### v2.0.0 - Added 32 language support with `--lang` parameter - Added streaming mode with `--stream` flag - Added sound effects generation (`sfx.py`) - Added batch processing with `--batch` flag - Added cost tracking with `--stats` flag - Added voice design tool (`voice-design.py`) - Added pronunciation dictionary support - Added OpenClaw TTS integration documentation - Improved error handling and progress output

More Products