Send voice messages to your AI assistant and have it talk back.


# ElevenLabs Speech

A complete voice solution that provides both TTS and STT through a single API:

- **TTS**: Text-to-Speech (high-quality voice synthesis)
- **STT**: Speech-to-Text (accurate transcription via Scribe)

## Quick Start

### Environment Setup

Set your API key:

```bash
export ELEVENLABS_API_KEY="sk_..."
```

Alternatively, create a `.env` file in the workspace root.
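How the scripts read the key is not shown here, so the following is only a minimal sketch of one plausible lookup order: the environment variable first, then a `.env` file. The `load_api_key` helper and the assumption that `.env` holds a line such as `ELEVENLABS_API_KEY="sk_..."` are illustrative, not part of the skill.

```python
import os
from pathlib import Path


def load_api_key(env_file=".env", var="ELEVENLABS_API_KEY"):
    """Return the API key from the environment, falling back to a .env file.

    Hypothetical helper: the real scripts may resolve the key differently.
    """
    key = os.environ.get(var)
    if key:
        return key
    path = Path(env_file)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            name = name.strip()
            # Tolerate shell-style "export NAME=value" lines.
            if name.startswith("export "):
                name = name[len("export "):].strip()
            if name == var:
                return value.strip().strip("'\"")
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing key is fatal.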

### Text-to-Speech (TTS)

Convert text to natural-sounding speech:

```bash
python scripts/elevenlabs_speech.py tts -t "Hello world" -o greeting.mp3
```

Use a custom voice:

```bash
python scripts/elevenlabs_speech.py tts -t "Hello" -v "voice_id_here" -o output.mp3
```

### List Available Voices

```bash
python scripts/elevenlabs_speech.py voices
```

## Using in Code

```python
from scripts.elevenlabs_speech import ElevenLabsClient

client = ElevenLabsClient(api_key="sk_...")

# Basic TTS
result = client.text_to_speech(
    text="Hello from zerox",
    output_path="greeting.mp3"
)

# With custom settings
result = client.text_to_speech(
    text="Your text here",
    voice_id="21m00Tcm4TlvDq8ikWAM",  # Rachel
    stability=0.5,
    similarity_boost=0.75,
    output_path="output.mp3"
)

# Get available voices
voices = client.get_voices()
for voice in voices['voices']:
    print(f"{voice['name']}: {voice['voice_id']}")
```
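The internals of `ElevenLabsClient` are not shown in this document. As a rough sketch of what a call like `text_to_speech` presumably amounts to, the code below builds a request against the public ElevenLabs REST endpoint using only the standard library. The `build_tts_request` and `synthesize` helpers are hypothetical names, and the wrapper may well be implemented differently.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, text, voice_id="21m00Tcm4TlvDq8ikWAM",
                      model_id="eleven_turbo_v2_5",
                      stability=0.5, similarity_boost=0.75):
    """Build (but do not send) a POST request for the text-to-speech endpoint."""
    payload = {
        "text": text,
        "model_id": model_id,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(api_key, text, output_path, **kwargs):
    """Send the request and write the returned audio bytes to output_path."""
    request = build_tts_request(api_key, text, **kwargs)
    with urllib.request.urlopen(request) as response, open(output_path, "wb") as out:
        out.write(response.read())
    return output_path
```

Keeping request construction separate from sending makes the payload easy to inspect without spending API quota.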

## Popular Voices

| Voice ID | Name | Description |
|----------|------|-------------|
| `21m00Tcm4TlvDq8ikWAM` | Rachel | Natural, versatile (default) |
| `AZnzlk1XvdvUeBnXmlld` | Domi | Strong, energetic |
| `EXAVITQu4vr4xnSDxMaL` | Bella | Soft, soothing |
| `ErXwobaYiN019PkySvjV` | Antoni | Well-rounded |
| `MF3mGyEYCl7XYWbV9V6O` | Elli | Warm, friendly |
| `TxGEqnHWrfWFTfGW9XjX` | Josh | Deep, calm |
| `VR6AewLTigWG4xSOukaG` | Arnold | Authoritative |

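## Voice Settings

- **stability** (0-1): lower values sound more expressive, higher values sound more consistent
- **similarity_boost** (0-1): higher values stay closer to the original voice

Defaults: stability=0.5, similarity_boost=0.75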
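Since both settings are bounded to the 0-1 range, it can help to validate them before making a call. The sketch below merges caller overrides onto the defaults listed above and rejects out-of-range values. `make_voice_settings` is a hypothetical helper, not something the skill provides.

```python
DEFAULT_VOICE_SETTINGS = {"stability": 0.5, "similarity_boost": 0.75}


def make_voice_settings(stability=None, similarity_boost=None):
    """Return a settings dict, applying defaults and checking the 0-1 range."""
    settings = dict(DEFAULT_VOICE_SETTINGS)
    overrides = {"stability": stability, "similarity_boost": similarity_boost}
    for name, value in overrides.items():
        if value is None:
            continue  # keep the default for this setting
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        settings[name] = float(value)
    return settings


# A more expressive voice: lower stability, default similarity_boost.
expressive = make_voice_settings(stability=0.3)
```

The resulting dict can be unpacked into a call such as `client.text_to_speech(text=..., output_path=..., **expressive)`.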

## Models

- `eleven_turbo_v2_5` - fast, high quality (default)
- `eleven_multilingual_v2` - best for non-English languages
- `eleven_monolingual_v1` - English only
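One way to apply the guidance above is to pick the model from the language of the text. This is only an illustrative sketch: `choose_model` is a hypothetical helper, and whether the wrapper script accepts a model argument is not stated in this document.

```python
DEFAULT_MODEL = "eleven_turbo_v2_5"
MULTILINGUAL_MODEL = "eleven_multilingual_v2"


def choose_model(language_code=None):
    """Pick a model ID from a language code such as "eng" or "ara"."""
    # Unknown or English text stays on the fast default model.
    if language_code is None or language_code.lower() in {"en", "eng"}:
        return DEFAULT_MODEL
    # Anything else goes to the model recommended for non-English languages.
    return MULTILINGUAL_MODEL
```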

## Telegram Integration

When a user sends text and wants a voice reply:

```python
# Generate speech
result = client.text_to_speech(text=user_text, output_path="reply.mp3")

# Send via Telegram message tool with media path
message(action="send", media="path/to/reply.mp3", as_voice=True)
```

## Pricing

See https://elevenlabs.io/pricing for current rates. A free tier is available.

## Speech-to-Text (STT) with ElevenLabs Scribe

Transcribe voice messages with ElevenLabs Scribe:

### Transcribe Audio

```bash
python scripts/elevenlabs_scribe.py voice_message.ogg
```

Specify a language:

```bash
python scripts/elevenlabs_scribe.py voice_message.ogg --language ara
```

Use speaker diarization (multiple speakers):

```bash
python scripts/elevenlabs_scribe.py voice_message.ogg --speakers 2
```

### Using in Code

```python
from scripts.elevenlabs_scribe import ElevenLabsScribe

client = ElevenLabsScribe(api_key="sk_...")

# Basic transcription
result = client.transcribe("voice_message.ogg")
print(result['text'])

# With language hint (improves accuracy)
result = client.transcribe("voice_message.ogg", language_code="ara")

# With speaker detection
result = client.transcribe("voice_message.ogg", num_speakers=2)
```
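With speaker detection enabled, the transcript usually needs to be split into turns. The sketch below assumes the result carries a `words` list whose entries have `text` and `speaker_id` fields, with spaces as their own entries. This document only confirms the `text` key, so treat that shape as an assumption to verify against a real response. `group_by_speaker` is a hypothetical helper.

```python
def group_by_speaker(result):
    """Collapse a word-level transcript into (speaker, text) turns.

    Assumes result["words"] is a list of dicts with "text" and "speaker_id".
    """
    turns = []
    for word in result.get("words", []):
        speaker = word.get("speaker_id", "unknown")
        text = word.get("text", "")
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)  # same speaker keeps talking
        else:
            turns.append((speaker, [text]))  # speaker changed, start a new turn
    return [(speaker, "".join(parts).strip()) for speaker, parts in turns]


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "text": "Hi there Hello",
    "words": [
        {"text": "Hi", "speaker_id": "speaker_0"},
        {"text": " ", "speaker_id": "speaker_0"},
        {"text": "there", "speaker_id": "speaker_0"},
        {"text": "Hello", "speaker_id": "speaker_1"},
    ],
}
for speaker, text in group_by_speaker(sample):
    print(f"{speaker}: {text}")
```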

### Supported Formats

- mp3, mp4, mpeg, mpga, m4a, wav, webm
- Maximum file size: 100 MB
- Works well with Telegram voice messages (`.ogg`)
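The limits above can be checked locally before uploading, which avoids a wasted request. This is a minimal sketch: `check_audio_file` is a hypothetical helper, `.ogg` is included because of the Telegram note, and "100 MB" is assumed to mean 100 × 1024 × 1024 bytes.

```python
import os

SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm", ".ogg"}
MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" interpreted as 100 MiB


def check_audio_file(path):
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if not os.path.isfile(path):
        problems.append("file not found")
    elif os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file exceeds 100 MB")
    return problems
```

Returning every problem at once, rather than raising on the first, lets the caller report them together to the user.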

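### Language Support

Scribe supports 99 languages, including:

- Arabic (`ara`)
- English (`eng`)
- Spanish (`spa`)
- French (`fra`)
- And many more...

If no language hint is provided, the language is detected automatically.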

## Complete Workflow Example

**User sends a voice message → you reply with voice:**

```python
from scripts.elevenlabs_scribe import ElevenLabsScribe
from scripts.elevenlabs_speech import ElevenLabsClient

# 1. Transcribe user's voice message
stt = ElevenLabsScribe()
transcription = stt.transcribe("user_voice.ogg")
user_text = transcription['text']

# 2. Process/understand the text
# ... your logic here ...

# 3. Generate response text
response_text = "Your response here"

# 4. Convert to speech
tts = ElevenLabsClient()
tts.text_to_speech(response_text, output_path="reply.mp3")

# 5. Send voice reply
message(action="send", media="reply.mp3", as_voice=True)
```
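The same steps can be wrapped in one function that takes the STT and TTS clients as arguments, so the flow can be exercised offline. The sketch below is an assumption-laden illustration: `reply_to_voice`, `FakeScribe`, and `FakeSpeech` are hypothetical names, and the fakes only mimic the `transcribe` and `text_to_speech` calls used above.

```python
def reply_to_voice(audio_path, stt, tts, respond, reply_path="reply.mp3"):
    """Transcribe audio_path, build a reply with respond(), and synthesize it."""
    transcription = stt.transcribe(audio_path)
    user_text = transcription["text"].strip()
    if not user_text:
        return None  # nothing was recognized, so there is nothing to answer
    tts.text_to_speech(respond(user_text), output_path=reply_path)
    return reply_path


class FakeScribe:
    """Offline stand-in that returns a fixed transcript."""

    def transcribe(self, path):
        return {"text": "what time is it"}


class FakeSpeech:
    """Offline stand-in that records what it was asked to synthesize."""

    def __init__(self):
        self.calls = []

    def text_to_speech(self, text, output_path):
        self.calls.append((text, output_path))


speech = FakeSpeech()
path = reply_to_voice("user_voice.ogg", FakeScribe(), speech, lambda t: f"You said: {t}")
print(path, speech.calls)
```

In real use, pass `ElevenLabsScribe()` and `ElevenLabsClient()` in place of the fakes, then send the returned path with the message tool.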

## Pricing

See https://elevenlabs.io/pricing for current rates:

**TTS (Text-to-Speech):**

- Free tier: 10,000 characters per month
- Paid plans available

**STT (Speech-to-Text) - Scribe:**

- Free tier available
- See the website for current pricing
