# voice-transcribe

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model.

## When to use

When you receive a voice memo (especially via WhatsApp), just run:

```bash
uv run /Users/darin/clawd/skills/voice-transcribe/transcribe <audio-file>
```

Then reply based on the transcribed content.

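## Fixing transcription errors

If Darin says a word was transcribed incorrectly, add it to `vocab.txt` (as a hint to the model) or to `replacements.txt` (as a forced fix). See the sections below.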

## Supported formats

- mp3, mp4, mpeg, mpga, m4a, wav, webm, ogg, opus
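
A quick pre-flight check against this list can avoid sending an unsupported file to the API. The following is a minimal sketch, not part of the skill itself: the function name `is_supported` and the case-insensitive comparison are assumptions made for illustration.

```python
from pathlib import Path

# Extensions copied from the list above; the check itself is hypothetical.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "ogg", "opus"}


def is_supported(audio_path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    suffix = Path(audio_path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

This only inspects the file name, so a mislabeled file would still pass the check.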

## Examples

```bash
# transcribe a voice memo
transcribe /tmp/voice-memo.ogg

# pipe to other tools
transcribe /tmp/memo.ogg | pbcopy
```

## Setup

1. Add your OpenAI API key to `/Users/darin/clawd/skills/voice-transcribe/.env`:

   ```
   OPENAI_API_KEY=sk-...
   ```

## Custom vocabulary

Add words to `vocab.txt` (one per line) to help the model recognize names and terms:

```
Clawdis
Clawdbot
```
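
How the script passes these words to the model is not documented here. One plausible approach, sketched below, is to join them into a single hint string, since OpenAI's transcription endpoint accepts an optional `prompt` parameter. The helper names `load_vocab` and `build_prompt` and the comma-separated format are assumptions for illustration.

```python
from pathlib import Path


def load_vocab(vocab_path: str) -> list[str]:
    """Read one term per line, skipping blank lines; a missing file yields no terms."""
    path = Path(vocab_path)
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_prompt(terms: list[str]) -> str:
    """Join the terms into one hint string (the separator is an assumption)."""
    return ", ".join(terms)
```

Because the prompt is only a hint, the model may still misspell a term, which is where forced replacements come in.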

## Text replacements

If the model still gets something wrong, add a replacement rule to `replacements.txt`:

```
wrong spelling -> correct spelling
```
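
To make the rule format concrete, here is a minimal sketch of how such a file could be parsed and applied to a transcript. It assumes one rule per line, a literal `->` separator, and case-sensitive substring matching; the actual script may behave differently, and the function names are hypothetical.

```python
def parse_rules(text: str) -> list[tuple[str, str]]:
    """Parse 'wrong -> correct' lines into (wrong, correct) pairs, skipping other lines."""
    rules = []
    for line in text.splitlines():
        if "->" not in line:
            continue  # blank or malformed line
        wrong, correct = line.split("->", 1)
        wrong, correct = wrong.strip(), correct.strip()
        if wrong:
            rules.append((wrong, correct))
    return rules


def apply_rules(transcript: str, rules: list[tuple[str, str]]) -> str:
    """Apply each rule in file order as a plain substring replacement."""
    for wrong, correct in rules:
        transcript = transcript.replace(wrong, correct)
    return transcript
```

Rules run in file order, so a later rule can act on the output of an earlier one.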

## Notes

- Assumes English (no language detection)
- Uses the gpt-4o-mini-transcribe model exclusively
- Caches results by the sha256 of the audio file
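
The sha256-based caching can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual code: the cache layout (one `.txt` file per digest), the `cache_dir` argument, and the `transcribe_fn` callback are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable


def file_sha256(audio_path: str) -> str:
    """Hash the file contents in chunks so large recordings need not fit in memory."""
    digest = hashlib.sha256()
    with open(audio_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_transcribe(audio_path: str, cache_dir: str,
                      transcribe_fn: Callable[[str], str]) -> str:
    """Return the cached transcript for identical audio bytes, else transcribe and store it."""
    cache_file = Path(cache_dir) / f"{file_sha256(audio_path)}.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = transcribe_fn(audio_path)  # hypothetical callback that calls the model
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Keying on the content hash rather than the file name means a renamed or re-downloaded copy of the same memo reuses the cached transcript.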