# Swarm — Cut Your LLM Costs by 200x

**Turn your expensive model into an affordable daily workhorse. Offload the tedious work (parallel, batch, research) to Gemini Flash workers at a fraction of the cost.**

## At a Glance

| 30 tasks via | Time | Cost |
|--------------|------|------|
| Opus (sequential) | ~30s | ~$0.50 |
| Swarm (parallel) | ~1s | ~$0.003 |

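## When to Use

Swarm is a good fit for:
- **3 or more independent tasks** (research, summaries, comparisons)
- **Comparing or researching multiple subjects**
- **Multiple URLs** to fetch/analyze
- **Batch processing** (documents, entities, facts)
- **Complex analysis** that needs multiple perspectives → use chain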

## Quick Reference

```bash
# Check daemon (do this every session)
swarm status

# Start if not running
swarm start

# Parallel prompts
swarm parallel "What is X?" "What is Y?" "What is Z?"

# Research multiple subjects
swarm research "OpenAI" "Anthropic" "Mistral" --topic "AI safety"

# Discover capabilities
swarm capabilities
```

## Execution Modes

### Parallel (v1.0)

N prompts → N workers running simultaneously. Best for independent tasks.

```bash
swarm parallel "prompt1" "prompt2" "prompt3"
```

### Research (v1.1)

Multi-phase: search → fetch → analyze. Uses Google Search grounding.

```bash
swarm research "Buildertrend" "Jobber" --topic "pricing 2026"
```

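### Chain (v1.3): Refinement Pipelines

Data flows through multiple stages, each with a different perspective/filter. Stages run sequentially; tasks within a stage run in parallel.

**Stage modes:**
- `parallel`: N inputs → N workers (same perspective)
- `single`: merged input → 1 worker
- `fan-out`: 1 input → N workers with **different** perspectives
- `reduce`: N inputs → 1 synthesized output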
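To make the four stage modes concrete, here is a minimal local sketch of how inputs could be routed to workers in each mode. It does not call Swarm: `runStage`, `runChain`, the `perspectives` field, and the `worker(input, perspective)` callback are hypothetical names chosen for illustration, and the daemon's real pipeline schema may differ.

```javascript
// Hypothetical sketch: routes inputs to a worker callback according to the stage mode.
// `worker(input, perspective)` stands in for one LLM call and returns a Promise.
async function runStage(stage, inputs, worker) {
  const { mode, perspectives = [] } = stage;
  switch (mode) {
    case 'parallel':
      // N inputs → N workers, all sharing the same perspective
      return Promise.all(inputs.map((input) => worker(input, perspectives[0])));
    case 'single':
      // merged input → 1 worker
      return [await worker(inputs.join('\n'), perspectives[0])];
    case 'fan-out':
      // 1 input → N workers, one per perspective
      return Promise.all(perspectives.map((p) => worker(inputs[0], p)));
    case 'reduce':
      // N inputs → 1 synthesized output
      return [await worker(inputs.join('\n'), perspectives[0] ?? 'synthesizer')];
    default:
      throw new Error(`Unknown stage mode: ${mode}`);
  }
}

// Stages run sequentially; the outputs of one stage become the inputs of the next.
async function runChain(stages, initialInputs, worker) {
  let current = initialInputs;
  for (const stage of stages) {
    current = await runStage(stage, current, worker);
  }
  return current;
}
```

In this sketch a `fan-out` stage followed by a `reduce` stage gives the "several perspectives, one synthesis" shape: the first stage produces one output per perspective, and the second merges them into a single result.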

**Auto-chain**: describe what you want and get an optimal pipeline:
```bash
curl -X POST http://localhost:9999/chain/auto \
  -d '{"task":"Find business opportunities","data":"...market data...","depth":"standard"}'
```

**Manual chain:**
```bash
swarm chain pipeline.json
# or
echo '{"stages":[...]}' | swarm chain --stdin
```

**Depth presets:** `quick` (2 stages), `standard` (4), `deep` (6), `exhaustive` (8)

**Built-in perspectives:** extractor, filter, enricher, analyst, synthesizer, challenger, optimizer, strategist, researcher, critic

**Preview without executing:**
```bash
curl -X POST http://localhost:9999/chain/preview \
  -d '{"task":"...","depth":"standard"}'
```

### Benchmark (v1.3)

Compares single vs parallel vs chain on the same task, scored by an LLM acting as judge.

```bash
curl -X POST http://localhost:9999/benchmark \
  -d '{"task":"Analyze X","data":"...","depth":"standard"}'
```

Scoring uses 6 FLASK dimensions: accuracy (2x weight), depth (1.5x), completeness, coherence, actionability (1.5x), nuance.
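As a rough illustration of how those weights could combine into one number, the sketch below computes a weighted mean. The weights come from the line above; treating unlabelled dimensions as 1x and normalising by the sum of weights are assumptions, and `weightedScore` is a hypothetical helper rather than part of Swarm.

```javascript
// Weights from the text above; dimensions without a stated weight are assumed to be 1x.
const WEIGHTS = {
  accuracy: 2,
  depth: 1.5,
  completeness: 1,
  coherence: 1,
  actionability: 1.5,
  nuance: 1,
};

// Assumption: the overall score is the weighted mean of the per-dimension scores.
function weightedScore(scores) {
  let total = 0;
  let weightSum = 0;
  for (const [dimension, weight] of Object.entries(WEIGHTS)) {
    if (typeof scores[dimension] !== 'number') {
      throw new Error(`Missing score for dimension: ${dimension}`);
    }
    total += scores[dimension] * weight;
    weightSum += weight;
  }
  return total / weightSum;
}
```

Under this weighting, a response that scores 10 on accuracy and 6 everywhere else averages 7, while a plain unweighted mean of the same scores would be lower.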

### Capabilities Discovery (v1.3)

Lets the orchestrator discover which execution modes are available:
```bash
swarm capabilities
# or
curl http://localhost:9999/capabilities
```

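## Prompt Cache (v1.3.2)

LRU cache for LLM responses. **212x speedup on cache hits** (parallel), **514x on chains**.

- Keyed by a hash of instruction + input + perspective
- 500 entries max, 1-hour TTL
- Skips web search tasks (they need fresh data)
- Persists to disk across daemon restarts
- Per-task bypass: set `task.cache = false`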
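The cache behaviour listed above (hashed key, entry cap, TTL, least-recently-used eviction) can be sketched in a few lines. This is an illustrative in-memory version under stated assumptions, not Swarm's implementation: the `PromptCache` class, the SHA-256 key, and the injectable `now` clock are all hypothetical, and disk persistence is left out.

```javascript
const crypto = require('node:crypto');

// Hypothetical sketch of an LRU cache with TTL; not Swarm's actual implementation.
class PromptCache {
  constructor({ maxEntries = 500, ttlMs = 60 * 60 * 1000, now = Date.now } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes expiry easy to test
    this.entries = new Map(); // a Map iterates in insertion order: oldest first
  }

  // Key = hash of instruction + input + perspective
  static key(instruction, input, perspective) {
    return crypto
      .createHash('sha256')
      .update(JSON.stringify([instruction, input, perspective]))
      .digest('hex');
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (the first key in the Map)
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}
```

A caller would check `cache.get(PromptCache.key(instruction, input, perspective))` before making an API call, and skip the cache entirely for web search tasks or when the task opts out.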

```bash
# View cache stats
curl http://localhost:9999/cache

# Clear cache
curl -X DELETE http://localhost:9999/cache
```

Cache stats are shown in `swarm status`.

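## Stage Retry (v1.3.2)

If tasks fail within a chain stage, only the failed tasks are retried (not the whole stage). Default: 1 retry. Configurable per stage via `phase.retries` or globally via `options.stageRetries`.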
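The retry rule can be illustrated with `Promise.allSettled`, which reports each task's outcome individually so that only the failures are run again. This is a sketch of the idea, assuming tasks are supplied as async functions; `runStageWithRetries` is a hypothetical name and not a Swarm API.

```javascript
// Hypothetical sketch: run every task in a stage, then retry only the ones that failed.
// `tasks` is an array of functions, each returning a Promise.
async function runStageWithRetries(tasks, retries = 1) {
  const results = new Array(tasks.length);
  let pending = tasks.map((_, index) => index);

  for (let attempt = 0; attempt <= retries && pending.length > 0; attempt++) {
    // The async wrapper also turns synchronous throws into rejections
    const settled = await Promise.allSettled(pending.map(async (index) => tasks[index]()));
    const stillFailing = [];
    settled.forEach((outcome, i) => {
      const index = pending[i];
      if (outcome.status === 'fulfilled') {
        results[index] = { ok: true, value: outcome.value };
      } else {
        results[index] = { ok: false, error: outcome.reason };
        stillFailing.push(index);
      }
    });
    pending = stillFailing; // successful tasks are never re-run
  }
  return results;
}
```

Results keep their original positions, so a downstream stage can tell which inputs succeeded and which are still marked as failed after the last retry.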

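## Cost Tracking (v1.3.1)

All endpoints return cost data in their `complete` event:
- `session`: totals for the current daemon session
- `daily`: running total for the day, persisted across restarts

```bash
swarm status   # Shows session + daily cost
swarm savings  # Monthly savings report
```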

## Web Search (v1.1)

Workers search the live web via Google Search grounding (Gemini only, no extra cost).

```bash
# Research uses web search by default
swarm research "Subject" --topic "angle"

# Parallel with web search
curl -X POST http://localhost:9999/parallel \
  -d '{"prompts":["Current price of X?"],"options":{"webSearch":true}}'
```

## JavaScript API

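```javascript
const os = require('os');
const path = require('path');

// Node's require() does not expand "~", so resolve the home directory explicitly
const libDir = path.join(os.homedir(), 'clawd/skills/node-scaling/lib');
const { parallel, research } = require(libDir);
const { SwarmClient } = require(path.join(libDir, 'client'));

// prompts, subjects, topic, task, data and depth are supplied by the caller
async function main() {
  // Simple parallel
  const result = await parallel(['prompt1', 'prompt2', 'prompt3']);

  // Client with streaming
  const client = new SwarmClient();
  for await (const event of client.parallel(prompts)) { /* handle each streamed event */ }
  for await (const event of client.research(subjects, topic)) { /* handle each streamed event */ }

  // Chain
  const chainResult = await client.chainSync({ task, data, depth });
}
```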

## Daemon Management

```bash
swarm start      # Start daemon (background)
swarm stop       # Stop daemon
swarm status     # Status, cost, cache stats
swarm restart    # Restart daemon
swarm savings    # Monthly savings report
swarm logs [N]   # Last N lines of daemon log
```

## Performance (v1.3.2)

| Mode | Tasks | Time | Notes |
|------|-------|------|-------|
| Parallel (simple) | 5 | ~700ms | 142ms/task effective |
| Parallel (stress) | 10 | ~1.2s | 123ms/task effective |
| Chain (standard) | 5 | ~14s | 3-stage multi-perspective |
| Chain (quick) | 2 | ~3s | 2-stage extract + synthesize |
| Cache hit | any | ~3-5ms | 200-500x speedup |
| Research (web) | 2 | ~15s | Google grounding latency |

## Config

Location: `~/.config/clawdbot/node-scaling.yaml`

```yaml
node_scaling:
  enabled: true
  limits:
    max_nodes: 16
    max_concurrent_api: 16
  provider:
    name: gemini
    model: gemini-2.0-flash
  web_search:
    enabled: true
    parallel_default: false
  cost:
    max_daily_spend: 10.00
```

## Troubleshooting

| Issue | Fix |
|-------|-----|
| Daemon not running | `swarm start` |
| No API key | Set `GEMINI_API_KEY` or run `npm run setup` |
| Rate limited | Lower `max_concurrent_api` in the config |
| Web search not working | Make sure the provider is gemini and `web_search.enabled` is true |
| Cached results are stale | `curl -X DELETE http://localhost:9999/cache` |
| Chain too slow | Use `depth: "quick"` or check context size |

## Structured Output (v1.3.7)

Forces JSON output with schema validation: zero parse failures on structured tasks.

```bash
# With built-in schema
curl -X POST http://localhost:9999/structured \
  -d '{"prompt":"Extract entities from: Tim Cook announced iPhone 17","schema":"entities"}'

# With custom schema
curl -X POST http://localhost:9999/structured \
  -d '{"prompt":"Classify this text","data":"...","schema":{"type":"object","properties":{"category":{"type":"string"}}}}'

# JSON mode (no schema, just force JSON)
curl -X POST http://localhost:9999/structured \
  -d '{"prompt":"Return a JSON object with name, age, city for a fictional person"}'

# List available schemas
curl http://localhost:9999/structured/schemas
```

**Built-in schemas:** `entities`, `summary`, `comparison`, `actions`, `classification`, `qa`

Uses Gemini's native `response_mime_type: application/json` + `responseSchema` for guaranteed JSON output. The response is also validated against the schema.
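To show what "schema validation on the response" can look like on the client side, here is a minimal sketch that parses a raw response and checks it against a small subset of JSON Schema (`type`, `properties`, `required`, `items`). The `validate` and `parseStructured` helpers are hypothetical, and the daemon's own validation may be stricter or cover more keywords.

```javascript
// Minimal validator for a small subset of JSON Schema (type, properties, required, items).
// Assumption: illustrative only; this is not the validator Swarm uses.
function validate(value, schema, path = '$') {
  const errors = [];
  const actual = Array.isArray(value) ? 'array' : value === null ? 'null' : typeof value;
  if (schema.type && schema.type !== actual) {
    // JSON has no separate integer type, so accept whole numbers for "integer"
    if (!(schema.type === 'integer' && Number.isInteger(value))) {
      errors.push(`${path}: expected ${schema.type}, got ${actual}`);
      return errors;
    }
  }
  if (actual === 'object') {
    for (const name of schema.required ?? []) {
      if (!(name in value)) errors.push(`${path}.${name}: missing required property`);
    }
    for (const [name, child] of Object.entries(schema.properties ?? {})) {
      if (name in value) errors.push(...validate(value[name], child, `${path}.${name}`));
    }
  }
  if (actual === 'array' && schema.items) {
    value.forEach((item, i) => errors.push(...validate(item, schema.items, `${path}[${i}]`)));
  }
  return errors;
}

// Parse a raw model response, then validate it against the schema.
function parseStructured(raw, schema) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { ok: false, errors: [`invalid JSON: ${err.message}`] };
  }
  const errors = validate(data, schema);
  return { ok: errors.length === 0, data, errors };
}
```

With the custom schema from the curl example above, a response of `{"category":"news"}` passes, while `{"category":42}` is reported as a type mismatch at `$.category`.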

## Majority Voting (v1.3.7)

Same prompt → N parallel executions → pick the best answer. Gives higher accuracy on factual/analytical tasks.

```bash
# Judge strategy (LLM picks best, most reliable)
curl -X POST http://localhost:9999/vote \
  -d '{"prompt":"What are the key factors in SaaS pricing?","n":3,"strategy":"judge"}'

# Similarity strategy (consensus, zero extra cost)
curl -X POST http://localhost:9999/vote \
  -d '{"prompt":"What year was Python released?","n":3,"strategy":"similarity"}'

# Longest strategy (heuristic, zero extra cost)
curl -X POST http://localhost:9999/vote \
  -d '{"prompt":"Explain recursion","n":3,"strategy":"longest"}'
```

**Strategies:**
- `judge`: an LLM scores all candidates on accuracy/completeness/clarity/actionability and picks the winner (N+1 calls)
- `similarity`: Jaccard word-set similarity picks the consensus answer (N calls, zero extra cost)
- `longest`: picks the longest response as a heuristic for thoroughness (N calls, zero extra cost)
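The `similarity` strategy is simple enough to sketch locally. The Jaccard similarity of two word sets is |A ∩ B| / |A ∪ B|; the sketch below assumes the consensus answer is the candidate with the highest average similarity to the others, which may differ from how Swarm breaks ties or tokenizes text. `jaccard` and `pickConsensus` are hypothetical helper names.

```javascript
// Jaccard similarity between the word sets of two strings: |A ∩ B| / |A ∪ B|
function jaccard(a, b) {
  const setA = new Set(a.toLowerCase().match(/\w+/g) ?? []);
  const setB = new Set(b.toLowerCase().match(/\w+/g) ?? []);
  if (setA.size === 0 && setB.size === 0) return 1;
  let intersection = 0;
  for (const word of setA) {
    if (setB.has(word)) intersection++;
  }
  return intersection / (setA.size + setB.size - intersection);
}

// Assumption: the "consensus" answer is the candidate with the highest
// average similarity to all other candidates.
function pickConsensus(candidates) {
  let best = { index: -1, score: -Infinity };
  candidates.forEach((candidate, i) => {
    let sum = 0;
    candidates.forEach((other, j) => {
      if (i !== j) sum += jaccard(candidate, other);
    });
    const score = candidates.length > 1 ? sum / (candidates.length - 1) : 1;
    if (score > best.score) best = { index: i, score };
  });
  return { answer: candidates[best.index], score: best.score };
}
```

Because the selection is pure string processing over the N responses already collected, it needs no additional LLM call, which is why this strategy carries no extra cost.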

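**When to use:** factual questions, critical decisions, or any task where accuracy matters more than speed.

| Strategy | Calls | Extra cost | Quality |
|----------|-------|------------|---------|
| similarity | N | $0 | Good (consensus) |
| longest | N | $0 | Decent (heuristic) |
| judge | N+1 | ~$0.0001 | Best (LLM-scored) |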

## Self-Reflection (v1.3.5)

Optional critic pass after chain/skeleton output. Scores 5 dimensions and auto-refines if the result falls below the threshold.

```bash
# Add reflect:true to any chain or skeleton request
curl -X POST http://localhost:9999/chain/auto \
  -d '{"task":"Analyze the AI chip market","data":"...","reflect":true}'

curl -X POST http://localhost:9999/skeleton \
  -d '{"task":"Write a market analysis","reflect":true}'
```

Proven: improved weak output from an average score of 5.0 to 7.6. Skeleton + reflect scored 9.4/10.
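The score-then-refine loop can be sketched as follows. Everything here is an assumption made for illustration: the `critic` and `refine` callbacks stand in for LLM calls, the threshold of 7 and the single refinement round are placeholder defaults, and the plain mean across dimensions may not match how Swarm aggregates its 5 dimensions.

```javascript
// Hypothetical sketch of a critic pass. `critic(text)` resolves to per-dimension scores
// and `refine(text, scores)` resolves to improved text; both stand in for LLM calls.
async function reflect(output, { critic, refine, threshold = 7, maxRounds = 1 }) {
  const mean = (scores) => {
    const values = Object.values(scores);
    return values.reduce((sum, v) => sum + v, 0) / values.length;
  };

  let text = output;
  let scores = await critic(text);
  let rounds = 0;
  // Refine only while the average score stays below the threshold
  while (mean(scores) < threshold && rounds < maxRounds) {
    text = await refine(text, scores);
    scores = await critic(text);
    rounds++;
  }
  return { text, score: mean(scores), rounds };
}
```

Output that already clears the threshold is returned unchanged after a single critic call, so the extra cost is mostly paid on weak drafts.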

## Skeleton-of-Thought (v1.3.6)

Generate an outline → expand each section in parallel → merge into a coherent document. Best for long-form content.

```bash
curl -X POST http://localhost:9999/skeleton \
  -d '{"task":"Write a comprehensive guide to SaaS pricing","maxSections":6,"reflect":true}'
```
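The outline → expand → merge flow can be sketched with a single stand-in worker function. This is a simplified illustration: `skeletonOfThought` and the `llm(prompt)` callback are hypothetical, the prompts are placeholders, and the merge here is plain concatenation, whereas the real endpoint may use a further LLM pass to smooth the result.

```javascript
// Hypothetical sketch: `llm(prompt)` stands in for a single worker call.
async function skeletonOfThought(task, llm, maxSections = 6) {
  // 1. Outline: ask for one section title per line
  const outline = await llm(`Outline (one section title per line, max ${maxSections}): ${task}`);
  const sections = outline
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean)
    .slice(0, maxSections);

  // 2. Expand: every section is written concurrently
  const bodies = await Promise.all(
    sections.map((title) => llm(`Write the section "${title}" for: ${task}`))
  );

  // 3. Merge: stitch the sections back together in outline order
  return sections.map((title, i) => `## ${title}\n\n${bodies[i]}`).join('\n\n');
}
```

Since the expansion step runs all sections at once, total latency is roughly one outline call plus the slowest section, rather than the sum of all sections.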

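**Performance:** 14,478 chars in 21s (675 chars/s), which is 5.1x more content than chain with 2.9x higher throughput.

| Metric | Chain | Skeleton-of-Thought | Winner |
|--------|-------|---------------------|--------|
| Output size | 2,856 chars | 14,478 chars | SoT (5.1x) |
| Throughput | 234 chars/s | 675 chars/s | SoT (2.9x) |
| Duration | 12s | 21s | Chain (faster) |
| Quality (with reflect) | ~7-8/10 | 9.4/10 | SoT |

**When to use what:**
- **SoT** → long-form content, reports, guides, docs (anything with natural sections)
- **Chain** → analysis, research, adversarial review (anything needing multiple perspectives)
- **Parallel** → independent tasks, batch processing
- **Structured** → entity extraction, classification, any task needing reliable JSON
- **Voting** → factual accuracy, critical decisions, consensus building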

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /health | Health check |
| GET | /status | Detailed status + cost + cache |
| GET | /capabilities | Discover execution modes |
| POST | /parallel | Execute N prompts in parallel |
| POST | /research | Multi-phase web research |
| POST | /skeleton | Skeleton-of-Thought (outline → expand → merge) |
| POST | /chain | Manual chain pipeline |
| POST | /chain/auto | Auto-build + execute a chain |
| POST | /chain/preview | Preview a chain without executing |
| POST | /chain/template | Execute a pre-built template |
| POST | /structured | Forced JSON with schema validation |
| GET | /structured/schemas | List built-in schemas |
| POST | /vote | Majority voting (best-of-N) |
| POST | /benchmark | Quality comparison test |
| GET | /templates | List chain templates |
| GET | /cache | Cache stats |
| DELETE | /cache | Clear cache |

## Cost Comparison

| Model | Cost per 1M tokens | Relative |
|-------|--------------------|----------|
| Claude Opus 4 | ~$15 input / $75 output | 1x |
| GPT-4o | ~$2.50 input / $10 output | ~7x cheaper |
| Gemini Flash | ~$0.075 input / $0.30 output | **200x cheaper** |
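The relative column can be checked with a little arithmetic. The sketch below uses the approximate prices from the table; `cost` and `timesCheaper` are hypothetical helpers, and the actual ratio for a given workload depends on its mix of input and output tokens.

```javascript
// Approximate prices in USD per 1M tokens, copied from the table above.
const PRICES = {
  'Claude Opus 4': { input: 15, output: 75 },
  'GPT-4o': { input: 2.5, output: 10 },
  'Gemini Flash': { input: 0.075, output: 0.3 },
};

// Cost in USD of a workload with the given token counts.
function cost(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// How many times cheaper `model` is than `baseline` for the same workload.
function timesCheaper(model, baseline, inputTokens, outputTokens) {
  return cost(baseline, inputTokens, outputTokens) / cost(model, inputTokens, outputTokens);
}
```

For an input-only workload, `timesCheaper('Gemini Flash', 'Claude Opus 4', 1e6, 0)` comes out at roughly 200, and for an output-only workload at roughly 250, so the 200x figure sits at the conservative end of that range.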

Cache hits are essentially free (~3-5ms, no API call).
