介绍
# OpenClaw Self-Healing System
> **“能够自我修复的系统,或者在无法修复时寻求帮助。”**
OpenClaw Gateway 的四层自主自愈系统。
## 架构
``` Level 1: Watchdog (180s) → Process monitoring (OpenClaw built-in) Level 2: Health Check (300s) → HTTP 200 + 3 retries Level 3: Claude Recovery → 30min AI-powered diagnosis 🧠 Level 4: Discord Alert → Human escalation ```
## 新特性 (v2.0)
- **世界首创** Claude Code 作为三级急诊医生 - **持久化学习** - 自动生成恢复文档(症状 → 原因 → 解决方案 → 预防) - **推理日志** - 可解释的 AI 决策过程 - **多渠道告警** - Discord + Telegram 支持 - **指标仪表板** - 成功率、恢复时间、趋势分析 - 生产环境测试(2026 年 2 月 5-6 日验证恢复) - macOS LaunchAgent 集成
## 快速开始
### 1. 安装依赖
```bash brew install tmux npm install -g @anthropic-ai/claude-code ```
### 2. 配置环境
```bash # Copy template to OpenClaw config directory cp .env.example ~/.openclaw/.env
# Edit and add your Discord webhook (optional) nano ~/.openclaw/.env ```
### 3. 安装脚本
```bash # Copy scripts cp scripts/*.sh ~/openclaw/scripts/ chmod +x ~/openclaw/scripts/*.sh
# Install LaunchAgent cp launchagent/com.openclaw.healthcheck.plist ~/Library/LaunchAgents/ launchctl load ~/Library/LaunchAgents/com.openclaw.healthcheck.plist ```
### 4. 验证
```bash # Check Health Check is running launchctl list | grep openclaw.healthcheck
# View logs tail -f ~/openclaw/memory/healthcheck-$(date +%Y-%m-%d).log ```
## 脚本
| 脚本 | 级别 | 描述 | |--------|-------|-------------| | `gateway-healthcheck.sh` | 2 | HTTP 200 检查 + 3 次重试 + 升级处理 | | `emergency-recovery.sh` | 3 | Claude Code PTY 会话,用于 AI 诊断 (v1) | | `emergency-recovery-v2.sh` | 3 | 增强了学习 + 推理日志 (v2) ⭐ | | `emergency-recovery-monitor.sh` | 4 | 失败时发送 Discord/Telegram 通知 | | `metrics-dashboard.sh` | - | 可视化恢复统计信息 (新增) |
## 配置
所有设置均通过 `~/.openclaw/.env` 中的环境变量完成:
| 变量 | 默认值 | 描述 | |----------|---------|-------------| | `DISCORD_WEBHOOK_URL` | (无) | Discord 告警 Webhook | | `OPENCLAW_GATEWAY_URL` | `http://localhost:18789/` | Gateway 健康检查 URL | | `HEALTH_CHECK_MAX_RETRIES` | `3` | 升级处理前的重启尝试次数 | | `EMERGENCY_RECOVERY_TIMEOUT` | `1800` | Claude 恢复超时时间(30 分钟) |
## 测试
### 测试第 2 级(健康检查)
```bash # Run manually bash ~/openclaw/scripts/gateway-healthcheck.sh
# Expected output: # ✅ Gateway healthy ```
### 测试第 3 级(Claude 恢复)
```bash # Inject a config error (backup first!) cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak
# Wait for Health Check to detect and escalate (~8 min) tail -f ~/openclaw/memory/emergency-recovery-*.log ```
## 链接
- **GitHub:** https://github.com/Ramsbaby/openclaw-self-healing - **文档:** https://github.com/Ramsbaby/openclaw-self-healing/tree/main/docs
## 许可证
MIT License - 随意使用。
由 @ramsbaby + Jarvis 构建 🦞