

# OpenClaw Self-Healing System

> **"A system that heals itself, or asks for help when it can't."**

A four-level autonomous self-healing system for the OpenClaw Gateway.

## Architecture

```
Level 1: Watchdog (180s)      → Process monitoring (OpenClaw built-in)
Level 2: Health Check (300s)  → HTTP 200 + 3 retries
Level 3: Claude Recovery      → 30min AI-powered diagnosis 🧠
Level 4: Discord Alert        → Human escalation
```
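As a rough illustration of the Level 2 step ("HTTP 200 + 3 retries"), the check could be sketched as below. This is not the project's `gateway-healthcheck.sh`: the function name `check_gateway`, the `RETRY_DELAY` variable, and the 5-second per-request timeout are assumptions made for the example. Only the URL default and the retry count come from the configuration table.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a Level 2 check; names and timings are illustrative.
GATEWAY_URL="${OPENCLAW_GATEWAY_URL:-http://localhost:18789/}"
MAX_RETRIES="${HEALTH_CHECK_MAX_RETRIES:-3}"
RETRY_DELAY="${RETRY_DELAY:-10}"   # assumed pause between attempts, in seconds

check_gateway() {
  local attempt status
  for ((attempt = 1; attempt <= MAX_RETRIES; attempt++)); do
    # -s: silent, -o /dev/null: discard body, -w: print only the status code,
    # -m 5: give up on a single request after 5 seconds
    status=$(curl -s -o /dev/null -w '%{http_code}' -m 5 "$GATEWAY_URL")
    if [ "$status" = "200" ]; then
      echo "✅ Gateway healthy"
      return 0
    fi
    echo "attempt $attempt/$MAX_RETRIES failed (HTTP $status)" >&2
    if [ "$attempt" -lt "$MAX_RETRIES" ]; then
      sleep "$RETRY_DELAY"
    fi
  done
  return 1   # caller decides how to escalate
}
```

A caller might then run `check_gateway || echo "escalating to Level 3"`, leaving the escalation itself to whatever recovery script is installed.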

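## What's New (v2.0)

- **World first** - Claude Code as the Level 3 emergency doctor
- **Persistent learning** - Automatically generates recovery documentation (symptom → cause → solution → prevention)
- **Reasoning logs** - Explainable AI decision process
- **Multi-channel alerts** - Discord + Telegram support
- **Metrics dashboard** - Success rate, recovery time, trend analysis
- Production-tested (recovery verified Feb 5-6, 2026)
- macOS LaunchAgent integration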
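To make the "symptom → cause → solution → prevention" record concrete, here is a minimal sketch of how such a document could be written to disk. The function name `write_recovery_doc`, the file naming scheme, and the Markdown layout are all assumptions for illustration; the format the project actually generates may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: writes one recovery record as a Markdown file.
# Usage: write_recovery_doc <dir> <symptom> <cause> <solution> <prevention>
write_recovery_doc() {
  local dir="$1" symptom="$2" cause="$3" solution="$4" prevention="$5"
  local file
  file="$dir/recovery-$(date +%Y-%m-%d-%H%M%S).md"   # assumed naming scheme
  mkdir -p "$dir"
  cat > "$file" <<EOF
# Recovery record ($(date '+%Y-%m-%d %H:%M:%S'))

- **Symptom:** $symptom
- **Cause:** $cause
- **Solution:** $solution
- **Prevention:** $prevention
EOF
  echo "$file"   # print the path so callers can log or link it
}
```

Keeping each incident in its own file makes the records easy to search with `grep` and easy to hand back to a later recovery session as context.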

## Quick Start

### 1. Install dependencies

```bash
brew install tmux
npm install -g @anthropic-ai/claude-code
```

### 2. Configure the environment

```bash
# Copy template to OpenClaw config directory
cp .env.example ~/.openclaw/.env

# Edit and add your Discord webhook (optional)
nano ~/.openclaw/.env
```

### 3. Install the scripts

```bash
# Create the target directory if it does not exist yet
mkdir -p ~/openclaw/scripts

# Copy scripts
cp scripts/*.sh ~/openclaw/scripts/
chmod +x ~/openclaw/scripts/*.sh

# Install LaunchAgent
cp launchagent/com.openclaw.healthcheck.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.openclaw.healthcheck.plist
```
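The repository ships its own plist, so nothing needs to be written by hand. Purely to illustrate what a periodic launchd job of this kind tends to look like, the sketch below generates a minimal plist. The label and the 300-second interval are taken from this README; the helper name `write_healthcheck_plist`, the use of `/bin/bash`, and `RunAtLoad` are assumptions, and the shipped file may contain different keys.

```bash
#!/usr/bin/env bash
# Hypothetical generator for a minimal LaunchAgent plist (illustration only).
# Usage: write_healthcheck_plist <output-path> [script-path]
write_healthcheck_plist() {
  local out="$1"
  local script="${2:-$HOME/openclaw/scripts/gateway-healthcheck.sh}"
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.openclaw.healthcheck</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>${script}</string>
  </array>
  <!-- Run every 300 seconds, matching the Level 2 interval -->
  <key>StartInterval</key>
  <integer>300</integer>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
EOF
}
```

On macOS, a generated file can be sanity-checked with `plutil -lint <path>` before loading it.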

### 4. Verify

```bash
# Check Health Check is running
launchctl list | grep openclaw.healthcheck

# View logs
tail -f ~/openclaw/memory/healthcheck-$(date +%Y-%m-%d).log
```

## Scripts

| Script | Level | Description |
|--------|-------|-------------|
| `gateway-healthcheck.sh` | 2 | HTTP 200 check + 3 retries + escalation |
| `emergency-recovery.sh` | 3 | Claude Code PTY session for AI diagnosis (v1) |
| `emergency-recovery-v2.sh` | 3 | Enhanced with learning + reasoning logs (v2) ⭐ |
| `emergency-recovery-monitor.sh` | 4 | Sends Discord/Telegram notifications on failure |
| `metrics-dashboard.sh` | - | Visualizes recovery statistics (new) |
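For the Level 4 notification, a Discord webhook accepts a JSON body with a `content` field via HTTP POST. The sketch below shows one way to send such an alert. It is not the project's `emergency-recovery-monitor.sh`: the function names `json_escape` and `send_discord_alert` are hypothetical, and the message is assumed to be a single line.

```bash
#!/usr/bin/env bash
# Hypothetical Level 4 alert helper (illustration only).

# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Assumes a single-line message; newlines are not handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

send_discord_alert() {
  local message="$1"
  # The webhook is optional, so a missing URL is not treated as an error.
  if [ -z "${DISCORD_WEBHOOK_URL:-}" ]; then
    echo "DISCORD_WEBHOOK_URL not set; skipping alert" >&2
    return 0
  fi
  curl -s -m 10 -H 'Content-Type: application/json' \
    -d "{\"content\":\"$(json_escape "$message")\"}" \
    "$DISCORD_WEBHOOK_URL"
}
```

For example, `send_discord_alert "Gateway recovery failed, human attention needed"` would post that text to the configured channel, or do nothing if no webhook is set.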

## Configuration

All settings are made through environment variables in `~/.openclaw/.env`:

| Variable | Default | Description |
|----------|---------|-------------|
| `DISCORD_WEBHOOK_URL` | (none) | Discord alert webhook |
| `OPENCLAW_GATEWAY_URL` | `http://localhost:18789/` | Gateway health check URL |
| `HEALTH_CHECK_MAX_RETRIES` | `3` | Restart attempts before escalation |
| `EMERGENCY_RECOVERY_TIMEOUT` | `1800` | Claude recovery timeout in seconds (30 minutes) |
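A script that consumes these settings could load the file and fall back to the documented defaults roughly as follows. The function name `load_openclaw_env` is hypothetical, and the real scripts may read the file differently. The sketch assumes the `.env` file contains plain `KEY=value` lines that are valid shell.

```bash
#!/usr/bin/env bash
# Hypothetical loader: source the .env file if present, then apply defaults.
# Usage: load_openclaw_env [path-to-env-file]
load_openclaw_env() {
  local env_file="${1:-$HOME/.openclaw/.env}"
  if [ -f "$env_file" ]; then
    set -a            # export every variable assigned while sourcing
    . "$env_file"
    set +a
  fi
  # ':=' assigns the default only when the variable is unset or empty.
  : "${OPENCLAW_GATEWAY_URL:=http://localhost:18789/}"
  : "${HEALTH_CHECK_MAX_RETRIES:=3}"
  : "${EMERGENCY_RECOVERY_TIMEOUT:=1800}"
  # DISCORD_WEBHOOK_URL has no default; alerts are simply skipped without it.
}
```

Because the defaults are applied after sourcing, any value present in the file wins and anything missing falls back to the table above.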

## Testing

### Test Level 2 (Health Check)

```bash
# Run manually
bash ~/openclaw/scripts/gateway-healthcheck.sh

# Expected output:
# ✅ Gateway healthy
```

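### Test Level 3 (Claude Recovery)

```bash
# Back up the config before injecting an error
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak

# Introduce a deliberate error into ~/.openclaw/openclaw.json by hand,
# then wait for Health Check to detect and escalate (~8 min)
tail -f ~/openclaw/memory/emergency-recovery-*.log

# When done, restore the backup if recovery did not already fix the config:
# cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json
```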

## Links

- **GitHub:** https://github.com/Ramsbaby/openclaw-self-healing
- **Docs:** https://github.com/Ramsbaby/openclaw-self-healing/tree/main/docs

## License

MIT License - use it freely.

Built by @ramsbaby + Jarvis 🦞
