
Causal Inference

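Adds causal reasoning to agent actions. Trigger on any high-level action with an observable outcome: emails, messages, calendar changes, file operations, API calls.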


# Causal Inference

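A lightweight causal layer for predicting the outcomes of actions, not by pattern-matching correlations but by modeling interventions and counterfactuals.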

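## Core Invariant

**Every action must be expressible as an explicit intervention on a causal model, with a predicted effect + uncertainty + a falsifiable audit trail.**

Plans must be *causally valid*, not merely plausible.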

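## When to Trigger

**Trigger this skill on ANY high-level action**, including but not limited to:

| Domain | Actions to Log |
|--------|---------------|
| **Communication** | Send email, send message, reply, follow up, notify, mention |
| **Calendar** | Create/move/cancel meeting, set reminder, RSVP |
| **Tasks** | Create/complete/defer task, set priority, assign |
| **Files** | Create/edit/share document, commit code, deploy |
| **Social** | Post, react, comment, share, DM |
| **Purchases** | Order, subscribe, cancel, refund |
| **System** | Config change, permission grant, integration setup |

Also trigger when:
- **Reviewing outcomes**: "Did that email get a reply?" → log the outcome, update estimates
- **Debugging failures**: "Why didn't this work?" → trace the causal graph
- **Backfilling history**: "Analyze my past emails/calendar" → parse logs, reconstruct actions
- **Planning**: "Should I send now or later?" → query the causal model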

## Backfill: Bootstrap from Historical Data

Don't start from scratch. Parse existing logs to reconstruct past actions + outcomes.

### Email Backfill

```bash
# Extract sent emails with reply status
gog gmail list --sent --after 2024-01-01 --format json > /tmp/sent_emails.json

# For each sent email, check if reply exists
python3 scripts/backfill_email.py /tmp/sent_emails.json
```

### Calendar Backfill

```bash
# Extract past events with attendance
gog calendar list --after 2024-01-01 --format json > /tmp/events.json

# Reconstruct: did meeting happen? was it moved? attendee count?
python3 scripts/backfill_calendar.py /tmp/events.json
```

### Message Backfill (WhatsApp/Discord/Slack)

```bash
# Parse message history for send/reply patterns
wacli search --after 2024-01-01 --from me --format json > /tmp/wa_sent.json
python3 scripts/backfill_messages.py /tmp/wa_sent.json
```

### Generic Backfill Pattern

```python
# For any historical data source:
for record in historical_data:
    action_event = {
        "action": infer_action_type(record),
        "context": extract_context(record),
        "time": record["timestamp"],
        "pre_state": reconstruct_pre_state(record),
        "post_state": extract_post_state(record),
        "outcome": determine_outcome(record),
        "backfilled": True,  # Mark as reconstructed
    }
    append_to_log(action_event)
```

## Architecture

### A. Action Log (required)

Every executed action emits a structured event:

```json
{
  "action": "send_followup",
  "domain": "email",
  "context": {"recipient_type": "warm_lead", "prior_touches": 2},
  "time": "2025-01-26T10:00:00Z",
  "pre_state": {"days_since_last_contact": 7},
  "post_state": {"reply_received": true, "reply_delay_hours": 4},
  "outcome": "positive_reply",
  "outcome_observed_at": "2025-01-26T14:00:00Z",
  "backfilled": false
}
```

Events are stored in `memory/causal/action_log.jsonl`, one JSON object per line.
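A minimal sketch of how an event could be appended to that log. Only the file path and the field names come from this document; the function names, the choice of required fields, and the default values for not-yet-observed fields are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Path taken from this document; everything else is an illustrative assumption.
LOG_PATH = Path("memory/causal/action_log.jsonl")

# ASSUMPTION: fields treated as mandatory at the moment the action is executed.
REQUIRED_FIELDS = ("action", "domain", "context", "time", "pre_state")


def utc_now_iso() -> str:
    """Timestamp in the same ISO 8601 UTC form as the example event."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def append_action_event(event: dict, log_path: Path = LOG_PATH) -> dict:
    """Validate an event and append it to the log as a single JSON line."""
    missing = [f for f in REQUIRED_FIELDS if f not in event]
    if missing:
        raise ValueError(f"event is missing fields: {missing}")
    # Outcome fields stay empty until the outcome is actually observed.
    record = {"post_state": None, "outcome": None, "backfilled": False, **event}
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Appending one line per event keeps the log cheap to write and easy to stream when re-estimating effects later.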

### B. Causal Graphs (per domain)

Each domain starts with 10-30 observable variables.

**Email domain:**
```
send_time → reply_prob
subject_style → open_rate
recipient_type → reply_prob
followup_count → reply_prob (diminishing)
time_since_last → reply_prob
```

**Calendar domain:**
```
meeting_time → attendance_rate
attendee_count → slip_risk
conflict_degree → reschedule_prob
buffer_time → focus_quality
```

**Messaging domain:**
```
response_delay → conversation_continuation
message_length → response_length
time_of_day → response_prob
platform → response_delay
```

**Tasks domain:**
```
due_date_proximity → completion_prob
priority_level → completion_speed
task_size → deferral_risk
context_switches → error_rate
```

Store graph definitions in `memory/causal/graphs/`.
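This document does not specify an on-disk format for graph definitions, so the following is only a sketch of one possible in-memory representation: a list of `(cause, effect)` pairs turned into a parent map, plus a check that the graph is acyclic. The names `EMAIL_EDGES`, `build_parents`, and `is_acyclic` are hypothetical.

```python
from collections import defaultdict

# Edges copied from the email domain above as (cause, effect) pairs.
EMAIL_EDGES = [
    ("send_time", "reply_prob"),
    ("subject_style", "open_rate"),
    ("recipient_type", "reply_prob"),
    ("followup_count", "reply_prob"),
    ("time_since_last", "reply_prob"),
]


def build_parents(edges):
    """Map every node to the sorted list of its direct causes."""
    parents = defaultdict(set)
    for cause, effect in edges:
        parents[effect].add(cause)
        parents.setdefault(cause, set())  # root nodes get an empty parent set
    return {node: sorted(causes) for node, causes in parents.items()}


def is_acyclic(parents):
    """Return True if the graph has no directed cycle (depth-first search)."""
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the current path
        state[node] = "visiting"
        ok = all(visit(p) for p in parents.get(node, []))
        state[node] = "done"
        return ok

    return all(visit(node) for node in parents)
```

Checking acyclicity up front matters because interventions and back-tracing both assume a directed acyclic graph.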

### C. Estimation

For each "knob" (intervention variable), estimate the treatment effect:

```python
# Pseudo: effect of morning vs evening sends
effect = mean(reply_prob | send_time=morning) - mean(reply_prob | send_time=evening)
uncertainty = std_error(effect)
```

Start with simple regression or propensity matching. Graduate to do-calculus when the graph is explicit and identification is needed.
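As a concrete starting point, the pseudocode above can be turned into a plain difference in means with a standard error. This is a naive, unadjusted estimator, simpler than the regression or propensity matching recommended here, and it is not the contents of `scripts/estimate_effect.py`. It assumes a hypothetical bucketed field such as `context["send_time"]`, which the sample event in this document does not contain.

```python
import math
from statistics import mean, variance


def difference_in_means(events, treatment_key, treated, control, outcome_key):
    """Mean outcome under `treated` minus mean outcome under `control`.

    Returns (effect, standard_error). Unadjusted: it matches the causal
    effect only if nothing confounds treatment and outcome.
    """
    def outcomes(value):
        # ASSUMPTION: treatment lives in event["context"], outcome in event["post_state"].
        return [
            float(e["post_state"][outcome_key])
            for e in events
            if e["context"].get(treatment_key) == value
        ]

    y1, y0 = outcomes(treated), outcomes(control)
    if len(y1) < 2 or len(y0) < 2:
        raise ValueError("need at least two observations per group")
    effect = mean(y1) - mean(y0)
    # Standard error of the difference of two independent sample means.
    std_error = math.sqrt(variance(y1) / len(y1) + variance(y0) / len(y0))
    return effect, std_error
```

If `recipient_type` influences both when an email is sent and whether it is answered, this number is biased; that is the case where adjusting for the confounder becomes necessary.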

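### D. Decision Policy

Before executing an action:

1. Identify the intervention variable(s)
2. Query the causal model for the expected outcome distribution
3. Compute expected utility + uncertainty bounds
4. If uncertainty > threshold or expected harm > threshold → refuse or escalate to the user
5. Log the prediction for later validation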
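Step 4 of the list above can be sketched as a small gate function. The uncertainty and utility thresholds and the protected action names are the ones listed in this document's config; `MAX_EXPECTED_HARM`, the `Prediction` class, and the exact split between "escalate" and "reject" are assumptions, since the document does not define them.

```python
from dataclasses import dataclass

# Values mirror the thresholds and protected actions listed in this document.
MAX_UNCERTAINTY = 0.3
MIN_EXPECTED_UTILITY = 0.1
PROTECTED_ACTIONS = {
    "delete_email",
    "cancel_meeting",
    "send_to_new_contact",
    "financial_transaction",
}
# ASSUMPTION: the document names no harm threshold; 0.2 is a placeholder.
MAX_EXPECTED_HARM = 0.2


@dataclass
class Prediction:
    action: str
    expected_utility: float
    uncertainty: float
    expected_harm: float = 0.0


def decide(pred: Prediction) -> str:
    """Return 'execute', 'escalate', or 'reject' for a predicted action."""
    if pred.action in PROTECTED_ACTIONS:
        return "escalate"  # protected actions always need user approval
    if pred.uncertainty > MAX_UNCERTAINTY or pred.expected_harm > MAX_EXPECTED_HARM:
        return "escalate"
    if pred.expected_utility < MIN_EXPECTED_UTILITY:
        return "reject"  # not worth doing even though it looks safe
    return "execute"
```

For example, `decide(Prediction("send_followup", expected_utility=0.4, uncertainty=0.5))` escalates to the user instead of acting on a poorly supported estimate.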

## Workflow

### On Every Action

```
BEFORE executing:
  1. Log pre_state
  2. If enough historical data: query model for expected outcome
  3. If high uncertainty or risk: confirm with user

AFTER executing:
  1. Log action + context + time
  2. Set reminder to check outcome (if not immediate)

WHEN outcome observed:
  1. Update action log with post_state + outcome
  2. Re-estimate treatment effects if enough new data
```

### When Planning Actions

```
1. User request → identify candidate actions
2. For each action:
   a. Map to intervention(s) on causal graph
   b. Predict P(outcome | do(action))
   c. Estimate uncertainty
   d. Compute expected utility
3. Rank by expected utility, filter by safety
4. Execute best action, log prediction
5. Observe outcome, update model
```
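Steps 2d and 3 can be illustrated with a short sketch: expected utility is the probability-weighted sum of outcome utilities, and unsafe candidates are dropped before ranking. The function names, the utility table, and the `is_safe` callback are hypothetical; the outcome distributions would come from the causal model, which is not shown here.

```python
def expected_utility(outcome_probs, utilities):
    """Sum over outcomes of P(outcome | do(action)) * utility(outcome)."""
    if abs(sum(outcome_probs.values()) - 1.0) > 1e-6:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * utilities[outcome] for outcome, p in outcome_probs.items())


def rank_actions(candidates, utilities, is_safe):
    """Return safe candidate actions sorted by expected utility, best first.

    candidates: action name -> {outcome: probability}
    is_safe:    callable(action name) -> bool
    """
    scored = [
        (expected_utility(probs, utilities), action)
        for action, probs in candidates.items()
        if is_safe(action)
    ]
    return [action for _, action in sorted(scored, reverse=True)]
```

Filtering by safety before ranking means a high-utility but protected action never silently wins; it has to go through user approval instead.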

### When Debugging Failures

```
1. Identify failed outcome
2. Trace back through causal graph
3. For each upstream node:
   a. Was the value as expected?
   b. Did the causal link hold?
4. Identify broken link(s)
5. Compute minimal intervention set that would have prevented failure
6. Log counterfactual for learning
```
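The back-trace in steps 2 and 3a can be sketched as a graph walk that compares the values a plan assumed with the values actually logged. It covers only the "was the value as expected?" check; testing whether a causal link held (step 3b) and computing a minimal intervention set (step 5) need the estimated model and are not attempted. `trace_failure` and its arguments are hypothetical names.

```python
def trace_failure(outcome_node, parents, expected, observed, tolerance=0.0):
    """Walk upstream from a failed outcome and list nodes whose values were off.

    parents:  node -> list of direct causes
    expected: node -> value the plan assumed
    observed: node -> value actually logged
    Returns a sorted list of (node, expected_value, observed_value).
    """
    suspects, seen, stack = [], set(), [outcome_node]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node != outcome_node and node in expected and node in observed:
            exp, obs = expected[node], observed[node]
            numeric = isinstance(exp, (int, float)) and isinstance(obs, (int, float))
            # Numeric values may differ within a tolerance; others must match exactly.
            off = abs(exp - obs) > tolerance if numeric else exp != obs
            if off:
                suspects.append((node, exp, obs))
        stack.extend(parents.get(node, []))
    return sorted(suspects)
```

Nodes with no logged value are skipped, so an empty result can mean either "nothing deviated" or "nothing was recorded"; logging `pre_state` consistently is what makes this check useful.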

## Quick Start: Bootstrap Today

```bash
# 1. Create the infrastructure
mkdir -p memory/causal/graphs memory/causal/estimates

# 2. Initialize config
cat > memory/causal/config.yaml << 'EOF'
domains:
  - email
  - calendar
  - messaging
  - tasks

thresholds:
  max_uncertainty: 0.3
  min_expected_utility: 0.1

protected_actions:
  - delete_email
  - cancel_meeting
  - send_to_new_contact
  - financial_transaction
EOF

# 3. Backfill one domain (start with email)
python3 scripts/backfill_email.py

# 4. Estimate initial effects
python3 scripts/estimate_effect.py --treatment send_time --outcome reply_received --values morning,evening
```

## Safety Constraints

Define "protected variables" that require explicit user approval:

```yaml
protected:
  - delete_email
  - cancel_meeting
  - send_to_new_contact
  - financial_transaction

thresholds:
  max_uncertainty: 0.3       # don't act if P(outcome) uncertainty > 30%
  min_expected_utility: 0.1  # don't act if expected gain < 10%
```

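## Files

- `memory/causal/action_log.jsonl`: all logged actions with their outcomes
- `memory/causal/graphs/`: domain-specific causal graph definitions
- `memory/causal/estimates/`: learned treatment effects
- `memory/causal/config.yaml`: safety thresholds and protected variables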

## References

- See `references/do-calculus.md` for formal intervention semantics
- See `references/estimation.md` for treatment effect estimation methods
