Introduction
# Causal Inference
A lightweight causal layer for predicting action outcomes, not by pattern-matching correlations, but by modeling interventions and counterfactuals.
## Core Invariant
**Every action must be representable as an explicit intervention on a causal model, with predicted effects + uncertainty + a falsifiable audit trail.**
Plans must be *causally valid*, not just plausible.
## When to Trigger
**Trigger this skill on ANY high-level action**, including but not limited to:
| Domain | Actions to Log | |--------|---------------| | **Communication** | Send email, send message, reply, follow-up, notification, mention | | **Calendar** | Create/move/cancel meeting, set reminder, RSVP | | **Tasks** | Create/complete/defer task, set priority, assign | | **Files** | Create/edit/share document, commit code, deploy | | **Social** | Post, react, comment, share, DM | | **Purchases** | Order, subscribe, cancel, refund | | **System** | Config change, permission grant, integration setup |
Also trigger when: - **Reviewing outcomes** — "Did that email get a reply?" → log outcome, update estimates - **Debugging failures** — "Why didn't this work?" → trace causal graph - **Backfilling history** — "Analyze my past emails/calendar" → parse logs, reconstruct actions - **Planning** — "Should I send now or later?" → query causal model
## Backfill: Bootstrap from Historical Data
Don't start from zero. Parse existing logs to reconstruct past actions + outcomes.
### Email Backfill
```bash # Extract sent emails with reply status gog gmail list --sent --after 2024-01-01 --format json > /tmp/sent_emails.json
# For each sent email, check if reply exists python3 scripts/backfill_email.py /tmp/sent_emails.json ```
### Calendar Backfill
```bash # Extract past events with attendance gog calendar list --after 2024-01-01 --format json > /tmp/events.json
# Reconstruct: did meeting happen? was it moved? attendee count? python3 scripts/backfill_calendar.py /tmp/events.json ```
### Message Backfill (WhatsApp/Discord/Slack)
```bash # Parse message history for send/reply patterns wacli search --after 2024-01-01 --from me --format json > /tmp/wa_sent.json python3 scripts/backfill_messages.py /tmp/wa_sent.json ```
### Generic Backfill Pattern
```python # For any historical data source: for record in historical_data: action_event = { "action": infer_action_type(record), "context": extract_context(record), "time": record["timestamp"], "pre_state": reconstruct_pre_state(record), "post_state": extract_post_state(record), "outcome": determine_outcome(record), "backfilled": True # Mark as reconstructed } append_to_log(action_event) ```
## Architecture
### A. Action Log (required)
Every executed action emits a structured event:
```json { "action": "send_followup", "domain": "email", "context": {"recipient_type": "warm_lead", "prior_touches": 2}, "time": "2025-01-26T10:00:00Z", "pre_state": {"days_since_last_contact": 7}, "post_state": {"reply_received": true, "reply_delay_hours": 4}, "outcome": "positive_reply", "outcome_observed_at": "2025-01-26T14:00:00Z", "backfilled": false } ```
Store in `memory/causal/action_log.jsonl`.
### B. Causal Graphs (per domain)
Start with 10-30 observable variables per domain.
**Email domain:** ``` send_time → reply_prob subject_style → open_rate recipient_type → reply_prob followup_count → reply_prob (diminishing) time_since_last → reply_prob ```
**Calendar domain:** ``` meeting_time → attendance_rate attendee_count → slip_risk conflict_degree → reschedule_prob buffer_time → focus_quality ```
**Messaging domain:** ``` response_delay → conversation_continuation message_length → response_length time_of_day → response_prob platform → response_delay ```
**Task domain:** ``` due_date_proximity → completion_prob priority_level → completion_speed task_size → deferral_risk context_switches → error_rate ```
Store graph definitions in `memory/causal/graphs/`.
### C. Estimation
For each "knob" (intervention variable), estimate treatment effects:
```python # Pseudo: effect of morning vs evening sends effect = mean(reply_prob | send_time=morning) - mean(reply_prob | send_time=evening) uncertainty = std_error(effect) ```
Use simple regression or propensity matching first. Graduate to do-calculus when graphs are explicit and identification is needed.
### D. Decision Policy
Before executing actions:
1. Identify intervention variable(s) 2. Query causal model for expected outcome distribution 3. Compute expected utility + uncertainty bounds 4. If uncertainty > threshold OR expected harm > threshold → refuse or escalate to user 5. Log prediction for later validation
## Workflow
### On Every Action
``` BEFORE executing: 1. Log pre_state 2. If enough historical data: query model for expected outcome 3. If high uncertainty or risk: confirm with user
AFTER executing: 1. Log action + context + time 2. Set reminder to check outcome (if not immediate)
WHEN outcome observed: 1. Update action log with post_state + outcome 2. Re-estimate treatment effects if enough new data ```
### Planning an Action
``` 1. User request → identify candidate actions 2. For each action: a. Map to intervention(s) on causal graph b. Predict P(outcome | do(action)) c. Estimate uncertainty d. Compute expected utility 3. Rank by expected utility, filter by safety 4. Execute best action, log prediction 5. Observe outcome, update model ```
### Debugging a Failure
``` 1. Identify failed outcome 2. Trace back through causal graph 3. For each upstream node: a. Was the value as expected? b. Did the causal link hold? 4. Identify broken link(s) 5. Compute minimal intervention set that would have prevented failure 6. Log counterfactual for learning ```
## Quick Start: Bootstrap Today
```bash # 1. Create the infrastructure mkdir -p memory/causal/graphs memory/causal/estimates
# 2. Initialize config cat > memory/causal/config.yaml << 'EOF' domains: - email - calendar - messaging - tasks
thresholds: max_uncertainty: 0.3 min_expected_utility: 0.1
protected_actions: - delete_email - cancel_meeting - send_to_new_contact - financial_transaction EOF
# 3. Backfill one domain (start with email) python3 scripts/backfill_email.py
# 4. Estimate initial effects python3 scripts/estimate_effect.py --treatment send_time --outcome reply_received --values morning,evening ```
## Safety Constraints
Define "protected variables" that require explicit user approval:
```yaml protected: - delete_email - cancel_meeting - send_to_new_contact - financial_transaction
thresholds: max_uncertainty: 0.3 # don't act if P(outcome) uncertainty > 30% min_expected_utility: 0.1 # don't act if expected gain < 10% ```
## Files
- `memory/causal/action_log.jsonl` — all logged actions with outcomes - `memory/causal/graphs/` — domain-specific causal graph definitions - `memory/causal/estimates/` — learned treatment effects - `memory/causal/config.yaml` — safety thresholds and protected variables
## References
- See `references/do-calculus.md` for formal intervention semantics - See `references/estimation.md` for treatment effect estimation methods