Introduction
# Proactive Tasks
A task management system that transforms reactive assistants into proactive partners who work autonomously on shared goals.
## Core Concept
Instead of waiting for your human to tell you what to do, this skill lets you: - Track goals and break them into actionable tasks - Work on tasks during heartbeats - Message your human with updates and ask for input when blocked - Make steady progress on long-term objectives
## Quick Start
### Creating Goals
When your human mentions a goal or project:
```bash python3 scripts/task_manager.py add-goal "Build voice assistant hardware" \ --priority high \ --context "Replace Alexa with custom solution using local models" ```
### Breaking Down into Tasks
```bash python3 scripts/task_manager.py add-task "Build voice assistant hardware" \ "Research voice-to-text models" \ --priority high
python3 scripts/task_manager.py add-task "Build voice assistant hardware" \ "Compare Raspberry Pi vs other hardware options" \ --depends-on "Research voice-to-text models" ```
### During Heartbeats
Check what to work on next:
```bash python3 scripts/task_manager.py next-task ```
This returns the highest-priority task you can work on (no unmet dependencies, not blocked).
### Completing Tasks
```bash python3 scripts/task_manager.py complete-task <task-id> \ --notes "Researched Whisper, Coqui, vosk. Whisper.cpp looks best for Pi." ```
### Messaging Your Human
When you complete something important or get blocked:
```bash python3 scripts/task_manager.py mark-needs-input <task-id> \ --reason "Need budget approval for hardware purchase" ```
Then message your human with the update/question.
## Phase 2: Production-Ready Architecture
Proactive Tasks v1.2.0 includes battle-tested patterns from real agent usage to prevent data loss, survive context truncation, and maintain reliability under autonomous operation.
### 1. WAL Protocol (Write-Ahead Logging)
**The Problem:** Agents write to memory files, then context gets truncated. Changes vanish.
**The Solution:** Log critical changes to `memory/WAL-YYYY-MM-DD.log` BEFORE modifying task data.
**How it works:** - Every `mark-progress`, `log-time`, or status change creates a WAL entry first - If context gets cut mid-operation, the WAL has the details - After compaction, read the WAL to recover what was happening
**Events logged:** - `PROGRESS_CHANGE`: Task progress updates (0-100%) - `TIME_LOG`: Actual time spent on tasks - `STATUS_CHANGE`: Task state transitions (blocked, completed, etc.) - `HEALTH_CHECK`: Self-healing operations
**Automatically enabled** - no configuration needed. WAL files are created in `memory/` directory.
### 2. SESSION-STATE.md (Active Working Memory)
**The Concept:** Chat history is a BUFFER, not storage. SESSION-STATE.md is your "RAM" - the ONLY place task details are reliably preserved.
**Auto-updated on every task operation:** ```markdown ## Current Task - **ID:** task_abc123 - **Title:** Research voice models - **Status:** in_progress - **Progress:** 75% - **Time:** 45 min actual / 60 min estimate (25% faster)
## Next Action Complete research, document findings in notes, mark complete. ```
**Why this matters:** After context compaction, you can read SESSION-STATE.md and immediately know: - What you were working on - How far you got - What to do next
### 3. Working Buffer (Danger Zone Safety)
**The Problem:** Between 60% and 100% context usage, you're in the "danger zone" - compaction could happen any time.
**The Solution:** Automatically append all task updates to `working-buffer.md`.
**How it works:** ```bash # Every progress update, time log, or status change appends: - PROGRESS_CHANGE (2026-02-12T10:30:00Z): task_abc123 → 75% - TIME_LOG (2026-02-12T10:35:00Z): task_abc123 → +15 min - STATUS_CHANGE (2026-02-12T10:40:00Z): task_abc123 → completed ```
**After compaction:** Read `working-buffer.md` to see exactly what happened during the danger zone.
**Manual flush:** `python3 scripts/task_manager.py flush-buffer` to copy buffer contents to daily memory file.
### 4. Self-Healing Health Check
**Agents make mistakes.** Task data can get corrupted over time. The health-check command detects and auto-fixes common issues:
```bash python3 scripts/task_manager.py health-check ```
**Detects 5 categories of issues:**
1. **Orphaned recurring tasks** - No parent goal 2. **Impossible states** - Status=completed but progress < 100% 3. **Missing timestamps** - Completed tasks without `completed_at` 4. **Time anomalies** - Actual time >> estimate (flags for review, doesn't auto-fix) 5. **Future-dated completions** - Completed tasks with future timestamps
**Auto-fixes 4 safe categories** (time anomalies just flagged for human review).
**When to run:** - During heartbeats (every few days) - After recovering from context truncation - When task data seems inconsistent
### Production Reliability
These four patterns work together to create a robust system:
``` User request → WAL log → Update data → Update SESSION-STATE → Append to buffer ↓ ↓ ↓ ↓ ↓ Context cut? → Read WAL → Verify data → Check SESSION-STATE → Review buffer ```
**Result:** You never lose work, even during context truncation. The system self-heals and maintains consistency autonomously.
### 5. Compaction Recovery Protocol
**Trigger:** Session starts with `<summary>` tag, or you're asked "where were we?" or "continue".
**The Problem:** Context was truncated. You don't remember what task you were working on.
**Recovery Steps (in order):**
1. **FIRST:** Read `working-buffer.md` - Raw danger zone exchanges ```bash # Check if buffer exists and has recent content cat working-buffer.md ```
2. **SECOND:** Read `SESSION-STATE.md` - Active task state ```bash # Get current task context cat SESSION-STATE.md ```
3. **THIRD:** Read today's WAL log ```bash # See what operations happened cat memory/WAL-$(date +%Y-%m-%d).log | tail -20 ```
4. **FOURTH:** Check task data for the task ID from SESSION-STATE ```bash python3 scripts/task_manager.py list-tasks "Goal Title" ```
5. **Extract & Update:** Pull important context from buffer into SESSION-STATE if needed
6. **Present Recovery:** "Recovered from compaction. Last task: [title]. Progress: [%]. Next action: [what to do]. Continue?"
**Do NOT ask "what were we discussing?"** - The buffer and SESSION-STATE literally have the answer.
### 6. Verify Before Reporting (VBR)
**The Law:** "Code exists" ≠ "feature works." Never report task completion without end-to-end verification.
**Trigger:** About to mark a task `completed` or say "done":
1. **STOP** - Don't mark complete yet 2. **Test** - Actually run/verify the outcome from user perspective 3. **Verify** - Check the result, not just the output 4. **Document** - Add verification details to task notes 5. **THEN** - Mark complete with confidence
**Examples:**
❌ **Wrong:** "Added health-check command. Task complete!" ✅ **Right:** "Added health-check. Testing... detected 4 issues, auto-fixed 3. Verified on broken test data. Task complete!"
❌ **Wrong:** "Implemented SESSION-STATE updates. Done!" ✅ **Right:** "Implemented SESSION-STATE. Tested with mark-progress, log-time, mark-blocked - all update correctly. Done!"
**Why this matters:** Agents often report completion based on "I wrote the code" rather than "I verified it works." VBR prevents false completions and builds trust.
## Proactive Mindset
**The Core Question:** Don't ask "what should I do?" Ask "what would genuinely help my human that they haven't thought to ask for?"
### Autonomous Task Work
During heartbeats, you have the opportunity to make real progress:
1. **Check for next task** - What's the highest priority work? 2. **Make progress** - Work on it for 10-15 minutes autonomously 3. **Update status** - Track progress, time, blockers honestly 4. **Message when it matters** - Completions, blockers, discoveries (not routine progress)
**The transformation:** From waiting for prompts → making steady autonomous progress on shared goals.
### When to Reach Out
**DO message your human when:** - ✅ Task completed (especially if it unblocks other work) - ✅ Blocked and need input/decision - ✅ Discovered something important they should know - ✅ Need clarification on requirements
**DON'T spam with:** - ❌ Routine progress updates ("now at 50%...") - ❌ Every tiny sub-task completion - ❌ Things they didn't ask about (unless genuinely valuable)
**The goal:** Be a proactive partner who makes things happen, not a chatty assistant who needs constant validation.
## Task States
| State | Meaning | |-------|---------| | `pending` | Ready to work on (all dependencies met) | | `in_progress` | Currently working on it | | `blocked` | Can't proceed (dependencies not met) | | `needs_input` | Waiting for human input/decision | | `completed` | Done! | | `cancelled` | No longer relevant |
## Autonomous Operation (Phase 2)
### Two-Mode Architecture
Proactive Tasks supports two distinct operational modes:
| Mode | Context | Trigger | Best For | Risk | |------|---------|---------|----------|------| | **Interactive (systemEvent)** | Full main session context | User request, manual prompts | Decision-making, human-facing work | Full context available | | **Autonomous (isolated agentTurn)** | No main session context | Heartbeat cron, scheduled background | Velocity reports, cleanup, recurring tasks | May lose context |
### Key Design: Avoid Interruption
**Don't use `systemEvent` for background work.** When a cron job fires during your main session, the prompt gets queued and work doesn't happen. Instead: - Use **heartbeat polling** (every 30 min) for interactive checks + work - Use **isolated agentTurn** (cron subprocess) for pure computation work
This ensures background tasks never interrupt your main conversation.
See **[HEARTBEAT-CONFIG.md](HEARTBEAT-CONFIG.md)** for complete autonomous operation patterns, including: - Heartbeat setup (recommended for most work) - Isolated cron patterns (velocity reports, cleanup) - When to use each pattern - Anti-patterns to avoid
## Heartbeat Integration
To enable autonomous proactive work, you need to set up a heartbeat system. This tells you to periodically check for tasks and work on them.
**Quick setup:** See [HEARTBEAT-CONFIG.md](HEARTBEAT-CONFIG.md) for complete setup instructions and patterns.
**TL;DR:** 1. Create a cron job that sends you a heartbeat message every 30 minutes 2. Add proactive-tasks checks to your `HEARTBEAT.md` 3. You'll automatically check for tasks and work on them without waiting for prompts
### Heartbeat Message Template
Your cron job should send this message every 30 minutes:
``` 💓 Heartbeat check: Read HEARTBEAT.md if it exists (workspace context). Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK. ```
### Add to HEARTBEAT.md
Add this to your workspace `HEARTBEAT.md`:
```markdown ## Proactive Tasks (Every heartbeat) 🚀
Check if there's work to do on our goals:
- [ ] Run `python3 skills/proactive-tasks/scripts/task_manager.py next-task` - [ ] If a task is returned, work on it for up to 10-15 minutes - [ ] Update task status when done, blocked, or needs input - [ ] Message your human with meaningful updates (completions, blockers, discoveries) - [ ] Don't spam - only message for significant milestones or when stuck
**Goal:** Make autonomous progress on our shared objectives without waiting for prompts. ```
### What Happens
``` Every 30 minutes: ├─ Heartbeat fires ├─ You read HEARTBEAT.md ├─ Check for next task ├─ If task found → work on it, update status, message human if needed └─ If nothing → reply "HEARTBEAT_OK" (silent) ```
**The transformation:** You go from reactive (waiting for prompts) to proactive (making steady autonomous progress).
## Best Practices
### When to Create Goals
- Long-term projects (building something, learning a topic) - Recurring responsibilities (monitor X, maintain Y) - Exploratory work (research Z, evaluate options for W)
### When to Create Tasks
Break goals into tasks that are: - **Specific**: "Research Whisper models" not "Look into AI stuff" - **Achievable in one sitting**: 15-60 minutes of focused work - **Clear completion criteria**: You know when it's done
### When to Message Your Human
✅ **Do message when:** - You complete a meaningful milestone - You need input/decision to proceed - You discover something important - A task will take longer than expected
❌ **Don't spam with:** - Every tiny sub-task completion - Routine progress updates - Things they didn't ask about (unless relevant)
### Managing Scope Creep
If a task turns out to be bigger than expected: 1. Mark current task as `in_progress` 2. Add new sub-tasks for the pieces you discovered 3. Update dependencies 4. Continue with manageable chunks
## File Structure
All data stored in `data/tasks.json`:
```json { "goals": [ { "id": "goal_001", "title": "Build voice assistant hardware", "priority": "high", "context": "Replace Alexa with custom solution", "created_at": "2026-02-05T05:25:00Z", "status": "active" } ], "tasks": [ { "id": "task_001", "goal_id": "goal_001", "title": "Research voice-to-text models", "priority": "high", "status": "completed", "created_at": "2026-02-05T05:26:00Z", "completed_at": "2026-02-05T06:15:00Z", "notes": "Researched Whisper, Coqui, vosk. Whisper.cpp best for Pi." } ] } ```
## CLI Reference
See [CLI_REFERENCE.md](references/CLI_REFERENCE.md) for complete command documentation.
## Evolution & Guardrails
Before proposing new features, evaluate them using our **VFM/ADL scoring frameworks** to ensure stability and value:
### VFM Protocol (Value Frequency Multiplier) Score across four dimensions: - **High Frequency** (3x): Will this be used daily/weekly? - **Failure Reduction** (3x): Does this prevent errors or data loss? - **User Burden** (2x): Does this reduce manual work significantly? - **Self Cost** (2x): How much maintenance/complexity does this add?
**Threshold:** Must score ≥60 points to proceed.
### ADL Protocol (Architecture Design Ladder) **Priority ordering:** Stability > Explainability > Reusability > Scalability > Novelty
**Forbidden Evolution:** - ❌ Adding complexity to "look smart" - ❌ Unverifiable changes (can't test if it worked) - ❌ Sacrificing stability for novelty
**The Golden Rule:** "Does this let future-me solve more problems with less cost?" If no, skip it.
## Example Workflow
**Day 1:** ``` Human: "Let's build a custom voice assistant to replace Alexa" Agent: *Creates goal, breaks into initial research tasks* ```
**During heartbeat:** ```bash $ python3 scripts/task_manager.py next-task → task_001: Research voice-to-text models (priority: high)
# Agent works on it, completes research $ python3 scripts/task_manager.py complete-task task_001 --notes "..." ```
**Agent messages human:** > "Hey! I finished researching voice models. Whisper.cpp looks perfect for Raspberry Pi - runs locally, good accuracy, low latency. Want me to compare hardware options next?"
**Day 2:** ``` Human: "Yeah, compare Pi 5 vs alternatives" Agent: *Adds task, works on it during next heartbeat* ```
This cycle continues - the agent makes steady autonomous progress while keeping the human in the loop for decisions and updates.
---
Built by Toki for proactive AI partnership 🚀