Introduction
# LLM Router
An intelligent proxy that classifies incoming requests by complexity and routes them to appropriate LLM models. Use cheaper/faster models for simple tasks and reserve expensive models for complex ones.
**Works with [OpenClaw](https://github.com/openclaw/openclaw)** to reduce token usage and API costs by routing simple requests to smaller models.
**Status:** Tested with Anthropic, OpenAI, Google Gemini, Kimi/Moonshot, and Ollama.
## Quick Start
### Prerequisites
1. **Python 3.10+** with pip 2. **Ollama** (optional - only if using local classification) 3. **Anthropic API key** or Claude Code OAuth token (or other provider key)
### Setup
```bash # Clone if not already present git clone https://github.com/alexrudloff/llmrouter.git cd llmrouter
# Create virtual environment (required on modern Python) python3 -m venv venv source venv/bin/activate
# Install dependencies pip install -r requirements.txt
# Pull classifier model (if using local classification) ollama pull qwen2.5:3b
# Copy and customize config cp config.yaml.example config.yaml # Edit config.yaml with your API key and model preferences ```
### Verify Installation
```bash # Start the server source venv/bin/activate python server.py
# In another terminal, test health endpoint curl http://localhost:4001/health # Should return: {"status": "ok", ...} ```
### Start the Server
```bash python server.py ```
Options: - `--port PORT` - Port to listen on (default: 4001) - `--host HOST` - Host to bind (default: 127.0.0.1) - `--config PATH` - Config file path (default: config.yaml) - `--log` - Enable verbose logging - `--openclaw` - Enable OpenClaw compatibility (rewrites model name in system prompt)
## Configuration
Edit `config.yaml` to customize:
### Model Routing
```yaml # Anthropic routing models: super_easy: "anthropic:claude-haiku-4-5-20251001" easy: "anthropic:claude-haiku-4-5-20251001" medium: "anthropic:claude-sonnet-4-20250514" hard: "anthropic:claude-opus-4-20250514" super_hard: "anthropic:claude-opus-4-20250514"
# OpenAI routing models: super_easy: "openai:gpt-4o-mini" easy: "openai:gpt-4o-mini" medium: "openai:gpt-4o" hard: "openai:o3-mini" super_hard: "openai:o3"
# Google Gemini routing models: super_easy: "google:gemini-2.0-flash" easy: "google:gemini-2.0-flash" medium: "google:gemini-2.0-flash" hard: "google:gemini-2.0-flash" super_hard: "google:gemini-2.0-flash" ```
**Note:** Reasoning models are auto-detected and use correct API params.
### Classifier
Three options for classifying request complexity:
**Local (default)** - Free, requires Ollama: ```yaml classifier: provider: "local" model: "qwen2.5:3b" ```
**Anthropic** - Uses Haiku, fast and cheap: ```yaml classifier: provider: "anthropic" model: "claude-haiku-4-5-20251001" ```
**OpenAI** - Uses GPT-4o-mini: ```yaml classifier: provider: "openai" model: "gpt-4o-mini" ```
**Google** - Uses Gemini: ```yaml classifier: provider: "google" model: "gemini-2.0-flash" ```
**Kimi** - Uses Moonshot: ```yaml classifier: provider: "kimi" model: "moonshot-v1-8k" ```
Use remote (anthropic/openai/google/kimi) if your machine can't run local models.
### Supported Providers
- `anthropic:claude-*` - Anthropic Claude models (tested) - `openai:gpt-*`, `openai:o1-*`, `openai:o3-*` - OpenAI models (tested) - `google:gemini-*` - Google Gemini models (tested) - `kimi:kimi-k2.5`, `kimi:moonshot-*` - Kimi/Moonshot models (tested) - `local:model-name` - Local Ollama models (tested)
## Complexity Levels
| Level | Use Case | Default Model | |-------|----------|---------------| | super_easy | Greetings, acknowledgments | Haiku | | easy | Simple Q&A, reminders | Haiku | | medium | Coding, emails, research | Sonnet | | hard | Complex reasoning, debugging | Opus | | super_hard | System architecture, proofs | Opus |
## Customizing Classification
Edit `ROUTES.md` to tune how messages are classified. The classifier reads the table in this file to determine complexity levels.
## API Usage
The router exposes an OpenAI-compatible API:
```bash curl http://localhost:4001/v1/chat/completions \ -H "Authorization: Bearer $ANTHROPIC_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "llm-router", "messages": [{"role": "user", "content": "Hello!"}] }' ```
## Testing Classification
```bash python classifier.py "Write a Python sort function" # Output: medium
python classifier.py --test # Runs test suite ```
## Running as macOS Service
Create `~/Library/LaunchAgents/com.llmrouter.plist`:
```xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>Label</key> <string>com.llmrouter</string> <key>ProgramArguments</key> <array> <string>/path/to/llmrouter/venv/bin/python</string> <string>/path/to/llmrouter/server.py</string> <string>--openclaw</string> </array> <key>RunAtLoad</key> <true/> <key>KeepAlive</key> <true/> <key>WorkingDirectory</key> <string>/path/to/llmrouter</string> <key>StandardOutPath</key> <string>/path/to/llmrouter/logs/stdout.log</string> <key>StandardErrorPath</key> <string>/path/to/llmrouter/logs/stderr.log</string> </dict> </plist> ```
**Important:** Replace `/path/to/llmrouter` with your actual install path. Must use the venv python, not system python.
```bash # Create logs directory mkdir -p ~/path/to/llmrouter/logs
# Load the service launchctl load ~/Library/LaunchAgents/com.llmrouter.plist
# Verify it's running curl http://localhost:4001/health
# To stop/restart launchctl unload ~/Library/LaunchAgents/com.llmrouter.plist launchctl load ~/Library/LaunchAgents/com.llmrouter.plist ```
## OpenClaw Configuration
Add the router as a provider in `~/.openclaw/openclaw.json`:
```json { "models": { "providers": { "localrouter": { "baseUrl": "http://localhost:4001/v1", "apiKey": "via-router", "api": "openai-completions", "models": [ { "id": "llm-router", "name": "LLM Router (Auto-routes by complexity)", "reasoning": false, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 200000, "maxTokens": 8192 } ] } } } } ```
**Note:** Cost is set to 0 because actual costs depend on which model the router selects. The router logs which model handled each request.
### Set as Default Model (Optional)
To use the router for all agents by default, add:
```json { "agents": { "defaults": { "model": { "primary": "localrouter/llm-router" } } } } ```
### Using with OAuth Tokens
If your `config.yaml` uses an Anthropic OAuth token from OpenClaw's `~/.openclaw/auth-profiles.json`, the router automatically handles Claude Code identity headers.
### OpenClaw Compatibility Mode (Required)
**If using with OpenClaw, you MUST start the server with `--openclaw`:**
```bash python server.py --openclaw ```
This flag enables compatibility features required for OpenClaw: - Rewrites model names in responses so OpenClaw shows the actual model being used - Handles tool name and ID remapping for proper tool call routing
Without this flag, you may encounter errors when using the router with OpenClaw.
## Common Tasks
- **Check server status**: `curl http://localhost:4001/health` - **View current config**: `cat config.yaml` - **Test a classification**: `python classifier.py "your message"` - **Run classification tests**: `python classifier.py --test` - **Restart server**: Stop and run `python server.py` again - **View logs** (if running as service): `tail -f logs/stdout.log`
## Troubleshooting
### "externally-managed-environment" error Python 3.11+ requires virtual environments. Create one: ```bash python3 -m venv venv source venv/bin/activate pip install -r requirements.txt ```
### "Connection refused" on port 4001 Server isn't running. Start it: ```bash source venv/bin/activate && python server.py ```
### Classification returns wrong complexity Edit `ROUTES.md` to tune classification rules. The classifier reads this file to determine complexity levels.
### Ollama errors / "model not found" Ensure Ollama is running and the model is pulled: ```bash ollama serve # Start Ollama if not running ollama pull qwen2.5:3b ```
### OAuth token not working Ensure your token in `config.yaml` starts with `sk-ant-oat`. The router auto-detects OAuth tokens and adds required identity headers.
### LaunchAgent not starting Check logs and ensure paths are absolute: ```bash cat ~/Library/LaunchAgents/com.llmrouter.plist # Verify paths cat /path/to/llmrouter/logs/stderr.log # Check for errors ```