Route your boring tasks to a cheap model (and save 50% on API costs)

I watched someone spend $300 last month running Claude Opus for everything. Email summaries, file organization, status updates: all of it hitting a model that costs $15 per million input tokens.

Meanwhile, another OpenClaw user cut their bill from $240 to $60 using model routing. Same agent, same tasks, 75% cost reduction.

The difference? The expensive setup treated every task like brain surgery. The cheap setup recognized that most agent work is actually pretty boring.

Here's the pattern that works:

Route by task complexity, not by default. Your agent spends roughly 80% of its time on routine work that Gemini Flash ($0.075 per 1M input tokens) handles well. Save the expensive models for decisions that actually matter.

Here's a real config from someone who dropped their monthly bill by half:

model_routing:
  - condition: "task_type == 'file_organization'"
    model: "gemini-1.5-flash"
  - condition: "task_type == 'email_summary'"
    model: "gemini-1.5-flash"
  - condition: "task_type == 'status_update'"
    model: "gemini-1.5-flash"
  - condition: "complexity_score > 0.7"
    model: "claude-3-5-sonnet"
  - condition: "requires_reasoning == true"
    model: "claude-3-5-sonnet"
  - default: "gemini-1.5-flash"

The magic happens in that complexity scoring. OpenClaw can evaluate incoming tasks and route them appropriately. A Slack notification about a new file? Flash handles it for pennies. A customer escalation requiring nuanced judgment? That goes to Sonnet.
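To make the routing logic concrete, here's a minimal Python sketch of a first-match router that mirrors the config above. It is not OpenClaw's actual implementation: the Task class, the RULES list, and pick_model are hypothetical names, and the complexity score is assumed to arrive already computed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    # Hypothetical task shape; field names copy the config's conditions.
    task_type: str
    complexity_score: float = 0.0
    requires_reasoning: bool = False


CHEAP = "gemini-1.5-flash"
STRONG = "claude-3-5-sonnet"
ROUTINE_TYPES = {"file_organization", "email_summary", "status_update"}

# Rules are checked top to bottom, and the first match wins
# (an assumption about how the config is evaluated).
RULES: list[tuple[Callable[[Task], bool], str]] = [
    (lambda t: t.task_type in ROUTINE_TYPES, CHEAP),
    (lambda t: t.complexity_score > 0.7, STRONG),
    (lambda t: t.requires_reasoning, STRONG),
]


def pick_model(task: Task) -> str:
    """Return the model name for the first rule that matches the task."""
    for matches, model in RULES:
        if matches(task):
            return model
    return CHEAP  # same as the config's default


print(pick_model(Task("email_summary")))
print(pick_model(Task("customer_escalation", complexity_score=0.9)))
```

One thing to watch under first-match ordering: an email summary with a high complexity score still goes to Flash, because the task-type rule fires first. If you'd rather have complexity win, move those rules to the top.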

Pro tip: Add a fallback chain. If Flash can't handle something (returns "I need more capability" or similar), automatically retry with Sonnet. This catches the edge cases without manual intervention.
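A fallback chain can be sketched in a few lines of Python. Everything here is an assumption for illustration: call_model stands in for whatever function actually sends a prompt to a provider, and the refusal phrases are placeholders you'd tune to what your cheap model really says when it gives up.

```python
from typing import Callable, Sequence

# Hypothetical signature: (model_name, prompt) -> reply text.
ModelCall = Callable[[str, str], str]

# Placeholder phrases; adjust to your cheap model's real failure replies.
REFUSAL_MARKERS = ("i need more capability", "i can't help with", "i am unable to")


def looks_like_refusal(reply: str) -> bool:
    """Treat empty replies and known give-up phrases as failures."""
    text = reply.strip().lower()
    return not text or any(marker in text for marker in REFUSAL_MARKERS)


def run_with_fallback(
    prompt: str,
    call_model: ModelCall,
    chain: Sequence[str] = ("gemini-1.5-flash", "claude-3-5-sonnet"),
) -> tuple[str, str]:
    """Try each model in order; return (model, reply) for the first usable reply."""
    last_error = None
    for model in chain:
        try:
            reply = call_model(model, prompt)
        except Exception as exc:  # provider error: move on to the next model
            last_error = exc
            continue
        if not looks_like_refusal(reply):
            return model, reply
    raise RuntimeError(f"no model in {tuple(chain)} produced a usable reply") from last_error
```

String matching on refusals is crude. It will miss a confident but wrong answer, so it catches "I give up" cases, not quality problems.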

The results are immediate. One user shared their before/after:

Before: 100% Claude Opus, $280/month
After: 85% Gemini Flash + 15% Sonnet, $75/month
Quality difference: None for routine tasks, same excellence for complex work
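If you want to check numbers like these against your own invoices, the arithmetic is one line. This small sketch (percent_saved is a hypothetical helper name) runs it on the two before/after pairs quoted in this post:

```python
def percent_saved(before: float, after: float) -> float:
    """Percentage reduction from a 'before' bill to an 'after' bill."""
    return round((before - after) / before * 100, 1)


print(percent_saved(240, 60))   # 75.0
print(percent_saved(280, 75))   # 73.2
```

Both examples land well above the 50% in the headline, so treat 50% as the conservative target.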

You can get even more aggressive. Run truly simple tasks (like "check if this file exists") on local models via Ollama. That drops your per-task cost to essentially zero for the most basic operations.
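For the local tier, here's a rough sketch of calling Ollama's HTTP API from Python using only the standard library. It assumes Ollama is running on its default port (11434) and that you've already pulled a model. The model name llama3.2 and the helper names are placeholders, not anything OpenClaw prescribes.

```python
import json
import urllib.request

# Default local Ollama endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "llama3.2", timeout: float = 30.0) -> str:
    """Send the prompt to the local model and return its text reply."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling ask_local("Reply with one word: is 'report.pdf' a PDF filename?") only works with the server running. If the call fails, that's a natural place to hand the task to the next tier up.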

The key insight: Your agent doesn't need to be Einstein for every single task. It needs to be Einstein when the situation calls for it, and a competent assistant the rest of the time.

Most people optimize their agent's capabilities. Smart operators optimize their agent's economics. Model routing is how you do both — maximum capability when needed, minimum cost everywhere else.

This isn't about cutting corners. It's about matching the right tool to the job. Your morning briefing doesn't need the same horsepower as strategic planning. Route accordingly, and watch your bills drop while your agent keeps performing.
