February 13, 2026

Reduce Your OpenClaw Costs by 97%: Token Optimization Guide

The default OpenClaw config prioritizes capability over cost. You’re probably burning through tokens on routine tasks that don’t need expensive models. Here’s how to go from $1,500+/month to under $50.

97%

Cost reduction

5 min

To implement

<$50

Monthly target

If you’ve been running OpenClaw and watching your API bills climb, you’re not alone. This guide covers five optimizations that work together to slash costs: session initialization, model routing, local heartbeat, rate limits, and prompt caching.

1. Session Initialization

The Problem

Your agent loads 50KB of history on every message. This wastes 2–3M tokens per session and costs $4/day. If you’re using third-party messaging interfaces without built-in session clearing, this problem compounds fast.

The fix: add a session initialization rule to your agent’s instructions. Tell it exactly what to load — and what NOT to load — at session start.

# Session Initialization Rule (add to AGENTS.md)

On every session start:
1. Load ONLY these files:
   - SOUL.md
   - USER.md
   - IDENTITY.md
   - memory/YYYY-MM-DD.md (if it exists)

2. DO NOT auto-load:
   - MEMORY.md
   - Session history
   - Prior messages
   - Previous tool outputs

3. When user asks about prior context:
   - Use memory_search() on demand
   - Pull only the relevant snippet
   - Don't load the whole file

4. Update memory/YYYY-MM-DD.md at end of session:
   - What you worked on
   - Decisions made
   - Blockers and next steps

Before

×50KB context on startup
×2–3M tokens wasted per session
×$0.40 per session
×History bloat over time

After

+8KB context on startup
+Loads only what's needed
+$0.05 per session
+Clean daily memory files

2. Model Routing

Out of the box, OpenClaw defaults to Claude Sonnet for everything. While Sonnet is excellent, it’s overkill for checking file status, running simple commands, or routine monitoring. Haiku handles these perfectly at a fraction of the cost. For a full breakdown of which models to use when, see our model selection guide.

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-haiku-4-5"
      }
    }
  }
}

Then add routing rules to your agent instructions:

# Model Selection Rule (add to AGENTS.md)

Default: Always use Haiku
Switch to Sonnet ONLY when:
- Architecture decisions
- Production code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions

When in doubt: try Haiku first.

Metric	Before (Sonnet)	After (Haiku default)
Cost per 1K tokens	$0.003	$0.00025
Monthly model cost	$50–$70	$5–$10
Routine task speed	Overkill	Right-sized

3. Route Heartbeat to Haiku

OpenClaw sends periodic heartbeat checks to verify your agent is alive. By default, these use your primary model — which adds up fast when running 24/7. Route them to Haiku 4.5, the cheapest Claude model, and reduce the frequency to once per hour.

// Add to ~/.openclaw/openclaw.json
{
  "heartbeat": {
    "every": "1h",
    "model": "anthropic/claude-haiku-4-5",
    "session": "main",
    "prompt": "Check: Any blockers or progress updates?"
  }
}

Result

Before: 1,440 Sonnet API calls/day, $5–$15/month just for heartbeats. After: 24 Haiku calls/day, under $0.50/month. Haiku 4.5 is 12x cheaper than Sonnet and handles heartbeat context perfectly.

4. Rate Limits & Budget Controls

Even with model routing and optimized sessions, runaway automation can burn through tokens. These guardrails prevent accidental cost explosions.

# Rate Limits (add to AGENTS.md)

RATE LIMITS:
- 5 seconds minimum between API calls
- 10 seconds between web searches
- Max 5 searches per batch, then 2-minute break
- Batch similar work (one request for 10 leads,
  not 10 requests for 1 lead each)
- If you hit 429 error: STOP, wait 5 minutes, retry

DAILY BUDGET: $5 (warning at 75%)
MONTHLY BUDGET: $200 (warning at 75%)

Limit	What It Prevents
5s between API calls	Rapid-fire token burn
10s between searches	Expensive search loops
5 searches max, then break	Runaway research tasks
Batch similar work	10 calls when 1 would do
Budget warnings at 75%	Surprise bills

5. Prompt Caching

Your system prompt, workspace files, and reference materials get sent to the API with every single message. Prompt caching (available on Claude 3.5 Sonnet and newer) charges only 10% for cached tokens on re-use. For static content you send repeatedly, this cuts costs by 90%.

How It Works

First request: full price. Claude stores it in cache. Subsequent requests within 5 minutes: 90% discount.

A 5KB system prompt costs ~$0.015 on first use, then $0.0015 on each reuse. Over 100 calls/week, you save ~$1.30/week just on system prompts.

What to cache vs. not

Cache These (stable)

+System prompts
+SOUL.md / USER.md
+Reference materials, docs, specs
+Tool documentation
+Project templates

Don’t Cache (dynamic)

×Daily memory files
×Recent user messages
×Tool outputs
×Frequently updated notes

// Enable caching in ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "cache": {
        "enabled": true,
        "ttl": "5m",
        "priority": "high"
      }
    }
  }
}

Maximize cache hits

+Batch requests: Make multiple API calls within 5-minute windows to reuse cached prompts
+Keep prompts stable: Don't update SOUL.md mid-session — changes invalidate cache
+Separate stable from dynamic: Reference docs (cached) vs. daily notes (uncached) in different files
+Target >80% hit rate: Monitor with session_status and adjust if cache misses are high

Example: 50 outreach drafts/week	Without cache	With cache
System prompt cost	$0.75/week	$0.02/week
Draft generation	$1.20/week	$0.60/week
Monthly total	$102	$32

Combined Impact

Each optimization targets a different cost driver. Together, they compound:

Optimization	Before	After	With Cache
Session init	$0.40	$0.05	$0.005
Model routing	$0.05	$0.02	$0.002
Heartbeat	$0.02	$0	$0
Rate limits	—	$0	$0
Prompt caching	—	—	−$0.015

$2–$3

Daily before

$0.10

Daily after

$3–$5

Monthly after

Verifying Your Setup

# Start a session and check status
openclaw shell
session_status

# You should see:
# - Context size: 2-8KB (not 50KB+)
# - Model: Haiku (not Sonnet)
# - Heartbeat: Haiku (not Sonnet)
# - Cache hit rate: >80%

Troubleshooting

+Context still large: Check session initialization rules are in your AGENTS.md
+Still using Sonnet: Verify openclaw.json syntax and model.primary path
+Heartbeat errors: Verify heartbeat.model is set to anthropic/claude-haiku-4-5 in openclaw.json
+Costs haven't dropped: Check that your system prompt is actually being loaded

The Bottom Line

Five optimizations. Five minutes each. Combined result: from $1,500+/month to under $50. That’s money you can reinvest in actually building things.

These savings are especially impactful for high-volume workflows like automated cold outreach or AI-generated content at scale.

On founders.sh, we apply these optimizations by default on every agent container we deploy. You get model routing, session management, and cost controls out of the box — without touching a config file.

Want optimized agents without the setup?

founders.sh agents come pre-configured with cost optimization, model routing, and smart session management — running in 60 seconds.

Get Started

Keep Reading

Best Models to Run for OpenClaw in 2026

Anthropic, OpenAI, Google, open-source — trade-offs around cost, capability, and privacy for every provider.

How to Book 60+ Calls/Month With AI Outreach Agents

The complete multi-channel system for automating cold outreach across Twitter, LinkedIn, and email.