February 11, 2026

Best Models to Run for OpenClaw in 2026

Everyone asks which model to use. The honest answer: it depends on what you’re doing and how much you want to spend.

OpenClaw supports a dozen providers: Anthropic, OpenAI, Google, and open-source models through routers like OpenRouter and haimaker.ai. Each has trade-offs around cost, capability, and where your data ends up.

At founders.sh, OpenClaw is the engine behind every agent team we deploy. We’ve tested every major model across thousands of real agent tasks — research, outreach, content, ops — and have strong opinions about what works. Here they are.

Price, Capability, Privacy

These three things compete with each other. You can optimize for two, maybe, but rarely all three.

Price

Token pricing varies wildly. Claude Opus 4.5 costs $15 / $75 per million tokens (input / output). GPT-4.1 mini charges $0.40 / $1.60. That’s a 47x difference on output alone. For an OpenClaw agent running 24/7 across Telegram and web, those numbers compound fast.

For most assistant-level tasks, Claude Sonnet 4.5 at $3 / $15 gives you most of Opus’s capability at a fraction of the cost. That’s our default on founders.sh. Combine model choice with token optimization techniques to cut costs even further.
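To see how those prices compound, here is a rough back-of-the-envelope estimator. It uses the per-million-token prices quoted in this article (check current provider pricing before relying on them), and the daily token volumes are a made-up workload, not a measurement from founders.sh.

```python
# Prices in USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "claude-opus-4.5": (15.00, 75.00),
    "claude-sonnet-4.5": (3.00, 15.00),
    "gpt-4.1-mini": (0.40, 1.60),
}


def monthly_cost(model, input_tokens_per_day, output_tokens_per_day, days=30):
    """Estimated spend in USD over `days` for a steady daily token volume."""
    price_in, price_out = PRICES[model]
    daily = (input_tokens_per_day * price_in
             + output_tokens_per_day * price_out) / 1_000_000
    return daily * days


# Hypothetical workload: 2M input and 500k output tokens per day.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 2_000_000, 500_000):,.2f}/month")
```

Swap in your own daily volumes to get a first-order estimate. It ignores caching discounts and retries, so treat the result as a ceiling-ish guess rather than a bill.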

Capability

For OpenClaw specifically, what actually matters:

  • Tool calling: can it invoke shell commands, APIs, and file operations without fumbling the syntax?
  • Context tracking: does it remember what you said 50 messages ago in a Telegram thread?
  • Instruction following: does it stick to the agent's role or go rogue?
  • Speed: how long before the first token shows up in your chat?

Benchmarks lie. A model can score 95% on MMLU and still choke on a three-step shell command. We care about how it performs inside OpenClaw, not on a leaderboard.
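One way to measure tool calling on your own tasks instead of a leaderboard is to score how often a model's raw output is a well-formed call. The sketch below assumes a simple `{"name": ..., "arguments": {...}}` shape and a hypothetical `TOOLS` registry. Neither is OpenClaw's actual wire format; they only illustrate the check.

```python
import json

# Hypothetical tool registry: tool name -> required argument names.
TOOLS = {"run_shell": ["command"], "read_file": ["path"]}


def is_valid_tool_call(raw, tools=TOOLS):
    """True if `raw` is a JSON object naming a known tool with all required args."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict):
        return False
    name, args = call.get("name"), call.get("arguments")
    if not isinstance(name, str) or name not in tools or not isinstance(args, dict):
        return False
    return all(key in args for key in tools[name])


def success_rate(raw_outputs, tools=TOOLS):
    """Fraction of outputs that are valid tool calls (0.0 for an empty list)."""
    if not raw_outputs:
        return 0.0
    return sum(is_valid_tool_call(r, tools) for r in raw_outputs) / len(raw_outputs)


samples = [
    '{"name": "run_shell", "arguments": {"command": "ls -la"}}',  # valid
    '{"name": "run_shell", "arguments": {"cmd": "ls -la"}}',      # wrong argument name
    '{"name": "read_file", "arguments": {"path": "notes.md"}',    # truncated JSON
    '{"name": "delete_db", "arguments": {}}',                     # unknown tool
]
print(success_rate(samples))
```

Run the same prompts through each candidate model and compare the rates. A syntactically valid call can still do the wrong thing, so this is a floor on quality, not a full evaluation.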

Privacy

Cloud APIs mean your prompts hit external servers. For general business tasks — research, outreach, content — that’s fine. For sensitive financial data, proprietary strategy docs, or anything you wouldn’t email to a stranger, you might want open-source models on your own hardware. OpenClaw supports both through its provider system.

Recommendations by Use Case

Daily Driver

Claude Sonnet 4.5

$3 / $15 per million tokens · Anthropic

This is what we run by default on founders.sh. Sonnet handles research, drafts outreach emails, writes reports, and follows multi-step agent instructions without drifting. It’s fast enough for real-time Telegram chat and smart enough for complex multi-tool workflows.

Most people should start here. It handles most OpenClaw tasks well and won’t run up a scary bill.

// Set as your default in openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5-20250929"
      }
    }
  }
}

Heavy Lifting

Claude Opus 4.5

$15 / $75 per million tokens · Anthropic

The most capable model we’ve tested inside OpenClaw. Opus handles work that Sonnet sometimes struggles with: synthesizing 20-page documents, writing nuanced investor memos, running complex multi-step research chains where each step depends on the last.

The downside is obvious — it’s 5x the price. We recommend it for specific high-value tasks: due diligence, strategy documents, anything where output quality directly impacts a business decision. Don’t use it for routine email drafts.

// Override for a specific agent role
{
  "agents": {
    "defaults": {
      "model": { "primary": "anthropic/claude-sonnet-4-5-20250929" }
    },
    "researcher": {
      "model": { "primary": "anthropic/claude-opus-4-5-20251101" }
    }
  }
}

Budget All-Rounder

GPT-4.1

$2 / $8 per million tokens · OpenAI

Competitive with Sonnet on most tasks and slightly cheaper. Where it falls short inside OpenClaw: tool calling reliability and following complex structured output formats. For straightforward tasks — summarizing articles, writing first drafts, answering questions — it works well.

Cheaper option: GPT-4.1 mini at $0.40 / $1.60 works for simple, high-volume tasks, but quality drops on anything complex.

Long Context

Gemini 2.5 Pro

$1.25 / $10 per million tokens · Google

Gemini’s killer feature is the context window. With 1M+ tokens, it can ingest an entire codebase, a 200-page legal doc, or months of meeting transcripts in a single pass. We use it for document-heavy research tasks where other models need multiple rounds of chunking and summarization.

The trade-off: tool calling and structured output are a step behind Claude inside OpenClaw. Great for analysis, less reliable for automation chains.
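To get a feel for when the bigger window actually saves you chunking, here is a rough sizing sketch. It leans on the common rule of thumb of about 4 characters per token for English text, which is an assumption (real tokenizers vary by model and language). The 128k window, the 8k reserve for instructions and output, and the 3,000 characters per page are illustrative numbers too.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb for English text; an assumption


def estimate_tokens(text):
    """Very rough token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def passes_needed(text, window_tokens, reserve_tokens=8_000):
    """How many chunks the text must be split into to fit the given window.

    `reserve_tokens` is headroom kept free for instructions and the reply.
    """
    usable = window_tokens - reserve_tokens
    if usable <= 0:
        raise ValueError("window is smaller than the reserved headroom")
    return max(1, math.ceil(estimate_tokens(text) / usable))


# Stand-in for a 200-page document at roughly 3,000 characters per page.
document = "x" * (200 * 3_000)

print(estimate_tokens(document))           # estimated size in tokens
print(passes_needed(document, 1_000_000))  # 1M-token window
print(passes_needed(document, 128_000))    # hypothetical 128k-token window
```

When the second number is 1 and the third is not, the long-context model skips the chunk-and-summarize step entirely.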

// Gemini as primary with Claude fallback
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "google/gemini-2.5-pro",
        "fallbacks": ["anthropic/claude-sonnet-4-5-20250929"]
      }
    }
  }
}

Privacy-First

Llama 3.3 70B / Qwen 2.5 72B

~$0.10–$0.50 per million tokens via routers · No per-token fee if self-hosted (you pay for the hardware)

If your data can’t leave your infrastructure, open-source is the play. Llama 3.3 70B is the best general-purpose open model right now. Qwen 2.5 72B edges it out on analytical tasks. Self-host them with Ollama or vLLM to keep prompts in-house. Routers like OpenRouter and haimaker.ai are the cheap way to try them, but your prompts still go to a third party.

Reality check: these models are noticeably weaker at tool calling and multi-step agent workflows inside OpenClaw. They work best for single-turn tasks — summarization, classification, data extraction. For full agent automation, you’ll still want a frontier model in the loop.

// Self-hosted Ollama as a custom provider
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434/v1",
        "api": "openai-completions",
        "models": [
          { "id": "llama3.3:70b", "name": "Llama 3.3 70B" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": { "primary": "ollama/llama3.3:70b" }
    }
  }
}
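A typo in a custom provider ref is an easy mistake to make, so a quick sanity check before restarting can help. The sketch below assumes a `provider/model` ref is resolved by splitting on the first slash and matching the rest against the provider's declared model ids. That resolution rule and the helper name `primary_is_declared` are assumptions for illustration, not OpenClaw's documented behavior.

```python
import json

# The config from this section, minus the // comment so it parses as strict JSON.
CONFIG = json.loads("""
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434/v1",
        "api": "openai-completions",
        "models": [{ "id": "llama3.3:70b", "name": "Llama 3.3 70B" }]
      }
    }
  },
  "agents": { "defaults": { "model": { "primary": "ollama/llama3.3:70b" } } }
}
""")


def primary_is_declared(config):
    """True if the default primary ref names a custom provider and one of its model ids."""
    providers = config.get("models", {}).get("providers", {})
    primary = config["agents"]["defaults"]["model"]["primary"]
    # Split on the first "/" only; the model id keeps any later punctuation.
    provider, _, model_id = primary.partition("/")
    declared = {m["id"] for m in providers.get(provider, {}).get("models", [])}
    return model_id in declared


print(primary_is_declared(CONFIG))
```

This only covers providers you declare yourself. Refs to built-in providers such as `anthropic/...` would need a different check.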

Switching Models in OpenClaw

OpenClaw makes model switching trivial. You can change models in config, from the CLI, or mid-conversation.

In your config (~/.openclaw/openclaw.json)

Set agents.defaults.model.primary to any provider/model ref. Add fallbacks so OpenClaw automatically retries on rate limits or outages:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5-20250929",
        "fallbacks": [
          "openai/gpt-4.1",
          "google/gemini-2.5-pro"
        ]
      }
    }
  }
}
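Conceptually, a fallback chain is just "try each model in order until one answers". The sketch below illustrates that idea with a fake transport. `ProviderUnavailable`, `call_with_fallbacks`, and `fake_send` are hypothetical names, and this is not how OpenClaw implements it internally.

```python
class ProviderUnavailable(Exception):
    """Stand-in for a rate-limit or outage error from a provider."""


def call_with_fallbacks(prompt, model_refs, send):
    """Try each model ref in order; return (ref, reply) from the first that succeeds."""
    failures = []
    for ref in model_refs:
        try:
            return ref, send(ref, prompt)
        except ProviderUnavailable as exc:
            failures.append(f"{ref}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(failures))


def fake_send(ref, prompt):
    """Hypothetical transport that simulates an Anthropic outage."""
    if ref.startswith("anthropic/"):
        raise ProviderUnavailable("rate limited")
    return f"[{ref}] reply to: {prompt}"


chain = [
    "anthropic/claude-sonnet-4-5-20250929",  # primary
    "openai/gpt-4.1",                        # first fallback
    "google/gemini-2.5-pro",                 # second fallback
]
used, reply = call_with_fallbacks("Summarize this thread", chain, fake_send)
print(used)
```

Order matters: put the model whose behavior is closest to your primary first, so a failover changes output quality as little as possible.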

From the CLI

# Set default model
openclaw models set anthropic/claude-sonnet-4-5-20250929

# List all available models
openclaw models list --all

# Add a fallback
openclaw models fallbacks add openai/gpt-4.1

# Check current config
openclaw models status

Mid-conversation

Use the /model command in any OpenClaw chat — Telegram, web, wherever:

/model                          # opens picker
/model opus                     # switch to Opus
/model openai/gpt-4.1           # switch to GPT-4.1
/model ollama/llama3.3:70b      # switch to local model

Quick Reference

Model             | Best For                    | Cost (in/out)  | OpenClaw Verdict
Claude Sonnet 4.5 | Default for everything      | $3 / $15       | Start here
Claude Opus 4.5   | Complex research & strategy | $15 / $75      | Worth it for high-value tasks
GPT-4.1           | General tasks, drafts       | $2 / $8        | Solid budget option
GPT-4.1 mini      | High-volume simple work     | $0.40 / $1.60  | Cheap and fast
Gemini 2.5 Pro    | Long documents, analysis    | $1.25 / $10    | Best context window
Llama 3.3 70B     | Privacy-sensitive tasks     | ~$0.10 / $0.50 | Your data stays home

The Bottom Line

There’s no single best model. There’s the right model for what you’re doing.

  • Cheap: GPT-4.1 mini or open-source via OpenRouter
  • Capable: Claude Opus 4.5 or Gemini 2.5 Pro
  • Private: Llama 3.3 70B self-hosted with Ollama
  • Best overall: Claude Sonnet 4.5
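If you want those rules of thumb in code, they boil down to a small decision function. The flag names and the priority order (privacy first, then context size, then task value, then volume) are our assumptions for this sketch; reorder them to match your own constraints.

```python
def pick_model(sensitive=False, long_context=False, high_value=False, high_volume=False):
    """Map coarse task traits to the models recommended in this article."""
    if sensitive:
        return "Llama 3.3 70B (self-hosted)"  # data never leaves your hardware
    if long_context:
        return "Gemini 2.5 Pro"               # largest context window
    if high_value:
        return "Claude Opus 4.5"              # most capable, most expensive
    if high_volume:
        return "GPT-4.1 mini"                 # cheapest per token
    return "Claude Sonnet 4.5"                # default for everything else


print(pick_model())
print(pick_model(high_value=True))
print(pick_model(sensitive=True, high_value=True))
```

Checking privacy first means a sensitive task never gets routed to a cloud model, even when it would also benefit from a stronger one.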

On founders.sh, we default to Claude Sonnet 4.5. It handles most tasks well and won’t run up a scary bill. OpenClaw’s fallback system means if Anthropic has a bad day, your agents automatically switch to the next best option without you noticing.

The model landscape changes fast. Six months ago, the answer was different. Six months from now, it will be again. We test continuously and update our defaults when something meaningfully better comes along. Adjust from there based on what you actually need.
