14+ AI providers.
Your rules.

OpenClaw works with every major AI provider. Bring your own API key — use Anthropic, OpenAI, Google, Groq, Mistral, and 9+ more. Switch providers anytime without losing your data or history.

14+
AI providers
50+
Curated models
BYOK
Bring your own key

Why bring your own API key?

Most AI assistant platforms bundle a specific model — you use what they give you. OpenClaw takes a different approach: you connect directly to the provider of your choice. Pay provider prices, own your data, and switch models as the AI landscape evolves without re-learning a new tool.

Pay provider prices — no AI markup
Switch models without losing history
Use any model released tomorrow
Your usage data stays with your provider
No vendor lock-in to one AI company
Use free tiers from Cloudflare and Hugging Face
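To illustrate why BYOK keeps you portable (this is a sketch, not OpenClaw's actual internals): many providers expose OpenAI-compatible chat endpoints, so switching often amounts to a different base URL plus your own key. The base URLs below are the providers' publicly documented OpenAI-compatible endpoints; the model ID and key are placeholders.

```python
# Illustrative only: switching providers is often just a different
# base URL plus your own API key, with the same request shape.
PROVIDERS = {
    "openai":     "https://api.openai.com/v1",
    "groq":       "https://api.groq.com/openai/v1",
    "mistral":    "https://api.mistral.ai/v1",
    "openrouter": "https://openrouter.ai/api/v1",
}

def chat_request(provider: str, api_key: str, model: str, prompt: str) -> dict:
    """Build (not send) an OpenAI-style chat completion request."""
    return {
        "url": f"{PROVIDERS[provider]}/chat/completions",
        "headers": {
            # Your key, billed directly at provider prices
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Same request shape, different provider: no history or tooling changes needed.
req = chat_request("groq", "YOUR_GROQ_KEY", "llama-3.3-70b-versatile", "Hello")
```

Because the request shape stays constant, your conversation history and tool definitions carry over unchanged when you swap the key and base URL.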

Flagship Providers

The most capable options for autonomous agents — top benchmark scores, best tool use reliability.

Anthropic (Claude)

Most trusted for complex agentic tasks

Claude Sonnet and Opus consistently top agentic benchmarks. Chosen by OpenClaw's own development team for multi-step autonomous workflows.

Recommended: Claude Sonnet 4.6

OpenAI

Industry standard — rock-solid tool calling

GPT-4o delivers best-in-class function calling with near-perfect reliability. The default choice for users who want zero surprises.

Recommended: GPT-4o

Google (Gemini)

#1 on 16 agentic benchmarks, 2M context

Gemini 3.1 Pro scored 77% on ARC-AGI-2 — roughly double the reasoning performance of its predecessors. Gemini 3 Flash can handle 100 tools simultaneously.

Recommended: Gemini 3.1 Pro

xAI (Grok)

Grok 4.1 — 2M context for long tool chains

Grok's 2M context window makes it exceptional for long-running autonomous tasks that chain dozens of tool calls across hours of conversation.

Recommended: Grok 4.1

Speed & Scale

Low-latency inference for agents that need to respond in real-time or handle high request volume.

Groq

Fastest inference available — 90%+ BFCL score

Groq's LPU hardware runs Llama 3.3 70B at sub-100ms latency. If your agent needs to respond in real-time, Groq is hard to beat.

Recommended: Llama 3.3 70B

OpenRouter

One API key — access to 100+ models

The smart choice: OpenRouter Auto selects the optimal model per prompt at runtime. Great if you want flexibility without managing multiple keys.

Recommended: Auto (Dynamic Selection)

Open-Source Models

Access the best open-source models through managed inference — no GPU required.

Together AI

Fast open-source inference — Llama, DeepSeek & more

Together AI hosts leading open-source models, including Llama 3.3 70B Turbo and DeepSeek R1 (79.8% on SWE-bench). Excellent cost-per-token.

Recommended: Llama 3.3 70B Turbo

Hugging Face

Any open-source model, serverless or dedicated endpoints

Access 500,000+ models through Hugging Face Inference. Run Llama, Qwen, Mistral, and more without managing your own GPU infrastructure.

Recommended: Llama 3.3 70B

Cloudflare Workers AI

Edge inference with a generous free tier

Run Llama 3.3 70B at Cloudflare's global edge with a free tier that's genuinely useful. Zero setup — just your Cloudflare account API token.

Recommended: Llama 3.3 70B

Venice AI

Private inference — zero data logging

Venice routes your requests to open-source models with guaranteed no-log privacy. Best for users who want open-source quality with maximum privacy.

Recommended: Llama 3.3 70B

International Models

World-class models from Asia and Europe — often leading on specific benchmarks.

MiniMax

80.2% SWE-bench Verified — SOTA for agentic coding workflows

MiniMax M2.5 topped SWE-bench Verified with 80.2% — making it one of the best models available for complex coding and automation tasks.

Recommended: MiniMax M2.5

Moonshot (Kimi)

Visual agentic intelligence — 100+ parallel sub-agents

Kimi K2.5 handles 200+ tool chains with swarm-like sub-agent execution. Built for complex multi-step workflows that require visual intelligence.

Recommended: Kimi K2.5

Zhipu AI (GLM)

73.8% SWE-bench — top Chinese model for agentic coding

GLM-4.7 scores 73.8% on SWE-bench and 87.4 on tau-bench. Strong multilingual support and excellent for technical workflows.

Recommended: GLM-4.7

Mistral AI

European AI leader — outperforms GPT-4o on function calling

Mistral Large beats GPT-4o on parallel and sequential function calling benchmarks. Built in France, GDPR-native, and fast.

Recommended: Mistral Large

Why tool use reliability matters

OpenClaw is an autonomous agent, not just a chatbot. It uses tools constantly — reading your calendar, sending messages, browsing the web, running scheduled tasks. A model that misunderstands tool semantics even 10% of the time causes real failures: cron jobs that never send, files that never save, reminders that never trigger.

Every provider in OpenClaw has been vetted for tool use reliability. Models that failed our internal tests (DeepSeek Reasoner, Gemini 2.5 Flash Lite) are excluded regardless of their other benchmark scores. The star ratings reflect Berkeley Function Calling Leaderboard (BFCL) and SWE-bench results, not marketing claims.

Your instance, your provider

Set up in 60 seconds — switch models anytime

Sign up, deploy your instance, paste your API key. That's it. You can change providers or models at any time from your dashboard without losing conversations, memory, or integrations.