# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull the recommended general-purpose model
ollama pull qwen3.5:27b

# Pull the recommended reasoning model
ollama pull deepseek-r1:32b
# Run with 64K context (minimum for OpenClaw).
# `ollama run` has no --num-ctx flag; set the context window on the
# server with the OLLAMA_CONTEXT_LENGTH environment variable instead:
OLLAMA_CONTEXT_LENGTH=65536 ollama serve
# then, from another terminal:
ollama run qwen3.5:27b
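To make the 64K context stick without setting it on every launch, one option is to bake it into a derived model with a Modelfile. A minimal sketch (the `qwen3.5-64k` name is just an example, not an official tag):

```
FROM qwen3.5:27b
PARAMETER num_ctx 65536
```

Register and run it with `ollama create qwen3.5-64k -f Modelfile` followed by `ollama run qwen3.5-64k`; OpenClaw can then reference `qwen3.5-64k` as the model name.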
{
  "model": "qwen3.5:27b",
  "provider": "ollama",
  "baseUrl": "http://localhost:11434/v1"
}
- Qwen3.5:27b is the best all-round local model for OpenClaw: 256K context, strong agentic performance, and it fits in 24GB of VRAM with Q4 quantization.
- DeepSeek-R1-Distill-32B delivers the best local reasoning performance, outperforming OpenAI o1-mini on multiple benchmarks.
- Llama 4 Scout (17B active, 16 experts) offers a 10M context window and beats Gemma 3 and Gemini 2.0 Flash-Lite on broad benchmarks.
- Gemma 4 from Google is the newest entrant (April 2026), optimized for running on devices from phones to workstations.
- Hardware matters more than model choice: set Ollama to at least 64K context for OpenClaw, which makes Q4_K_M quantization the practical default for most operators.
## Which Model Should You Pick?

- General-purpose agent work: Start with qwen3.5:27b. It has the best balance of capability, context window, and hardware requirements across the family.
- Reasoning-heavy tasks: Use deepseek-r1:32b. Nothing else in the open-source local tier matches its math and logic performance.
- Coding agents: Use codestral for focused code generation, or qwen3-coder:30b if you need broader agentic capabilities alongside code.
- Budget hardware (8-16GB): Start with qwen3.5:9b or phi-4. Expect reduced capability compared to 27B+ models, but both are functional for lighter workflows.
- Maximum local quality: If you have 48GB+ VRAM, deepseek-r1:70b or the full Llama 4 Scout gives you the closest experience to cloud API quality.

## Limitations and Tradeoffs

- Quality gap: Even the best open-source models trail frontier proprietary models on complex agentic tasks. Claude Opus 4.6 scores ~80% on SWE-bench Verified; the best open-source model (GLM-5) scores ~78%. For simpler tasks, the gap is much smaller.
- Context vs VRAM tradeoff: Running 64K+ context locally requires serious hardware. An 8B model at 128K context can consume 20GB+ of VRAM just for the KV cache, leaving little room for the model weights themselves.
- No guaranteed uptime: Local models depend on your hardware staying on and healthy. Cloud APIs offer reliability guarantees that local setups cannot match.
- Update lag: Open-source models update less frequently than hosted APIs. When DeepSeek or Qwen release a new version, Ollama support may lag by days or weeks.
- Quantization quality loss: Q4_K_M quantization typically loses less than 3% quality compared to full precision, but on edge cases and complex reasoning chains the degradation can be more noticeable.
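The KV-cache figure in the context-vs-VRAM bullet can be sanity-checked with back-of-the-envelope arithmetic. A sketch assuming a typical 8B architecture (32 layers, 8 KV heads via grouped-query attention, head dimension 128, fp16 cache); these numbers are illustrative assumptions, not measurements of any specific model:

```shell
# KV cache size = 2 (K and V) * layers * kv_heads * head_dim * context * bytes/element
layers=32
kv_heads=8
head_dim=128
ctx=131072        # 128K context
bytes_per_elem=2  # fp16
kv_bytes=$(( 2 * layers * kv_heads * head_dim * ctx * bytes_per_elem ))
echo "KV cache: $(( kv_bytes / 1024 / 1024 / 1024 )) GiB"   # prints "KV cache: 16 GiB"
```

That is roughly 17 GB in decimal units for a GQA model; an older architecture with full multi-head attention (32 KV heads) would need 4x as much, so the bullet's 20GB+ figure is plausible once model weights' runtime overhead is included.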