OpenCode School

Lesson 6

Models

Choose and configure AI models in OpenCode.

The model you choose is probably the single most important factor in how successful your OpenCode sessions will be. State-of-the-art models from Anthropic and OpenAI are highly capable — they write better code, reason through complex problems, and make fewer mistakes. But they cost money per request. Open-source alternatives are faster and cheaper, sometimes free, but they don’t match the top models on difficult tasks. Using OpenCode well means finding the right balance between cost and capability for what you’re trying to do.

An AI model is the engine behind that capability. When you type a message, it’s a model that reads it and figures out how to respond — whether that’s writing code, explaining an error, or running a command.

See which model you’re using

Your current model is shown just below the text input in the Desktop app. When you first install OpenCode, you get access to free models through OpenCode Zen — a curated set of models provided by the OpenCode team that have been tested to work well with OpenCode. These are fine for getting started, but they have limitations — most lack features like vision (the ability to interpret images), and they may be less capable on complex coding tasks.

Change the model

To switch models, click the model name below the text input to open the model picker. You can also type /model to see the full list of available models from all your configured providers.
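Beyond the picker, you can also set a default model in OpenCode's JSON config so every new session starts with it. A minimal sketch, assuming the standard `opencode.json` config file and the `provider/model` ID format used elsewhere in this lesson (key names may differ in your version; check the configuration docs):

```json
{
  "$schema": "https://opencode.ai/config.json",
  "model": "anthropic/claude-sonnet-4"
}
```

Place this in your project root (or your global OpenCode config directory) and the picker will show the configured model by default.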

Built-in models

OpenCode Zen includes several free models out of the box. At the time of writing, these include:

  • Big Pickle — A stealth model optimized for coding agents. Currently free while the OpenCode team collects feedback.
  • GPT-5 Nano — OpenAI’s smallest and fastest model. Good for quick tasks, but less capable on complex reasoning.
  • Nemotron 3 Super Free — NVIDIA’s open-weight model with a 1M token context window, designed for agentic workflows.
  • MiMo V2 Flash Free — Xiaomi’s open-source model, one of the fastest available. Strong at coding and reasoning.
  • MiniMax M2.5 Free — An open-weight model by MiniMax. Strong at coding and agentic tool use.

The specific models may change over time.

These models cost nothing to use, which makes them great for learning and experimenting. But they come with tradeoffs: most lack vision support, and they tend to be less capable on complex tasks compared to paid options. If you find OpenCode struggling with something, upgrading to a better model is often the fix.

To set up Zen, click the model name below the prompt and then the settings button (in the terminal app, type /connect), select “OpenCode Zen”, and follow the prompts.

OpenCode Go

OpenCode Go is a subscription plan that gives you access to capable open-source models for $10/month ($5 for your first month). It currently includes four models:

  • GLM-5
  • Kimi K2.5
  • MiniMax M2.5
  • MiniMax M2.7

Kimi K2.5 is worth calling out — it has vision support, meaning it can interpret images you paste into the conversation. This is useful for tasks like implementing a design from a screenshot, debugging a UI, or reading text from an image.

To set up Go, click the model name below the prompt and then the settings button (in the terminal app, type /connect), select “OpenCode Go”, and follow the prompts.

Cloudflare

If you already have a Cloudflare account, you can use it as a model provider in OpenCode. One bill, no extra subscriptions.

AI Gateway is a proxy that sits between OpenCode and providers like OpenAI, Anthropic, Google AI Studio, and others. With Unified Billing, you load credits into your Cloudflare account and access models from multiple providers without needing separate API keys for each one. You get a single Cloudflare bill instead of managing accounts with every provider individually.

Cloudflare also offers Workers AI, which runs open-source models directly on Cloudflare’s network. It’s more limited than AI Gateway (only the models Cloudflare hosts), but it’s there if you want it.

To set up either provider, see the Cloudflare AI Gateway and Cloudflare Workers AI sections of the providers documentation. For a hands-on walkthrough, the Use an AI gateway exercise covers the full setup once you’ve finished the lessons.

Cloudflare in this course

You are connected to Cloudflare’s AI gateway, which gives you access to Claude, GPT, Gemini, and other models at no cost. The default model is Kimi K2.5, which is fast but less capable on complex tasks. Claude Sonnet (anthropic/claude-sonnet-4) is a better choice for this course. Click the model name below the text input to switch.

Gateway auth expires every 30 days. The smart wrapper in your shell handles this automatically when you launch OpenCode from a terminal. If you use Desktop and auth expires, open a terminal and type opencode to re-authenticate, then restart Desktop. Or manually:

opencode auth login https://opencode.cloudflare.dev

Context and context windows

Every model has a context window — the maximum amount of text it can “see” at once during a conversation. This includes your messages, any file contents you’ve shared, tool outputs, and the model’s own responses. It’s measured in tokens (roughly ¾ of a word).

As a conversation grows longer, the context window fills up. When it’s full, one of a few things can happen: the model may start forgetting earlier parts of the conversation, responses may slow down, or OpenCode may trigger compaction — automatically summarizing older parts of the conversation to free up space. This lets the session continue, but the model is now working from a summary rather than the full original context.
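The idea behind compaction can be sketched in a few lines. This is a simplified illustration, not OpenCode's actual implementation (a real agent asks the model itself to write the summary; here a placeholder stands in for it):

```python
# Simplified compaction: when the conversation exceeds the context
# budget, collapse the oldest messages into a single summary entry
# and keep the most recent messages verbatim.
def compact(messages: list[str], budget: int, keep_recent: int = 4) -> list[str]:
    total = sum(len(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages  # still fits, or too short to compact
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    # A real agent would summarize `old` with the model itself.
    summary = f"[summary of {len(old)} earlier messages]"
    return [summary] + recent
```

After compaction the session can continue, but anything that only existed in the summarized messages is now available to the model only in condensed form.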

Context window sizes vary a lot between models:

  • Large flagship models (Claude, GPT-5, Gemini) typically have context windows of 128K tokens or more — some go up to 1 million tokens. At that scale, you can work through long sessions, paste in large files, and run many tool calls without worrying about running out of space.
  • Smaller and cheaper models often have context windows of 8K–32K tokens. That’s enough for focused tasks, but it fills up quickly if you’re pasting in big files or having a long back-and-forth.
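The "roughly ¾ of a word" rule of thumb gives a quick way to gauge how much of a window a file will consume. A rough heuristic only; real tokenizers vary by model:

```python
# Rough token estimate: ~0.75 words per token, so tokens ~= words / 0.75.
# Heuristic only; actual counts depend on the model's tokenizer.
def estimate_tokens(text: str) -> int:
    words = len(text.split())
    return int(words / 0.75)

# A 6,000-word file is roughly 8,000 tokens, enough to fill a small
# 8K-token context window on its own.
print(estimate_tokens("word " * 6000))
```

By this estimate, pasting a single long file into a session on an 8K-token model can use up the entire window before the conversation even starts.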

What this means in practice

With a big flagship model, you often don’t need to think about context at all. The window is large enough that typical sessions — even complex ones — don’t come close to filling it.

With a smaller model, context management matters more. A few tips:

  • Keep sessions focused. One task per session is easier on the context window than trying to do everything in one long conversation.
  • Start fresh for new tasks. If you’ve finished one thing and want to move on to something unrelated, start a new session rather than continuing the old one.
  • Watch for signs of drift. If the model starts giving responses that seem to ignore earlier instructions or context, the window may be getting full. Starting a fresh session usually helps.

Other providers

OpenCode supports 75+ model providers. If you already use Anthropic, OpenAI, Google, or another provider, you can connect it to OpenCode. Click the model name below the prompt and then the settings button (in the terminal app, type /connect) and select your provider from the list.

The latest flagship models from Anthropic and OpenAI are generally the most capable options available. If you can afford to use them, you’ll get better results.

For full setup details, see the providers documentation.