How to Choose the Right AI Model for Your Agent

Last updated: April 21, 2026

Under Agent Preferences, your agent already has a model selected. By default it's the Recommended model (Claude 4.6 Sonnet), which handles most tasks well, so you usually don't need to change it. This article explains when switching makes sense and what to pick instead.


Where to find and change your model

In your agent config, the Agent Preferences section shows your currently selected model. Click that to open the model picker and choose a different one.

The model controls how your agent thinks — how fast it responds, how deeply it reasons, and how much text it can handle at once. The picker shows a Speed bar, an Intelligence bar, and a Context window size for each model.

When to change it and what to pick

  • You need faster responses

    Pick Fastest (Gemini 3 Flash) — it's pinned at the top of the picker. Use it for simple, high-volume tasks: classifying content, extracting specific fields, reformatting data. It's noticeably faster but less nuanced than the default.

  • The default isn't reasoning well enough

    Pick a Smartest model — Claude 4.7 Opus is the strongest option and what the Smartest preset selects by default. Other top-tier options include Claude 4.6 Opus, GPT-5.4, Gemini 3.1 Pro, and Grok 4. These all have the highest intelligence rating and are built for complex, multi-step reasoning: detailed analysis, difficult decisions, tasks where the default makes mistakes. They're slower and use more credits per message.

  • You want to reduce credit consumption

    Pick Gemini 3 Flash, Claude 4.5 Haiku, or GPT-5.4 Nano. All three are fast, low-cost models that work well for straightforward tasks. Use them when your agent is handling high volumes or running on a tight credit budget and the task doesn't require deep reasoning.

  • You're getting a context window error

    This error appears when your conversation or document is longer than the model's context window. For example:

    prompt reached 213,009 tokens, which is over the 200,000 token maximum for Claude 4.5 Opus

    To fix it, switch to a model with a larger context window, such as any model showing 1M in the Context column of the picker.

    | Context window | Models to pick |
    |---|---|
    | 1M tokens | Claude 4.7 Opus, Claude 4.6 Opus, Claude 4.6 Sonnet, Claude 4.5 Sonnet, GPT-5.4, Gemini 3.1 Pro, Gemini 3 Flash |
    | 400K tokens | GPT-5.4 Mini, GPT-5.4 Nano, GPT-5.3 Codex, GPT-5.2, GPT-5.2 Codex |
    | 256K tokens | Grok 4, Qwen3.5 397B, Kimi K2.5 |
    | 200K tokens | Claude 4.5 Haiku |
    | 128K tokens | Grok 3, Grok 3 Mini, Perplexity Sonar Pro, Perplexity Sonar Reasoning Pro |

    Tip: For long-running conversations, turn on Auto Summarization under Advanced next to the model selector. It compresses older messages automatically so you're less likely to hit the limit.
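If you know roughly how many tokens your prompt uses (the error message reports it), the context window table above can be turned into a quick lookup. The following is only an illustrative sketch: the function name `models_that_fit` is hypothetical, and the sizes assume that "256K" and "128K" mean round thousands of tokens, which the picker does not confirm.

```python
# Context window tiers as listed in this article.
# Assumption: "K" and "M" are treated as round thousands/millions of tokens.
CONTEXT_TIERS = {
    1_000_000: ["Claude 4.7 Opus", "Claude 4.6 Opus", "Claude 4.6 Sonnet",
                "Claude 4.5 Sonnet", "GPT-5.4", "Gemini 3.1 Pro", "Gemini 3 Flash"],
    400_000: ["GPT-5.4 Mini", "GPT-5.4 Nano", "GPT-5.3 Codex", "GPT-5.2",
              "GPT-5.2 Codex"],
    256_000: ["Grok 4", "Qwen3.5 397B", "Kimi K2.5"],
    200_000: ["Claude 4.5 Haiku"],
    128_000: ["Grok 3", "Grok 3 Mini", "Perplexity Sonar Pro",
              "Perplexity Sonar Reasoning Pro"],
}


def models_that_fit(prompt_tokens: int) -> list[str]:
    """Return models whose context window can hold the prompt, largest window first."""
    fits: list[str] = []
    for limit in sorted(CONTEXT_TIERS, reverse=True):
        if limit >= prompt_tokens:
            fits.extend(CONTEXT_TIERS[limit])
    return fits


if __name__ == "__main__":
    # Token count taken from the example error message above.
    for name in models_that_fit(213_009):
        print(name)
```

For the 213,009-token prompt in the example error, the 200K and 128K models are excluded, while every 256K, 400K, and 1M model remains a candidate. In practice, leave some headroom, because the conversation keeps growing after the first message.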


All available models at a glance

The model picker groups models by provider. Here's every active model, with its context window and what it's best for.

| Provider | Model | Context | Best for |
|---|---|---|---|
| Anthropic | Claude 4.7 Opus | 1M | Strongest reasoning overall. Complex analysis, multi-step decisions. |
| Anthropic | Claude 4.6 Opus | 1M | Near-top intelligence with 1M context. Great all-round expert model. |
| Anthropic | Claude 4.6 Sonnet | 1M | Default (Recommended). Best balance of speed, intelligence, and cost. |
| Anthropic | Claude 4.5 Haiku | 200K | Fast and cheap. Simple classification, reformatting, high-volume tasks. |
| OpenAI | GPT-5.4 | 1M | Top-tier intelligence at scale. Agentic, coding, and professional workflows. |
| OpenAI | GPT-5.4 Mini | 400K | Strong mini model for coding and sub-agents. Good speed/intelligence balance. |
| OpenAI | GPT-5.4 Nano | 400K | Cheapest GPT-5.4-class model. Simple, high-volume tasks on a budget. |
| OpenAI | GPT-5.3 Codex | 400K | Long-horizon agentic coding tasks. Most capable code-focused model. |
| OpenAI | GPT-5.2 | 400K | Strong general-purpose expert model for coding and agentic tasks. |
| OpenAI | GPT-5.2 Codex | 400K | Intelligent coding model for long-horizon code generation and review. |
| Google | Gemini 3.1 Pro | 1M | Expert-tier reasoning with the largest context window. Multimodal tasks. |
| Google | Gemini 3 Flash | 1M | Fastest preset. Extremely fast with 1M context. Best speed-to-cost ratio. |
| xAI | Grok 4 | 256K | Expert-tier intelligence. Strong reasoning alternative to Opus/GPT-5.4. |
| xAI | Grok 3 | 128K | Solid advanced-tier model for general-purpose tasks. |
| xAI | Grok 3 Mini | 128K | Fast and lightweight. Good budget option from xAI. |
| Qwen | Qwen3.5 397B | 256K | Expert-tier. Strong at non-English content, math, and structured reasoning. |
| Moonshot | Kimi K2.5 | 256K | Expert-tier with strong agentic reasoning. Good for long, tool-heavy tasks. |
| Perplexity | Sonar Reasoning Pro | 128K | Web-grounded reasoning with citations. Best when answers need real-time sources. |
| Perplexity | Sonar Pro | 128K | Fast web search with cited answers. Good for quick fact lookups. |
| Perplexity | Sonar Deep Research | 128K | Deep web research with synthesis. 100 credits per call; use for thorough investigations only. |


A few things worth knowing

Your agent can't switch its own model

Once you set a model in Agent Preferences, it stays in effect for the entire conversation. Instructions such as "use a different model for this step" have no effect, because the agent cannot change its own model. To use a different one, change it manually in Agent Preferences.

You can set a fallback model for outages and errors

Fallback is enabled by default. If a model goes down or returns an error, your agent automatically tries the next model in the chain — with no interruption. You can customize the fallback chain under Advanced next to the model selector.
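Conceptually, a fallback chain behaves like the loop below. This is a simplified sketch for illustration, not the platform's actual implementation: `run_with_fallback` and the `call_model` callable are hypothetical names, and the real feature is configured in the UI rather than in code.

```python
from typing import Callable, Sequence


def run_with_fallback(
    prompt: str,
    chain: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order and return (model_used, reply) from the first success.

    `call_model(model, prompt)` is a hypothetical stand-in for whatever sends
    the prompt to a model; it is expected to raise an exception on failure.
    """
    errors: list[str] = []
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # an outage or error moves on to the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models in the fallback chain failed: " + "; ".join(errors))
```

The order of the chain matters: the first entry is always tried first, so put the model you actually want at the front and reserve later entries for models you would accept as substitutes.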

Use a Drive link to share large files with your agent

Claude models have a 25–30 MB limit on the total size of files attached inline to a message. If you're attaching several large PDFs and seeing errors, this is likely why — it's separate from the context window.

Fix: upload your files to a Google Drive folder and share the folder link with your agent instead. Follow the steps here.
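To check in advance whether a set of local files is likely to exceed the inline limit, you can add up their sizes. The sketch below assumes the conservative end of the 25 to 30 MB range and treats 1 MB as 1024 × 1024 bytes; both are assumptions, and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: use the lower bound of the 25 to 30 MB range to stay on the safe side.
CONSERVATIVE_LIMIT_MB = 25


def total_size_mb(paths) -> float:
    """Return the combined size of the given files in MB (1 MB = 1024 * 1024 bytes)."""
    return sum(Path(p).stat().st_size for p in paths) / (1024 * 1024)


def should_use_drive_link(paths, limit_mb: float = CONSERVATIVE_LIMIT_MB) -> bool:
    """Return True when the files together exceed the inline attachment limit."""
    return total_size_mb(paths) > limit_mb
```

If `should_use_drive_link` returns True for the files you plan to attach, sharing a Drive folder link is the safer route.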

When to try models from other providers

The three pinned presets (Recommended, Smartest, Fastest) cover most use cases. If you need something specific, the provider lists give you more choice:

  • Anthropic and OpenAI: the widest model range

  • Google: large context and multimodal tasks

  • xAI (Grok): a strong reasoning alternative

  • Qwen: non-English content or math-heavy tasks

  • Moonshot (Kimi): agentic tasks requiring strong tool-calling

  • Perplexity: tasks that need web-grounded answers with citations

Start with the defaults and only switch if you have a reason to. Test your agent before committing to a new model.


Model selection in workflows

In workflows, each AI node (Ask AI, Extract Data, Categorizer, etc.) has its own Model dropdown. Unlike agents, workflow costs are fixed — you know exactly what each node will cost before it runs.

| Tier | Credits per node call | Example models |
|---|---|---|
| Budget | 2 credits | GPT-5.4 Mini, GPT-5.4 Nano, Claude 4.5 Haiku, Gemini 3 Flash, Grok 3 Mini |
| Advanced | 20 credits | Claude 4.6 Sonnet, Claude 4.5 Sonnet, Grok 3, Perplexity Sonar Reasoning Pro |
| Expert | 30 credits | Claude 4.7 Opus, Claude 4.6 Opus, GPT-5.4, GPT-5.3 Codex, Gemini 3.1 Pro, Grok 4, Qwen3.5 397B, Kimi K2.5 |
| Research | 100 credits | Perplexity Sonar Deep Research |

You can use different models across nodes in the same workflow — a cheaper model for simple steps, a more powerful one where the task needs it. Hover over the ? icon on any node to see its credit cost before running.
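Because node costs are fixed per tier, you can estimate a workflow's cost with simple arithmetic. The sketch below uses the per-call credits from the tier table above; the function name `workflow_cost` and the example node mix are made up for illustration, and it assumes each node runs exactly once per workflow run.

```python
# Credits per node call, by tier, as listed in the table above.
CREDITS_PER_CALL = {"Budget": 2, "Advanced": 20, "Expert": 30, "Research": 100}


def workflow_cost(node_tiers: list[str], runs: int = 1) -> int:
    """Return total credits for a workflow whose AI nodes use the given tiers.

    Assumes every node is called once per run (no loops or retries).
    """
    per_run = sum(CREDITS_PER_CALL[tier] for tier in node_tiers)
    return per_run * runs


if __name__ == "__main__":
    # Hypothetical workflow: a Budget categorizer followed by an Expert Ask AI node.
    mixed = workflow_cost(["Budget", "Expert"], runs=100)
    # The same workflow with both nodes on an Expert model.
    all_expert = workflow_cost(["Expert", "Expert"], runs=100)
    print(mixed, all_expert)
```

In this example the mixed workflow costs 32 credits per run (3,200 for 100 runs), compared with 60 credits per run (6,000 for 100 runs) if both nodes use an Expert model.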


Still need help?

If you're not sure which model to pick for your use case, reach out at support@gumloop.com or in the shared Slack channel.