
TL;DR
GPT-5.6 Sol dropped on June 26, 2026 as a limited preview with government-imposed access restrictions. Here is what developers need to know about the three-tier Sol/Terra/Luna model family, pricing, availability timeline, and how to prepare your codebase for GA.
| Resource | Link |
|---|---|
| OpenAI GPT-5.6 Preview Announcement | openai.com/index/previewing-gpt-5-6-sol |
| GPT-5.6 Help Article | help.openai.com/en/articles/20001325 |
| OpenAI API Pricing | openai.com/api/pricing |
| OpenAI Models Documentation | platform.openai.com/docs/models |
| Terminal-Bench 2.1 Evaluation | github.com/terminal-bench |
Last updated: July 5, 2026
GPT-5.6 Sol is OpenAI's new frontier model, announced June 26, 2026. If you're reading this hoping to flip a flag and start building, I have bad news: you probably can't use it yet. The model launched as a limited preview under government-imposed access restrictions, with around 20 approved organizations able to access it through the API and Codex.
That said, the three-tier Sol/Terra/Luna pricing structure and the benchmark numbers OpenAI has shared tell us what to expect when general availability arrives. This guide covers what we know, what we're still waiting on, and how to prepare your codebase now so you can migrate fast when the API opens up.
GPT-5.6 ships as a family of three models, each optimized for different workloads. This is a shift from the previous pattern where you had a base model and a "Pro" variant. Now you get three distinct tiers with clear use-case separation.
| Model | Target Workload | Input ($/MTok) | Output ($/MTok) | Cache Write | Cached Read |
|---|---|---|---|---|---|
| Sol | Complex reasoning, agentic tasks, coding, security | $5.00 | $30.00 | $6.25 | $0.50 |
| Terra | Production workloads, everyday tasks | $2.50 | $15.00 | $3.125 | $0.25 |
| Luna | High-volume, latency-sensitive applications | $1.00 | $6.00 | $1.25 | $0.10 |
Cache mechanics: writes are billed at 1.25x the uncached input rate, and cached reads receive a 90% discount. That makes the caching story significantly better than previous generations.
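To make the cache math concrete, here is a rough cost estimator built from the table above. Treat it as a sketch rather than an official calculator: the `TokenCounts` field names and the `estimateCostUsd` helper are my own, and it assumes the first request pays the cache-write rate while later requests pay the cached-read rate.

```typescript
type Tier = "sol" | "terra" | "luna";

// USD per million tokens, copied from the pricing table above.
interface Rates {
  input: number;
  output: number;
  cacheWrite: number;
  cachedRead: number;
}

const RATES: Record<Tier, Rates> = {
  sol: { input: 5.0, output: 30.0, cacheWrite: 6.25, cachedRead: 0.5 },
  terra: { input: 2.5, output: 15.0, cacheWrite: 3.125, cachedRead: 0.25 },
  luna: { input: 1.0, output: 6.0, cacheWrite: 1.25, cachedRead: 0.1 },
};

// Hypothetical token breakdown; these field names are illustrative, not the API's.
interface TokenCounts {
  uncachedInput: number;
  cacheWrite: number;
  cachedRead: number;
  output: number;
}

// Multiply each token bucket by its rate, then convert from per-million to USD.
function estimateCostUsd(tier: Tier, tokens: TokenCounts): number {
  const r = RATES[tier];
  const perMillion =
    tokens.uncachedInput * r.input +
    tokens.cacheWrite * r.cacheWrite +
    tokens.cachedRead * r.cachedRead +
    tokens.output * r.output;
  return perMillion / 1_000_000;
}

// A 10,000-token prefix sent 100 times on Sol, input side only.
const uncachedTotal = estimateCostUsd("sol", {
  uncachedInput: 100 * 10_000,
  cacheWrite: 0,
  cachedRead: 0,
  output: 0,
});
const cachedTotal = estimateCostUsd("sol", {
  uncachedInput: 0,
  cacheWrite: 10_000, // first request writes the prefix
  cachedRead: 99 * 10_000, // remaining 99 requests read it
  output: 0,
});
console.log({ uncachedTotal, cachedTotal });
```

Under these assumptions the prefix costs about $5.00 uncached versus about $0.56 with caching. The write premium is recovered on the second request, since one write plus one read (1.35x the input rate) is already cheaper than two uncached sends (2x).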
Sol is the flagship. Use it when correctness matters more than cost: agentic coding workflows, security research, multi-step planning, and anything where a wrong answer creates real problems.
Terra is positioned as the balanced option - near GPT-5.5 performance at roughly half the cost. This is likely where most production traffic will land when GA arrives.
Luna is the fast, cheap tier. Think chatbots, classification, real-time applications, and anywhere latency beats everything else.
OpenAI also mentions a Sol Ultra mode that pushes maximum reasoning capability through extended compute, similar to how GPT-5.5 Pro worked with high effort settings. Ultra mode uses subagent-based decomposition for parallel workflows on complex tasks.
OpenAI's evaluation data is still sparse, but the numbers we have suggest meaningful improvements in specific domains.
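Terminal-Bench 2.1 (agentic coding benchmark):
| Model | Score |
|---|---|
| GPT-5.6 Sol Ultra | 91.9% |
| GPT-5.6 Sol (base) | 88.8% |
| Claude Mythos 5 | 88.0% |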
The 3.1-point gap between Ultra and base mode reflects increased compute spending for multi-step agentic problems. For straightforward tasks, Sol base is probably sufficient.
GeneBench v1 (genomics analysis): Sol uses fewer tokens than GPT-5.5 while producing stronger results on quantitative biology tasks. No specific scores published yet.
Cybersecurity: OpenAI describes Sol as the "strongest model for cybersecurity so far" with vulnerability research capability. The important qualifier: it "does not autonomously generate a full usable attack chain" in their testing. That's a safety boundary, not a capability gap.
Context window size hasn't been officially published. Rumors mention 1.5M tokens, but verify against the official docs when they update.
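As of July 5, 2026, GPT-5.6 is in limited preview: around 20 approved organizations can reach it through the API and Codex, and OpenAI says broader availability will follow "in the coming weeks" without naming a date.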
The access restrictions stem from the model's capabilities in cybersecurity and biology research. The U.S. government requested vetting of approved organizations before broad deployment.
If you need frontier model capabilities right now, GPT-5.5 and GPT-5.5 Pro remain generally available. For agentic coding specifically, Claude Fable 5 (restored July 1, 2026) and Claude Sonnet 5 offer strong alternatives while you wait.
You can't use GPT-5.6 yet, but you can prepare for migration now. Here's what I'm doing in production codebases that will switch when GA drops.
If you're hardcoding `model: "gpt-5.5"` everywhere, now is the time to fix that. Use a configuration layer that lets you swap models without touching application code.
```typescript
// config/models.ts
export const models = {
  fast: process.env.MODEL_FAST || "gpt-5.5",
  balanced: process.env.MODEL_BALANCED || "gpt-5.5",
  flagship: process.env.MODEL_FLAGSHIP || "gpt-5.5-pro",
} as const;

// When GPT-5.6 GA drops, update .env
// (assumed model IDs; confirm against the models docs):
// MODEL_FAST=gpt-5.6-luna
// MODEL_BALANCED=gpt-5.6-terra
// MODEL_FLAGSHIP=gpt-5.6-sol
```
The 90% discount on cached reads makes prompt caching significantly more attractive. If you're not using prompt caching today, the 5.6 pricing structure is a reason to start.
```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Keep the system prompt identical across requests so the prefix can be cached.
// buildSystemPrompt, context, and userMessage come from your application.
const response = await client.responses.create({
  model: "gpt-5.6-terra", // swap when available
  input: [
    {
      role: "system",
      content: buildSystemPrompt(context), // make this deterministic
    },
    { role: "user", content: userMessage },
  ],
  // Cache hits will cost 90% less on input
});
```
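Since "make this deterministic" is doing a lot of work in that comment, here is one way it could look. This is a minimal sketch, assuming GPT-5.6 caching matches on an exact prompt prefix the way OpenAI's existing prompt caching does. The `PromptContext` shape is hypothetical, so adapt it to whatever your app actually passes in.

```typescript
// Hypothetical context shape, for illustration only.
interface PromptContext {
  product: string;
  rules: string[];
  settings: Record<string, string>;
}

function buildSystemPrompt(ctx: PromptContext): string {
  // Sort anything whose order carries no meaning, so equal inputs
  // always serialize to the same string.
  const rules = [...ctx.rules].sort();
  const settings = Object.keys(ctx.settings)
    .sort()
    .map((key) => `${key}=${ctx.settings[key]}`);

  // Deliberately no timestamps, request IDs, or per-user data here:
  // anything that varies per request belongs in the user turn.
  return [
    `You are the assistant for ${ctx.product}.`,
    "Rules:",
    ...rules.map((rule) => `- ${rule}`),
    "Settings:",
    ...settings.map((setting) => `- ${setting}`),
  ].join("\n");
}
```

The point of the sorting is that two requests built from the same data in a different order still produce the same prompt text, which is what gives the cache a chance to hit.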
The three-tier model means you'll want routing logic. Not every request needs Sol.
```typescript
type Complexity = "simple" | "standard" | "complex";

function selectModel(complexity: Complexity): string {
  const modelMap: Record<Complexity, string> = {
    simple: "gpt-5.6-luna",
    standard: "gpt-5.6-terra",
    complex: "gpt-5.6-sol",
  };
  return modelMap[complexity];
}

// In your agent or pipeline
const model = selectModel(taskComplexity);
const response = await client.responses.create({
  model,
  input: taskPrompt,
});
```
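The routing snippet takes the complexity label as given, so something upstream has to produce it. Here is a deliberately crude heuristic as a starting point. The `TaskInfo` fields and the character thresholds are placeholders I made up, not anything OpenAI recommends; tune them against your own traffic or replace the function with a classifier.

```typescript
type Complexity = "simple" | "standard" | "complex";

// Hypothetical description of a request; these fields are illustrative.
interface TaskInfo {
  promptChars: number; // length of the prompt in characters
  usesTools: boolean; // whether the request involves tool calls
  multiStep: boolean; // whether the task needs planning across steps
}

function classifyTask(task: TaskInfo): Complexity {
  // Multi-step work, or long tool-using prompts, go to the flagship tier.
  if (task.multiStep || (task.usesTools && task.promptChars > 4000)) {
    return "complex";
  }
  // Any tool use or a moderately long prompt gets the balanced tier.
  if (task.usesTools || task.promptChars > 1000) {
    return "standard";
  }
  // Everything else is short and self-contained.
  return "simple";
}
```

Keeping classification separate from model selection means you can change the thresholds, or the tier-to-model mapping, without touching the other.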
When GA arrives, you'll want to compare 5.6 against your current stack on real traffic. Build the eval harness now.
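```typescript
// Input and output tokens are billed at different rates, so price them separately.
// Prices are USD per million tokens; runWithModel, score, CURRENT_PRICE, and
// NEXT_PRICE are your own helpers and constants.
type Price = { input: number; output: number };
type Usage = { input_tokens: number; output_tokens: number };

const cost = (usage: Usage, price: Price) =>
  (usage.input_tokens * price.input + usage.output_tokens * price.output) / 1_000_000;

async function compareModels(prompt: string, expected: string) {
  const [current, next] = await Promise.all([
    runWithModel("gpt-5.5", prompt),
    runWithModel("gpt-5.6-terra", prompt), // swap when available
  ]);
  return {
    currentAccuracy: score(current, expected),
    nextAccuracy: score(next, expected),
    currentCost: cost(current.usage, CURRENT_PRICE),
    nextCost: cost(next.usage, NEXT_PRICE),
  };
}
```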
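The harness above leans on a `score` helper and produces one row per prompt, so you still need something to score with and something to roll the rows up. A minimal sketch of both follows, assuming exact-match scoring is good enough for your task (it often isn't for free-form output, in which case swap in your own grader). The `ModelResult` and `Comparison` shapes are my assumptions about what the harness passes around.

```typescript
// Hypothetical shape of what a model run returns.
interface ModelResult {
  text: string;
}

// One row of output from the comparison harness.
interface Comparison {
  currentAccuracy: number;
  nextAccuracy: number;
  currentCost: number;
  nextCost: number;
}

// Exact match after normalizing case and whitespace: 1 for a match, 0 otherwise.
function score(result: ModelResult, expected: string): number {
  const normalize = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return normalize(result.text) === normalize(expected) ? 1 : 0;
}

// Average the accuracies and total the costs across all eval rows.
function summarize(rows: Comparison[]): Comparison {
  const sum = (xs: number[]) => xs.reduce((a, b) => a + b, 0);
  const mean = (xs: number[]) => (xs.length === 0 ? 0 : sum(xs) / xs.length);
  return {
    currentAccuracy: mean(rows.map((r) => r.currentAccuracy)),
    nextAccuracy: mean(rows.map((r) => r.nextAccuracy)),
    currentCost: sum(rows.map((r) => r.currentCost)),
    nextCost: sum(rows.map((r) => r.nextCost)),
  };
}
```

Run `compareModels` over a sample of real prompts, pass the collected rows to `summarize`, and you get the two numbers that matter for a migration decision: accuracy delta and cost delta.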
The practical question for most developers is whether to wait for GPT-5.6 or ship with what's available now.
Wait if:
Ship now if:
The frontier keeps moving. Whatever you build today will need to handle model upgrades anyway. If your architecture is clean, switching to 5.6 when it drops should be a configuration change, not a rewrite.
One interesting deployment note: OpenAI announced that GPT-5.6 will be available on Cerebras inference hardware starting July 2026, with speeds up to 750 tokens per second. For latency-sensitive applications, that's a meaningful improvement over standard deployment.
This suggests OpenAI is expanding its inference partnerships, which could affect pricing and availability for high-volume customers.
GPT-5.6 Sol represents a meaningful step forward for agentic and coding workloads, with the Terminal-Bench 2.1 numbers showing real improvement over GPT-5.5 and competitive positioning against Claude Mythos 5. The three-tier pricing structure (Sol/Terra/Luna) gives developers clearer cost-to-capability tradeoffs than previous generations.
The frustrating part is availability. A limited preview with government access restrictions means most developers are waiting with no clear timeline. If you need frontier capabilities today, GPT-5.5 Pro and Claude Fable 5 are your options.
My recommendation: prepare your codebase for easy model swaps, build evaluation harnesses against your real traffic, and ship with what works now. When GPT-5.6 opens up, you want the migration to be a single-line config change, not a scramble.
OpenAI says "in the coming weeks" but has not announced a specific date. The limited preview began June 26, 2026 with around 20 approved organizations. General availability will likely roll out by subscription tier (Plus, Pro, Team, Enterprise) once government restrictions lift.
The U.S. government requested vetting of approved organizations before broad deployment due to the model's capabilities in cybersecurity vulnerability research and biology analysis. This is a safety measure, not a capacity constraint.
Sol ($5/$30 per MTok) is priced higher than GPT-5.5 for flagship capability. Terra ($2.50/$15) is positioned at roughly half the cost of GPT-5.5 with near-equivalent performance. Luna ($1/$6) is the budget tier for high-volume, latency-sensitive workloads. The 90% cached read discount makes caching significantly more attractive.
Sol Ultra is a high-effort variant that pushes maximum reasoning capability through extended compute and subagent-based decomposition. On Terminal-Bench 2.1, Ultra scores 91.9% versus Sol base at 88.8%. Use Ultra for the hardest agentic and reasoning tasks where cost is secondary to correctness.
Ship with GPT-5.5 or Claude alternatives if you have a product deadline. The availability timeline is uncertain and GPT-5.5 is production-ready. Build your architecture to support easy model swaps so you can migrate quickly when GPT-5.6 opens up.
OpenAI has not officially published the context window size for GPT-5.6. Unofficial reports mention 1.5 million tokens, but verify against the official documentation when it updates.
GPT-5.6 is available through Codex for the limited preview organizations with government approval. General Codex access will expand with broader API availability.
On Terminal-Bench 2.1, Sol base scores 88.8% and Sol Ultra scores 91.9%, compared to Claude Mythos 5 at 88.0%. For agentic coding, both are strong choices. Claude Fable 5 was restored on July 1, 2026 and is immediately available, while GPT-5.6 access is restricted.