Count tokens and see context-window usage for Claude, GPT, and Llama.
Paste a prompt, doc, or transcript. Get a live token count for each major model and a color-coded utilization bar so you know how much headroom you have before truncation hits.
Compare how your input fits each model's context window. The colored bar is your input; the rest is headroom for tool calls, reasoning, and the model's response.
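The bar math is simple: input tokens divided by the model's context window. Here is a minimal sketch of that calculation; the window sizes and the color thresholds below are illustrative assumptions, not values quoted by this page.

```ts
// Illustrative context-window sizes (tokens); check each vendor's docs
// for the current figures for your exact model version.
const CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-4o": 128_000,
  "claude-3.5-sonnet": 200_000,
  "llama-3-70b": 8_192,
};

// Fraction of the window consumed by the input, clamped to 1.
function utilization(inputTokens: number, model: string): number {
  return Math.min(inputTokens / CONTEXT_WINDOWS[model], 1);
}

// Color-code the bar: green with plenty of headroom, amber as the input
// crowds the window, red near truncation. Thresholds are assumptions.
function barColor(u: number): "green" | "amber" | "red" {
  return u < 0.5 ? "green" : u < 0.85 ? "amber" : "red";
}
```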
Counts come from gpt-tokenizer, a zero-dependency BPE encoder. GPT-4o and GPT-5 use o200k_base; GPT-4 and GPT-3.5 use cl100k_base.
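A sketch of how those counts are produced with gpt-tokenizer's per-encoding entry points. The subpath imports below follow the package's documented layout; if your version differs, check its README for the exact paths.

```ts
// One counter per encoding: o200k_base for GPT-4o / GPT-5,
// cl100k_base for GPT-4 / GPT-3.5.
import { countTokens as countO200k } from "gpt-tokenizer/encoding/o200k_base";
import { countTokens as countCl100k } from "gpt-tokenizer/encoding/cl100k_base";

const text = "How many tokens is this prompt?";

console.log(countO200k(text));  // GPT-4o / GPT-5 count
console.log(countCl100k(text)); // GPT-4 / GPT-3.5 count
```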
Anthropic's and Meta's tokenizers are proprietary or use different vocabularies. We approximate Claude as cl100k_base × 1.10 and Llama 3 as cl100k_base × 1.05, based on average English-text comparisons. For production billing, always count with the official SDK.
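The approximation in code, under the same assumption: count with cl100k_base, then scale by a per-family multiplier. Treat the result as an estimate, not a billable figure.

```ts
import { countTokens } from "gpt-tokenizer/encoding/cl100k_base";

// Multipliers from the average English-text comparisons described above.
const MULTIPLIERS = { claude: 1.1, llama3: 1.05 } as const;

// Rounded up so the estimate errs toward overcounting, not truncation.
function approxTokens(text: string, family: keyof typeof MULTIPLIERS): number {
  return Math.ceil(countTokens(text) * MULTIPLIERS[family]);
}
```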
Subscribe on YouTube for weekly Claude, GPT, and agent tutorials.

New tutorials, open-source projects, and deep dives on coding agents, delivered weekly.