Context Window Visualizer
Watch a finite container fill in real time. See what happens when it does.
The 2K window above is a teaching tool — real models have context windows orders of magnitude larger. Here is what the actual numbers look like, and why they still fill faster than you expect.
Real context windows, April 2026
| Platform | Tier | Context Window | Price |
|---|---|---|---|
| ChatGPT (OpenAI) | Free | Varies — unpublished | $0/mo |
| ChatGPT | Plus | 256,000 tokens | $20/mo |
| ChatGPT | Pro | 400,000 tokens | $200/mo |
| Claude (Anthropic) | Free | 200,000 tokens | $0/mo |
| Claude | Pro | 200,000 tokens (1M opt-in) | $20/mo |
| Claude | Max | 1,000,000 tokens | $100–200/mo |
| Gemini (Google) | Free | 1,000,000 tokens | $0/mo |
| Gemini | Advanced | 1,000,000 tokens | $20/mo |
Data verified April 2026. Context window sizes change frequently — check provider pricing pages for current values.
200,000 tokens is approximately 150,000 words, or around 500 pages of dense text. One million tokens is roughly 750,000 words, which is longer than the whole of War and Peace. I mention this not to impress you, but because you need to understand the scale before the next part makes sense.
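The conversions above follow from two rule-of-thumb ratios implied by those figures: about 0.75 words per token and about 300 words per page. Here is a minimal sketch of the arithmetic, assuming those ratios hold. Real tokenizers vary with language and content, and the function names are illustrative only.

```python
# Rule-of-thumb ratios implied by the figures in the text; not exact for any tokenizer.
WORDS_PER_TOKEN = 0.75   # 200,000 tokens ≈ 150,000 words
WORDS_PER_PAGE = 300     # 150,000 words ≈ 500 pages of dense text


def tokens_to_words(tokens: int) -> int:
    """Estimate the number of English words that fit in a token budget."""
    return round(tokens * WORDS_PER_TOKEN)


def tokens_to_pages(tokens: int) -> int:
    """Estimate the number of dense pages that fit in a token budget."""
    return round(tokens_to_words(tokens) / WORDS_PER_PAGE)


for window in (200_000, 400_000, 1_000_000):
    words = tokens_to_words(window)
    pages = tokens_to_pages(window)
    print(f"{window:>9,} tokens ≈ {words:>7,} words ≈ {pages:,} pages")
```

Treat the output as an order-of-magnitude estimate. Code, tables, and non-English text usually use more tokens per word than plain English prose.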
Here is the thing most people do not realize: every message you send includes the entire conversation so far. Turn 10 does not cost one turn's worth of tokens. It costs the sum of turns 1 through 10, all sent again. Nothing drains out between turns: the window fills like the bottom of an hourglass, not like a bucket you can empty and refill.
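To make the re-sending concrete, here is a small sketch under simplifying assumptions: each turn is treated as one fixed-size unit (message plus reply), and the turn sizes are made up for illustration. The function name is hypothetical and is not part of any platform's API.

```python
def tokens_sent_per_turn(turn_sizes):
    """For each turn, return the tokens in the request: all earlier turns plus the new one."""
    sent = []
    running = 0
    for size in turn_sizes:
        running += size       # history never shrinks between turns
        sent.append(running)  # the whole history rides along on this request
    return sent


# Hypothetical conversation: ten turns of 500 tokens each.
turns = [500] * 10
per_request = tokens_sent_per_turn(turns)

print(per_request[0])    # tokens in the request at turn 1
print(per_request[-1])   # tokens in the request at turn 10
print(sum(per_request))  # total tokens processed across all ten requests
```

In this example the tenth request carries ten times the tokens of the first. What counts against the window is the size of the latest request, not the running total across requests.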
Add long AI responses (often 3–10× the length of your message), throw in a document you uploaded, and an hour-long technical session can hit the wall sooner than the raw numbers suggest. When it does, older context is typically dropped silently. No warning. The model just continues: helpful, confident, and missing everything that fell off.
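One way to picture the silent drop is a sliding window that keeps only the newest messages that fit. The sketch below assumes oldest-first truncation, which is a common strategy but not a documented behavior of any specific platform. The function name, message labels, and token counts are all hypothetical.

```python
from collections import deque


def fit_to_window(messages, window_tokens):
    """Keep the most recent messages whose token counts fit in the window.

    messages: list of (label, token_count) tuples, oldest first.
    Returns (kept, dropped), both oldest first.
    """
    kept = deque()
    used = 0
    # Walk backwards from the newest message, keeping as many as fit.
    for label, tokens in reversed(messages):
        if used + tokens > window_tokens:
            break
        kept.appendleft((label, tokens))
        used += tokens
    dropped = messages[: len(messages) - len(kept)]
    return list(kept), dropped


history = [
    ("project requirements", 900),
    ("uploaded spec", 600),
    ("question 1", 200),
    ("answer 1", 500),
    ("question 2", 200),
]
kept, dropped = fit_to_window(history, window_tokens=2_000)
print("kept:", [label for label, _ in kept])
print("dropped:", [label for label, _ in dropped])
```

With a 2,000-token window, the earliest message (the project requirements) is the one that falls off. Nothing in the remaining history indicates that it was ever there.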
See how fast it fills
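As a rough, back-of-the-envelope sketch, the function below estimates how many complete exchanges fit in a window. It assumes every message and reply has the same size and ignores system prompts and other overhead. The function name, parameters, and example figures are illustrative, not measurements.

```python
def turns_until_full(window_tokens, user_tokens, response_multiplier, document_tokens=0):
    """Estimate how many complete exchanges fit before the window is full.

    One exchange is a user message plus a reply that is `response_multiplier`
    times as long. An uploaded document occupies the window from the start.
    """
    per_exchange = user_tokens * (1 + response_multiplier)
    if per_exchange <= 0:
        raise ValueError("each exchange must use a positive number of tokens")
    remaining = max(0, window_tokens - document_tokens)
    return int(remaining // per_exchange)


# Hypothetical session: 300-token messages, replies 5x as long, 200,000-token window.
print(turns_until_full(200_000, 300, 5))                           # no document
print(turns_until_full(200_000, 300, 5, document_tokens=120_000))  # large upload first
```

Under these assumptions a large upload cuts the number of exchanges that fit by more than half, which is why long documents combined with long replies use up the window so quickly.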
Two rules worth remembering.
A quick question or a 10-minute chat? Do not even consider worrying about it. You are nowhere near any context limit on any platform.
An hour of back-and-forth on a real project — especially with detailed AI responses, uploaded documents, or pasted code — and you should pay attention. The danger zone is all three together: long session, long responses, large documents. That is when things go missing without warning.
A few habits that help: Start a fresh conversation when you genuinely shift to a new topic. If you are deep in a long session, summarize what you have established and paste that summary into a new chat rather than continuing indefinitely. Even easier — tell the model you are worried about exceeding the context window and ask it to summarize everything into a prompt you can paste into the new conversation. Simple, quick, and effective. Treat uploaded documents as expensive — they ride along on every subsequent turn. And when the model seems inconsistent about something it explained earlier, the context window is often the explanation.
Technical note: All major consumer AI platforms (Claude, ChatGPT, Gemini) use transformer-based architectures that process the full conversation history on each request. Modern implementations use KV caching to optimize computation speed, but this does not reduce token consumption; the full prior context is still present and counted against your window. Hybrid architectures exist that handle long context differently, but they are not the platforms described in this table.