Claude Code Keeps Running Out of Context — How to Fix It (2026)
Last updated: May 12, 2026.
You’re halfway through a refactor, three files deep, and Claude Code pops up a “context low” warning. Or worse: it quietly summarizes the conversation and forgets a decision you made twenty minutes ago, so now it’s “fixing” the thing you already fixed. It always happens at the wrong moment. Below is why it happens and how to make it stop, ordered by what gives you the most room back the fastest.
Why this keeps happening
Claude Code keeps everything from your session in one place: every message, every file it read, every command’s output. On most models that’s a window of roughly 200,000 tokens. Sounds enormous until you notice what fills it.
Long conversations are the obvious one. Every exchange stays in there. A two-hour session with a lot of back-and-forth is a lot of tokens.
Big file reads are the sneaky one. Ask Claude to “take a look at the auth module” and it might pull in a 2,000-line file in full. That’s gone now, sitting in context for the rest of the session.
Then there’s tool output. A complete npm run build log, a giant git diff, an accidental ls of node_modules — all of it lands in context verbatim. MCP servers can be the worst offenders here: a database server that returns 500 rows, or a docs server that hands back an entire API reference page, can blow a hole in your budget in a single call. (If you haven’t set any of those up yet, our walkthrough of how MCP servers work covers what they actually do.)
And finally, the logs you paste “just in case.” That 400-line stack trace you dropped in? Those are 400 lines Claude is now carrying around.
When the window gets close to full, Claude Code auto-compacts: it summarizes what’s happened so far and keeps going with the summary instead of the raw history. Your session survives, which is the point. But summaries lose things. The exact line numbers. The reason you rejected the first approach. The “we agreed to use middleware, not decorators” decision. So the real goal isn’t just avoiding a crash — it’s staying in control of what Claude actually remembers.
The fixes, fastest first
Here’s the short version. The details are below.
| What | When you reach for it | Effort |
|---|---|---|
/clear | Switching to an unrelated task | Instant |
/compact with an instruction | Mid-task, need room, want to keep specifics | Instant |
| One task per session | Always — it’s preventative | A habit |
| Point Claude at exact paths | Every request | A habit |
Keep a CLAUDE.md | Once per project | 10 minutes |
| Hand big searches to subagents | Heavy exploration | Per task |
| Quiet down noisy MCP servers | One-time setup | One-time |
| Don’t paste big logs — save to a file | Anytime you’d paste 50+ lines | A habit |
/context to see what’s eating tokens | When the warning surprises you | 5 seconds |
| The 1M-token window | Genuinely huge codebases, last resort | Plan-dependent |
/clear between unrelated things
You just finished the login bug. Now you’re starting on the CSV export. Type /clear. It dumps the conversation and gives you a full window again. There is no reason to drag login-debugging history into export work. Honestly, most “I keep running out of context” complaints are really “I never cleared between three different jobs.” This is the habit that fixes it.
/compact, and tell it what to keep
When you’re still in the middle of something and just need breathing room, /compact it. Left alone, it summarizes the whole conversation and continues. Better to give it a hint:
/compact keep the auth refactor details — which files changed, the new token flow, and the decision to use middleware not decorators
That instruction steers what survives. Without it, Claude decides what’s important. With it, you do. Reach for /compact when you need continuity, /clear when you don’t.
One task per session
A session that starts as “fix this failing test” and grows into “also refactor the API client, oh and update the docs, and let’s talk about the deploy pipeline” is going to run out of room — and the work gets worse on the way down, too. One session, one task. Done? /clear. If you want to carry something forward, ask for an end-of-session summary (there’s a pattern for this in our Claude Code tips post) and paste it as the first message of the next session.
Be specific so it reads less
“Fix the bug in the checkout flow” sends Claude exploring: it reads the cart component, the checkout component, the payment service, the order model, probably three of them in full. “In src/checkout/PaymentForm.tsx, validateCard rejects valid Amex numbers — the regex is too strict” gets it to read one function. Same fix, a fraction of the tokens. Whenever you know the path and the symbol, say so.
A CLAUDE.md so it doesn’t re-learn your project every time
Drop a CLAUDE.md in the repo root with your conventions, a quick architecture note, and your “always do X” rules. Claude reads it automatically at the start of every session, so it doesn’t have to go spelunking to discover that you use Swift 6 concurrency or that tests live under Tests/. You pay for that context once, cheaply, instead of re-burning it every session. The 17 Claude Code tips post goes into what’s worth putting in there.
Push big searches to a subagent
“Find every place we still call the legacy auth endpoint” is exactly the kind of noisy, file-heavy job that doesn’t belong in your main context. Hand it to a subagent via the Task tool. The subagent does the grepping and reading in its own throwaway window and comes back with just the answer. Your session stays lean. Works the same for “audit all our error handling” or “list every component importing this deprecated module.”
Turn down the chatty MCP servers
MCP servers are great, but a verbose one is a tax you pay on every call. If your database server returns 200 columns when you wanted three, fix the query. If you’ve got a docs server, a Jira server, a Slack server, and a Postgres server all loaded but you’re only touching one today, disable the rest for this session. Some servers let you cap result size — do it. Our best MCP servers roundup flags which ones behave themselves about output.
Don’t paste the whole log
Pasting a 600-line build log puts all 600 lines in context, permanently. Instead, send it to a file (npm run build &> build.log) and tell Claude “the build failed — read the last 40 lines of build.log.” Or filter it before it ever reaches the chat: npm test 2>&1 | grep -A5 -i fail. Claude Code reads your terminal output natively anyway, so even just running the failing command and letting it pick up the tail beats copy-pasting the lot.
/context when the warning blindsides you
If the warning shows up earlier than you expected, run /context (and keep an eye on the context indicator in the status line). It shows you how the window is carved up: system prompt, CLAUDE.md, MCP tool definitions, conversation, file reads. Usually the culprit is obvious — one enormous file read, or a stack of MCP servers whose tool schemas alone are eating 15K tokens before you’ve said a word. /cost is worth a glance too; token spend and context pressure tend to track each other.
The 1M-token window — it exists, don’t lean on it
Some Claude models and plans offer a 1M-token context window. It’s real and it genuinely helps with large codebases. But it costs more per token, the model can still get a bit lost in the middle of a giant context, and it does nothing about the underlying habit of letting sessions sprawl. Treat it as headroom for a hard problem, not as permission to skip everything above.
/clear vs /compact, in one table
/clear | /compact | |
|---|---|---|
| What it does | Wipes the conversation; full fresh window | Summarizes the conversation; continues from the summary |
| What you keep | Nothing. Clean slate. | The gist. Specifics can vanish. |
| Use it when | Starting something unrelated | Mid-task, low on room, want to keep going |
| Can you steer it? | No | Yes — /compact <what to keep> |
| The risk | Clearing too eagerly and losing useful context | The summary drops a detail you needed |
If you remember one thing: different task, /clear; same task, no room, /compact with an instruction. And if you’re about to do something where you’ll definitely want the earlier detail, write it down somewhere first.
When auto-compaction bites you
Auto-compaction is fine when what you need going forward is the gist. “Keep building the feature we’ve been working on” survives it without trouble. Where it hurts is when you need precise earlier detail: the exact diff from step two, the reason approach A was a dead end, the line numbers you were about to touch. A summary papers over those.
So before it triggers — you’ll usually get the warning, or you can watch the indicator creep up — checkpoint the stuff that matters. Ask Claude: “Summarize what we’ve changed so far, every file modified with a one-line note, plus the key decisions, and write it to NOTES.md.” Or drop the decisions straight into CLAUDE.md (“auth is middleware-based, not decorators”). Then clear or compact freely, because the record that matters is on disk now, not at the mercy of a summarizer. Treat the context window like RAM, not storage. Anything you’d hate to lose, write it down.
TL;DR
The context window is finite — around 200K tokens, 1M on some tiers — and Claude Code auto-compacts when it fills, which is lossy. Different task: /clear. Same task, out of room: /compact <what to keep>. To stop hitting it at all: be specific about file paths, keep one task per session, maintain a CLAUDE.md, push big searches to subagents, quiet down noisy MCP servers, and never paste a log you could’ve grepped first. Use /context to see what’s eating tokens. And checkpoint important state to a file before auto-compaction hits.
One caveat: slash commands and limits move between Claude Code versions, so if /compact or /context behaves differently than what’s here, run /help for your version’s current list.
If you’re also wondering whether a different tool handles long sessions better, our Claude Code vs Cursor vs Copilot comparison gets into how each one deals with context.
Want more out of Claude Code than just “stop running out of context”? 17 Claude Code Tips That 10x Your Productivity covers the CLAUDE.md patterns, subagent tricks, and custom slash commands — the same habits that keep your context lean in the first place. More in the AI Tools section.
Frequently Asked Questions
Is Claude Code's context window unlimited?
No. It runs on a fixed-size window, usually around 200K tokens, with a 1M-token option on some models and plans. When it gets close to full, Claude Code summarizes the conversation so far and keeps going, and that summary can lose detail.
What does /compact do in Claude Code?
It compresses the current conversation into a summary and continues from there, which frees up tokens. You can steer what it keeps by adding an instruction, like /compact keep the auth refactor details.
What's the difference between /clear and /compact in Claude Code?
/clear throws the conversation away and starts fresh. /compact keeps a summarized version and continues. Use /clear when you're switching to something unrelated; use /compact when you're mid-task and just need more room.
How big is Claude Code's context window?
On most Claude models it's about 200K tokens. Some models and plans offer a 1M-token window, but it costs more per token and doesn't replace good habits. Run /context to see what's actually using up your window right now.
Comments