A Bigger Context Window Won't Save You

When an agent starts flailing on real code, the reflex is to give it more. Paste in more files. Upgrade to the model with the million-token window. Dump the whole repository in and let it sort things out. It feels like the obvious move, and it mostly backfires. An agent handed everything attends to nothing in particular, and the one detail that actually mattered gets buried under thousands that didn't.

The skill that fixes this isn't prompting. It's context engineering: managing what enters the model's window as the scarce, decisive resource it is. Here's what that actually looks like.

The window is a budget, not a backpack

A language model starts every request from nothing. Everything it knows about your task, your instructions, the relevant code, the conventions, the prior steps, has to fit inside one finite context window. That window is a budget, and it is far smaller than your codebase.

The counterintuitive part is that filling it doesn't help. Models attend unevenly to long inputs, and detail parked in the middle of a huge context tends to get lost. Past a point, more tokens buy you worse performance, not better, as the signal you care about drowns in noise. So the goal is never to cram in the most context. It's to get in the right context and keep everything else out. A longer window raises the ceiling on what you can include. It does nothing to tell you what you should.

Persistent context: the things that are always true

Some context should never have to be rediscovered. Your architecture, your stack, the code conventions, the security non-negotiables, the modules a new feature should reuse: these are true on every task, and making the agent re-infer them each time is both wasteful and unreliable.

The move is to write them down once as durable, always-loaded ground truth, often called a constitution, and anchor the agent to it on every request. An agent that carries your conventions and your hard rules into each feature without being reminded behaves consistently across a whole project instead of improvising a new style every session. This is the cheapest, highest-leverage thing most teams aren't doing. A page of stable, authoritative project rules outperforms paragraphs of per-task instructions, because it governs everything rather than just the task in front of you.

Retrieval: the right files, not all the files

The problem is rarely that the model doesn't know your code. It's that out of thousands of files, only a handful matter for the change in front of you, and the model has no way to know which. Surfacing those few is a retrieval problem, and it is the heart of working in a real codebase.

The reinvention failure makes this concrete. You ask for a feature, and the agent writes its own date helper, its own validation, its own API client, blind to the three you already have. That is not the model being dim. It is the existing code never entering the window, so as far as the agent could tell, it didn't exist. The fix is to point the agent at the exemplar to follow and name the modules to reuse, explicitly, so the right context is present at the moment it decides. Anchor the work to what already exists and the agent extends your system. Leave it to guess and it bolts a foreign one onto the side.

Hygiene: keep the window clean as you go

Context engineering isn't a one-time setup you do at the start of a task. It's continuous. As an agent works, abandoned attempts, dead-end reasoning, and stale output pile up in the window, and that residue quietly degrades everything that comes after, because the model is now reasoning over its own discarded mistakes as if they were still relevant.

So you prune. You clear what's no longer load-bearing, and when a thread is genuinely exhausted, you start fresh rather than dragging its full history forward. A clean window beats a full one almost every time. Treating the context as something you maintain, not something you fill once and forget, is what keeps quality from sliding as a session runs long.

It's half the method

Context engineering is one half of getting real work out of an AI agent: control what it sees. The other half is the specification: control what it builds, by giving it a target it cannot misread. Apart, each helps. Together they turn an agent from an unpredictable intern into something you can actually direct, because it is both well-informed and aimed at the right thing.