Context Window Management: Feeding Your AI the Right Code

We've all done it. Something's broken, you're frustrated, and you just paste everything into Claude or GPT and hope it figures it out. Half your component tree, three utility files, a config you're not even sure is relevant, and a frantic message: 'why is this not working???'. The AI gives you a confident answer that's completely wrong because it was drowning in noise and latched onto the wrong thing. You waste 20 minutes. Sound familiar?

Context window management is the skill nobody talks about, yet it separates the people who get 10x productivity from AI tools from the people who get frustrated and give up. It's not about which model you use. It's about what you put in front of it.

Why the Context Window Matters More Than the Model

Modern models like Claude 3.5 Sonnet, GPT-4o, and Gemini 1.5 Pro all have massive context windows, roughly 128k to 1M tokens. That makes it sound like 'just throw everything in there' is finally a viable strategy. It's not. A bigger context window doesn't mean the model pays equal attention to everything in it. Research on this is pretty clear: models suffer from the 'lost in the middle' problem, where information buried in the middle of a huge prompt gets less attention than material at the beginning or end. You can hand a model 200k tokens of context and it can still fixate on the last thing you said.

There's also the practical issue of cost and speed. Every token costs money. Every token adds latency. Stuffing 50 files into a prompt when only 3 are relevant is burning cash for worse results. We learned this building out our template infrastructure — we were using AI heavily for code review and our bills were embarrassing until we got disciplined about what we actually sent.
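To put rough numbers on that, here is a minimal sketch of a cost comparison. The 4-characters-per-token ratio is a common rule of thumb rather than a real tokenizer, the $3 per million tokens is a placeholder price, and `estimateTokens` and `estimateCostUsd` are names we made up for illustration.

```typescript
// Rough heuristic: about 4 characters per token for English text and code.
// Real tokenizers vary by model, so treat the result as an estimate only.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// pricePerMillionTokensUsd is a placeholder; look up your provider's real price.
function estimateCostUsd(
  files: string[],
  pricePerMillionTokensUsd: number
): number {
  const totalTokens = files.reduce((sum, f) => sum + estimateTokens(f), 0);
  return (totalTokens / 1_000_000) * pricePerMillionTokensUsd;
}

// Hypothetical comparison: 50 files vs. the 3 that matter, 8,000 characters each.
const sampleFile = "x".repeat(8_000);
const dumpCost = estimateCostUsd(Array.from({ length: 50 }, () => sampleFile), 3);
const focusedCost = estimateCostUsd(Array.from({ length: 3 }, () => sampleFile), 3);
```

With those made-up inputs the dump is about 100,000 tokens per request and the focused prompt about 6,000. The absolute numbers will differ for your model, but the ratio is the point.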

The Right Mental Model: Brief the AI Like a New Developer

Think about how you'd onboard a competent contractor to fix one specific thing. You wouldn't hand them your entire Git history. You'd say: here's what the system does at a high level, here's the specific piece we're working on, here's the constraint, here's what I've tried. That's exactly how you should structure AI context.

System context: What kind of app is this? What stack? What conventions matter?
Relevant files: Only the files that touch the problem directly
The problem statement: What should happen, what actually happens
Constraints: What you can't change (API contract, database schema, etc.)
What you've already tried: Stops the AI from suggesting the obvious thing you ruled out

That structure will outperform a raw dump of your codebase every single time. Five focused files beat fifty unfocused ones.

Practical Techniques That Actually Work

Here are the patterns we use daily. Not theory — actual workflows.

First: use a project context file. This is a markdown file you maintain at the root of your project that gives any AI (or human) a fast orientation to your codebase. We call ours CONTEXT.md. It covers stack, conventions, key abstractions, and anything non-obvious.

# Project Context

## Stack
- Next.js 14 App Router (not Pages Router)
- TypeScript strict mode
- Prisma + PostgreSQL
- Tailwind CSS
- NextAuth v4

## Key Conventions
- Server Components by default, Client Components only when needed
- Database access only in Server Components or Route Handlers (never in Client Components)
- All forms use React Hook Form + Zod validation
- Error handling: we use a Result type pattern (see lib/result.ts), not try/catch everywhere

## Architecture Notes
- Auth: session-based via NextAuth, user object available via `getServerSession()`
- Multi-tenancy: all queries must be scoped to `organizationId` from session
- Email: Resend for transactional, templates in `emails/` directory

## What We're NOT Using
- No Redux or Zustand (server state from RSC, URL state from nuqs)
- No class components
- No `pages/` directory

You paste CONTEXT.md at the top of any AI conversation and suddenly you don't have to explain your stack every time. The AI knows your conventions and stops suggesting patterns that don't fit your project.

Second: surgical file selection. When you have a bug, identify the call chain. Don't include every file that imports from a shared utility, just the files in the specific path from request to response. For a broken API endpoint, that's probably the route handler, the relevant service function, the Prisma model, and the Zod schema. That's usually four files at most.

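// Instead of pasting everything, paste only the call chain:

// 1. app/api/invoices/route.ts: the entry point
import { getServerSession } from 'next-auth'
import { authOptions } from '@/lib/auth' // assumed location of the NextAuth config
import { createInvoice } from '@/lib/invoices'
import { createInvoiceSchema } from '@/lib/schemas/invoice'

export async function POST(req: Request) {
  // Reject unauthenticated requests before touching the body
  const session = await getServerSession(authOptions)
  if (!session) return new Response('Unauthorized', { status: 401 })

  // Validate the payload against the Zod schema
  const body = await req.json()
  const parsed = createInvoiceSchema.safeParse(body)
  if (!parsed.success) return Response.json({ error: parsed.error }, { status: 400 })

  // Scope the write to the caller's organization
  const invoice = await createInvoice(session.user.organizationId, parsed.data)
  return Response.json(invoice)
}

// 2. lib/invoices.ts: the service function where the bug probably lives
import type { CreateInvoiceInput } from '@/lib/schemas/invoice' // assumed to be exported alongside the schema

export async function createInvoice(
  organizationId: string,
  data: CreateInvoiceInput
) {
  // ... the actual implementation
}

// 3. lib/schemas/invoice.ts: the validation schema
import { z } from 'zod'

export const createInvoiceSchema = z.object({
  // ...
})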
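If you want to make "identify the call chain" a bit more mechanical, one option is to walk the imports outward from the entry file and stop early. The sketch below assumes you already have an import graph as a plain object (in a real project you would derive it from your bundler or the TypeScript compiler). `collectCallChain`, `lib/db.ts`, and `lib/utils.ts` are hypothetical names for illustration.

```typescript
// Map of file path -> files it imports. Hypothetical data for illustration;
// a real graph would come from your bundler or the TypeScript compiler.
type ImportGraph = Record<string, string[]>;

function collectCallChain(
  graph: ImportGraph,
  entry: string,
  skip: (path: string) => boolean,
  maxFiles = 4
): string[] {
  const seen = new Set<string>([entry]);
  const queue = [entry];
  const chain: string[] = [];

  // Breadth-first walk: files closest to the entry point come first.
  while (queue.length > 0 && chain.length < maxFiles) {
    const current = queue.shift()!;
    chain.push(current);
    for (const dep of graph[current] ?? []) {
      if (!seen.has(dep) && !skip(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  return chain;
}

const importGraph: ImportGraph = {
  "app/api/invoices/route.ts": ["lib/invoices.ts", "lib/schemas/invoice.ts", "lib/utils.ts"],
  "lib/invoices.ts": ["lib/db.ts", "lib/utils.ts"],
  "lib/schemas/invoice.ts": [],
};

const filesToPaste = collectCallChain(
  importGraph,
  "app/api/invoices/route.ts",
  (path) => path === "lib/utils.ts" // shared utility, rarely where the bug lives
);
```

The cap of four files mirrors the rule of thumb above. It is a starting point you then prune by hand, not a replacement for knowing where the bug probably lives.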

Third: strip irrelevant code. If your component has 300 lines but the bug is in the form submission handler, paste just the handler with enough surrounding context to understand it. Comments like '// ... rest of component' are fine. The AI doesn't need to read your JSX to debug your async function.
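A minimal sketch of that trimming, assuming you already know which lines hold the handler. `excerpt` is a hypothetical helper, and the component is an invented example held in a string so the snippet stays self-contained.

```typescript
// Keeps only lines `from`..`to` (1-based, inclusive) and marks what was cut.
function excerpt(source: string, from: number, to: number): string {
  const lines = source.split("\n");
  const kept = lines.slice(from - 1, to);
  const parts: string[] = [];
  if (from > 1) parts.push("// ... rest of component");
  parts.push(...kept);
  if (to < lines.length) parts.push("// ... rest of component");
  return parts.join("\n");
}

// Hypothetical component: the submit handler sits on lines 3-5.
const component = [
  "export function InvoiceForm() {",
  "  const [error, setError] = useState<string | null>(null)",
  "  async function onSubmit(values: FormValues) {",
  "    await saveInvoice(values)",
  "  }",
  "  return <form>{/* 200 lines of JSX */}</form>",
  "}",
].join("\n");

const snippet = excerpt(component, 3, 5);
```

Here `snippet` contains the handler wrapped in two marker comments and none of the JSX. In practice you would widen the range a little so the types and state the handler depends on are still visible.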

Managing Ongoing Conversations

Long conversations degrade. This is just how it works. After 20+ back-and-forth messages, the model's effective attention to your early context shrinks, it starts contradicting earlier decisions, and you get generic suggestions. We've burned so much time not recognizing this pattern.

The fix is to treat AI conversations like Git commits: small, focused, and frequent. Start a new conversation for each distinct problem. If you've been debugging one thing and want to pivot to a different feature, start fresh. At the start of the new conversation, include a 'summary so far' if decisions from the previous conversation are relevant.

A fresh conversation with good context beats a stale conversation with perfect history. Start over more often than feels natural.

For longer sessions where you genuinely need continuity — like architecting a feature over multiple hours — keep a running summary document. Every few exchanges, paste in what's been decided. 'We decided to use optimistic updates with useOptimistic, the mutation goes through this server action, and we're not using a loading spinner because the action is fast enough.' That summary is worth more than the full transcript.
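The running summary can be as simple as a list of decisions that you re-render and paste. This is one possible shape, not a prescribed format; the `Decision` type and `renderSummary` function are hypothetical.

```typescript
// A decision log you append to during a long session.
interface Decision {
  topic: string;
  decision: string;
}

function renderSummary(decisions: Decision[]): string {
  // Later decisions on the same topic replace earlier ones,
  // so the summary never contradicts itself.
  const latest = new Map<string, string>();
  for (const d of decisions) latest.set(d.topic, d.decision);

  const lines: string[] = ["## Summary so far"];
  latest.forEach((decision, topic) => lines.push(`- ${topic}: ${decision}`));
  return lines.join("\n");
}

const decisionLog: Decision[] = [
  { topic: "Updates", decision: "optimistic updates with useOptimistic" },
  { topic: "Loading state", decision: "spinner while the action runs" },
  { topic: "Loading state", decision: "no spinner, the action is fast enough" },
];

const summary = renderSummary(decisionLog);
```

Keeping only the latest decision per topic is the design choice that matters here: the transcript remembers every reversal, while the summary only needs where you ended up.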

Structuring Prompts for Code Tasks

The format of your prompt matters as much as the content. Here's a template we actually use:

## Context
[Paste CONTEXT.md or relevant subset]

## What I'm Working On
Adding invoice PDF generation. When a user clicks "Download PDF" on an invoice,
we need to generate a PDF server-side and stream it back.

## Relevant Code
[Paste route handler + invoice service + any relevant types]

## Specific Question
Should I use a React-to-PDF library running in a server action, or generate
the PDF directly with something like PDFKit? Constraint: we're on Vercel,
so no persistent file system. The PDFs need to be consistent with our
existing invoice template styling.

## What I've Looked At
- @react-pdf/renderer: seems good but I'm not sure about server component compatibility
- PDFKit: lower level, more control, but I'd have to manually recreate the layout

Notice what this does: it gives background without dumping everything, states the constraint explicitly (Vercel, no filesystem), and shows you've done homework. The AI can now give you a real answer instead of generic advice about PDF libraries.
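If you reuse the template a lot, it is easy to turn into a small helper. The sketch below mirrors the headings of the template above; `PromptBrief` and `buildPrompt` are illustrative names, not part of any tool.

```typescript
// Fields mirror the template's headings; all names here are illustrative.
interface PromptBrief {
  context: string;
  workingOn: string;
  relevantCode: string[];
  question: string;
  alreadyTried: string[];
}

function buildPrompt(brief: PromptBrief): string {
  const sections: Array<[string, string]> = [
    ["Context", brief.context],
    ["What I'm Working On", brief.workingOn],
    ["Relevant Code", brief.relevantCode.join("\n\n")],
    ["Specific Question", brief.question],
    ["What I've Looked At", brief.alreadyTried.map((t) => `- ${t}`).join("\n")],
  ];

  // Drop empty sections so the prompt carries no hollow headings.
  return sections
    .filter(([, body]) => body.trim().length > 0)
    .map(([heading, body]) => `## ${heading}\n${body}`)
    .join("\n\n");
}

const prompt = buildPrompt({
  context: "Next.js 14 App Router, deployed on Vercel (no persistent file system).",
  workingOn: "Adding invoice PDF generation.",
  relevantCode: [],
  question: "React-to-PDF library in a server action, or PDFKit directly?",
  alreadyTried: ["@react-pdf/renderer", "PDFKit"],
});
```

Because `relevantCode` is empty in this call, the resulting prompt has four sections and skips "Relevant Code" entirely. The typed fields also act as a checklist: if you can't fill in `question`, you are probably not ready to ask yet.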

When to Use AI Files vs. Inline Context

Tools like Cursor, GitHub Copilot, and Cline have their own context management systems. Cursor reads .cursorrules (newer versions also support rule files under .cursor/rules), Cline reads .clinerules, and GitHub Copilot reads .github/copilot-instructions.md. These are different from what you manually paste into a chat window, but the principles are the same.

For IDE-integrated tools, your goal is to make the codebase itself self-documenting for the AI. This means good TypeScript types (the AI reads types as documentation), consistent naming conventions, and JSDoc comments on non-obvious functions. An AI reading a well-typed codebase can infer an enormous amount without you having to explain it.

// This is almost useless to an AI (and a human)
const process = async (data: any) => {
  const result = await db.query(data)
  return transform(result)
}

// This is a complete brief in itself
/**
 * Generates a billing report for an organization's billing period.
 * Only includes completed (non-draft) invoices.
 * Amounts are in the organization's configured currency.
 */
export async function generateBillingReport(
  organizationId: string,
  billingPeriod: { start: Date; end: Date }
): Promise<BillingReport> {
  const invoices = await db.invoice.findMany({
    where: {
      organizationId,
      status: { not: 'draft' },
      createdAt: { gte: billingPeriod.start, lte: billingPeriod.end }
    },
    include: { lineItems: true }
  })
  
  return aggregateInvoicesIntoBillingReport(invoices)
}

Good types and good naming mean the AI can understand your code from the call site without seeing the implementation. That's free context compression.

The Anti-Patterns That Kill AI Productivity

Pasting node_modules contents or auto-generated files (Prisma client output, .next build artifacts) — the AI doesn't need to see what it can infer
Including multiple unrelated bugs in one prompt — pick one problem, solve it, start fresh
Vague problem statements like 'this isn't working' without saying what you expected vs. what happened
Not including error messages — stack traces are high-signal context, include the full one
Asking the AI to read 2000 lines to find the bug rather than narrowing it down yourself first
Continuing a degraded conversation out of sunk cost — the 30 seconds to start fresh saves 30 minutes of bad output

That last one is the most expensive mistake. We've both caught ourselves doing it — you've invested 45 minutes in a conversation and the AI starts going in circles, but you keep pushing hoping it'll click. It won't. Start over.

The context window is a shared working memory between you and the AI. Your job is to be a good curator, not a data hose.
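The first anti-pattern in the list is the easiest to guard against in code. Here is a minimal sketch of a path filter; the marker list is an assumption based on a typical Next.js and Prisma layout, and `isWorthPasting` is a hypothetical name.

```typescript
// Path fragments that mark generated or vendored files. The list is an
// assumption based on a typical Next.js + Prisma layout; adjust it to yours.
const GENERATED_MARKERS = ["node_modules/", ".next/"];

function isWorthPasting(path: string): boolean {
  // Normalize Windows separators so the markers match on any platform.
  const normalized = path.replace(/\\/g, "/");
  return !GENERATED_MARKERS.some((marker) => normalized.includes(marker));
}

const candidates = [
  "app/api/invoices/route.ts",
  "node_modules/.prisma/client/index.d.ts",
  ".next/server/app/api/invoices/route.js",
  "lib/invoices.ts",
];

const toPaste = candidates.filter(isWorthPasting);
```

Only the route handler and the service file survive the filter. If your Prisma client is generated into a custom output directory, that path would need its own marker.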

Making This Part of Your Workflow

The overhead of doing context management well is maybe 2-3 minutes per session. Maintaining CONTEXT.md takes 5 minutes whenever your architecture changes. The payoff is that you stop getting confidently wrong answers and start getting answers you can actually use.

For our templates at peal.dev, we actually ship a starter CONTEXT.md with each template — pre-filled with the actual stack decisions, conventions, and architecture notes for that template. When you start a new project from a template, the AI context is already set up. It's a small thing but it saves the first hour of 'teaching the AI about your project' every single time.

The meta-skill here is treating AI interactions as engineering problems, not magic tricks. Input quality determines output quality. Garbage in, garbage out — but the garbage is subtler than you think. It's not obviously bad code. It's irrelevant-but-plausible-looking code that sends the AI down the wrong path with complete confidence.

Get disciplined about what you put in the window. Your future self — the one not debugging AI hallucinations at 2am — will thank you.