Claw Mart
Claw Mart Daily
Issue #64, June 7, 2026

Your agent needs a context window budget (or it'll crash mid-conversation)

Your agent was humming along perfectly. Then it hit message 47 of a coding session and suddenly started asking you to re-explain the entire project. Sound familiar?

You just hit the context window wall. And unlike running out of API credits, this one kills momentum instantly.

Most people think context windows are just a number — 200K tokens, 1M tokens, whatever. But in practice, your agent needs a context budget that accounts for three hidden costs:

  • Retrieval overhead: Your agent pulls in docs, previous conversations, and tool outputs. That can take 20-40% of your window before the conversation even starts.
  • Working memory: Complex tasks need scratch space. Code generation, planning, and debugging all eat tokens fast.
  • Conversation drift: Long sessions accumulate context that stops being useful but never gets pruned.
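To make those three costs concrete, here is a rough sketch of how a window might be split up. The 30% retrieval share is simply the midpoint of the 20-40% range mentioned above; the 25% working-memory share is an illustrative assumption, not a measured value, and `plan_budget` is a hypothetical helper name.

```python
# Rough context budget split. All percentages are illustrative assumptions.
WINDOW_TOKENS = 200_000
SAFETY_BUFFER = 50_000       # the 50K buffer used elsewhere in this issue
RETRIEVAL_PCT = 30           # midpoint of the 20-40% retrieval range
WORKING_MEMORY_PCT = 25      # assumed scratch space for planning and code


def plan_budget(window_tokens=WINDOW_TOKENS, buffer_tokens=SAFETY_BUFFER):
    """Split the usable window into retrieval, working memory, and conversation."""
    usable = window_tokens - buffer_tokens
    retrieval = usable * RETRIEVAL_PCT // 100
    working = usable * WORKING_MEMORY_PCT // 100
    # Whatever is left over is what the conversation itself can occupy.
    conversation = usable - retrieval - working
    return {
        "usable": usable,
        "retrieval": retrieval,
        "working_memory": working,
        "conversation": conversation,
    }


budget = plan_budget()
for name, tokens in budget.items():
    print(f"{name:>15}: {tokens:,} tokens")
```

Under these assumptions, less than half of the usable window is left for the conversation itself, which is why a long session hits the wall sooner than the headline number suggests.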

Here's the pattern that actually works: rolling context with anchor points.

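# Context Budget Manager
CONTEXT_LIMIT = 150_000   # Leave a 50K buffer on a 200K window
ANCHOR_THRESHOLD = 0.8    # Summarize at 80% of the budget (120K tokens)
RECENT_MESSAGES = 10      # Messages kept verbatim at the tail

def manage_context(conversation_history, current_tokens):
    # create_session_summary() is not defined here; supply your own
    # (typically an LLM call that condenses a list of messages).
    if current_tokens <= CONTEXT_LIMIT * ANCHOR_THRESHOLD:
        return conversation_history

    system_prompt = conversation_history[0]
    middle = conversation_history[1:-RECENT_MESSAGES]
    recent = conversation_history[1:][-RECENT_MESSAGES:]
    if not middle:
        return conversation_history  # Nothing between the edges to compress

    # Create anchor point from the middle only, so the recent
    # messages are not summarized and kept verbatim at the same time
    summary = create_session_summary(middle)

    # Keep only: system prompt + anchor + recent messages
    return [
        system_prompt,
        {"role": "assistant", "content": f"Session summary: {summary}"},
        *recent,
    ]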

The key insight: summarize the middle, keep the edges. Your agent needs the original instructions (system prompt) and recent context (last few exchanges), but everything in between can be compressed into a summary.

Pro tip: Build anchor points around natural breakpoints — completed tasks, major decisions, or context switches. Don't just count tokens.
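One way to act on this is to cut at the most recent breakpoint instead of at a fixed message count. The sketch below assumes each message can carry an optional `breakpoint` flag that your agent sets when a task finishes or the topic changes. That flag is hypothetical (no standard chat message format provides it), and so are the function names.

```python
def find_anchor_index(history, min_recent=4):
    """Return the index where the verbatim tail should start.

    Looks for the latest message flagged as a breakpoint that still leaves
    at least `min_recent` messages after it. Falls back to a plain
    count-based cut when no flagged message qualifies.
    """
    latest_allowed = len(history) - min_recent
    # Index 0 is the system prompt, so the scan stops at index 1.
    for i in range(latest_allowed - 1, 0, -1):
        if history[i].get("breakpoint"):
            return i + 1  # the tail starts right after the breakpoint
    return max(1, latest_allowed)


def split_at_anchor(history, min_recent=4):
    """Split history into (system_prompt, middle_to_summarize, recent_tail)."""
    cut = find_anchor_index(history, min_recent)
    return history[0], history[1:cut], history[cut:]


# Example: ten messages with two flagged breakpoints.
history = [{"role": "system", "content": "You are a coding agent."}]
history += [{"role": "user", "content": f"message {n}"} for n in range(1, 10)]
history[3]["breakpoint"] = True   # e.g. a finished refactor
history[8]["breakpoint"] = True   # too recent to cut at with min_recent=4

system_prompt, middle, recent = split_at_anchor(history)
print(len(middle), "messages to summarize,", len(recent), "kept verbatim")
```

The `middle` list is what you would hand to your summarizer; the token threshold still decides when to prune, while the breakpoint decides where.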

For coding agents, this is critical. You want to preserve:

  • The original project spec and requirements
  • Current file structure and key decisions
  • Recent code changes and their rationale
  • Active debugging context

But you can safely compress:

  • Completed feature discussions
  • Resolved bugs and their solutions
  • Exploratory code that got scrapped
  • Verbose tool outputs from earlier in the session
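The two lists above map naturally onto a tagging scheme. As a minimal sketch, assume your agent labels each message with a `kind` field when it is created. The field and the label names below are made up for illustration and simply mirror the "safely compress" bullets; anything without one of those labels is kept.

```python
# Hypothetical labels mirroring the "safely compress" list.
COMPRESS_KINDS = {"completed_feature", "resolved_bug",
                  "scrapped_code", "tool_output"}


def partition_for_pruning(messages, keep_last=10):
    """Split messages into (keep_verbatim, summarize) lists.

    A message is summarized only if its `kind` is compressible AND it is
    older than the last `keep_last` messages. Untagged or unknown kinds are
    kept, on the assumption that losing something important costs more
    than a few extra tokens.
    """
    keep, summarize = [], []
    cutoff = max(0, len(messages) - keep_last)
    for index, message in enumerate(messages):
        compressible = message.get("kind") in COMPRESS_KINDS
        if compressible and index < cutoff:
            summarize.append(message)
        else:
            keep.append(message)
    return keep, summarize


session = [
    {"kind": "project_spec", "content": "Build a billing service."},
    {"kind": "tool_output", "content": "...long test log..."},
    {"kind": "resolved_bug", "content": "Fixed rounding error."},
    {"content": "Untagged note."},
    {"kind": "tool_output", "content": "...latest test log..."},
]
keep, summarize = partition_for_pruning(session, keep_last=1)
print([m["content"] for m in summarize])
```

The `keep_last` guard is there because only tool outputs from earlier in the session are safe to compress; the latest one is usually part of the active debugging context.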

The result? Your agent maintains coherence across marathon coding sessions without suddenly forgetting what you're building.

I learned this the hard way during a 6-hour refactoring session. My agent kept losing track of the migration plan and asking me to re-explain database schemas it had been working with for hours. Now I run all my coding agents with rolling context, and they stay sharp from start to finish.

If you're running long coding sessions and hitting context limits, you need a system that handles this automatically. The Ralph loops in our coding agent setup include context management that keeps sessions running smoothly, even when you're deep in complex refactors.
