AI Agents
October 1, 2025

Fighting Context Rot: The Essential Skill to Engineering Smarter AI Agents (According to Anthropic)

Context engineering has emerged as the critical skill for building capable AI agents. Learn how to manage attention budgets, implement just-in-time retrieval, and master the techniques for building agents that know when to remember and when to forget.


Key Takeaways

  • Context engineering has replaced prompt engineering as the critical skill for building capable AI agents

  • AI agents have limited attention budgets—every unnecessary token actively degrades performance

  • Just-in-time context retrieval outperforms loading all data upfront

  • Three proven strategies enable long-horizon tasks: compaction, structured note-taking, and multi-agent architectures

  • Every tool in your agent's arsenal must earn its place in the context window

The art of building AI agents has fundamentally changed. It's no longer about finding the perfect prompt; it's about preserving context while orchestrating an entire information ecosystem.

Context engineering has emerged as the critical skill for building capable AI agents, shifting our focus from crafting perfect prompts to managing the complete information environment that powers intelligent behavior.

This evolution reflects a simple reality: as AI agents tackle more complex, multi-step tasks, success depends less on clever phrasing and more on strategically curating what information enters the model's limited attention window. Anthropic's latest research reveals that effective agents require us to think holistically about context—system prompts, tools, examples, message history, and runtime data retrieval all compete for the same finite resource.

The core challenge? Finding the smallest possible set of high-signal tokens that maximize the likelihood of desired outcomes. Every unnecessary word, every redundant tool description, every piece of stale data actively degrades your agent's performance.

Why It Matters

AI agents have an attention budget, and it's smaller than you might think.

Research on context degradation reveals a sobering truth: as context windows fill, model accuracy drops. This "context rot" isn't a bug; it's an architectural reality. The transformer architecture that powers modern LLMs creates n² pairwise relationships between tokens. At 10,000 tokens, that's 100 million relationships to track. At 100,000 tokens? 10 billion.

The implications are stark:

  • Information retrieval accuracy decreases as contexts grow longer
  • Long-range reasoning suffers when attention gets stretched thin
  • Agent coherence breaks down without careful context management

Like human working memory, LLMs lose focus when overwhelmed.

The difference? Humans naturally filter and prioritize. But AI agents need you to engineer that filtering for them.

The Right Altitude

Effective context engineering operates at the "Goldilocks zone" between two failure modes.

  • Too prescriptive: Hardcoding complex if-else logic into prompts creates brittle agents that break when encountering edge cases. We've all seen 2,000-word system prompts trying to anticipate every scenario.
  • Too vague: High-level guidance like "be helpful and accurate" provides no concrete signals for desired behavior. The model fills in the gaps with assumptions that may not match your intent.

The sweet spot: Specific enough to guide behavior, flexible enough to adapt. Structure your prompts with clear sections. Use XML tags or markdown headers. Provide diverse, canonical examples that demonstrate expected behavior rather than listing edge cases.
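As a quick illustration, here's what a prompt at that altitude might look like. The tag names and content are hypothetical, just one way to apply the structure:

```typescript
// A sketch of a system prompt at the "right altitude": clear sections,
// one canonical example, no hardcoded if-else logic. Tags are illustrative.
const systemPrompt = `
<role>
You are a support agent for an internal engineering wiki.
</role>

<guidelines>
- Answer only from retrieved documents; say "I don't know" otherwise.
- Prefer linking to the source page over quoting it at length.
</guidelines>

<example>
User: Where is our deploy process documented?
Assistant: The deploy runbook lives at /ops/deploy. In short: ...
</example>
`;
```

Notice that the example demonstrates the desired behavior rather than enumerating failure cases; the model generalizes from the pattern.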

Just-in-Time Context

The most sophisticated agents don't load everything upfront; instead, they retrieve information precisely when it's needed.

Traditional approaches dump all potentially relevant data into context before inference. Modern agents take a different path: they maintain lightweight references (file paths, database queries, API endpoints) and dynamically load information at runtime.

Claude Code exemplifies this approach. When analyzing large databases, it doesn't load entire datasets. Instead, it:

  • Writes targeted SQL queries to extract specific data
  • Uses bash commands like head and tail to sample files
  • Maintains a working set of only the most relevant information

This mirrors human cognition. You don't memorize entire libraries—you organize information and retrieve it on demand. File systems, bookmarks, and search queries are your tools for just-in-time context. Your agents should work the same way.
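To make the pattern concrete, here's a minimal sketch of just-in-time retrieval. The helper names are hypothetical, but the shape is the point: store references, sample on demand, and query narrowly:

```typescript
import { execFileSync } from "node:child_process";

// Keep a lightweight reference, not the content itself.
type FileRef = { path: string; sizeBytes: number };

// Sample the first N lines instead of loading the whole file into context.
function sampleFile(ref: FileRef, lines = 20): string {
  return execFileSync("head", ["-n", String(lines), ref.path], {
    encoding: "utf8",
  });
}

// Pull only the rows the current step needs via a targeted query.
// (A real implementation should parameterize the SQL, not interpolate.)
function queryRecentOrders(dbPath: string, since: string): string {
  const sql = `SELECT id, status FROM orders WHERE created_at > '${since}' LIMIT 50;`;
  return execFileSync("sqlite3", [dbPath, sql], { encoding: "utf8" });
}
```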

Progressive disclosure becomes your friend. Each interaction yields context that informs the next decision. File sizes suggest complexity. Naming conventions hint at purpose. Timestamps indicate relevance. Agents assemble understanding layer by layer, maintaining only what's necessary in working memory.

Techniques for Long Horizons

Three proven strategies enable agents to work effectively beyond context window limits.

1. Compaction

When approaching context limits, summarize and reinitialize. Claude Code demonstrates this elegantly—it compresses conversation history while preserving architectural decisions, unresolved bugs, and key implementation details. The art lies in knowing what to keep versus what to discard.

Start with tool result clearing. Once a tool has been called deep in message history, the raw output rarely needs to be seen again. This simple optimization can recover thousands of tokens without losing functionality.
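Here's a minimal sketch of tool result clearing, assuming a simplified message shape:

```typescript
// Once a tool result sits deep in history, replace its raw output with a
// stub so those tokens can be reclaimed without losing the call record.
type Message =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; toolName: string; content: string };

function clearOldToolResults(history: Message[], keepLast = 5): Message[] {
  const cutoff = history.length - keepLast;
  return history.map((msg, i) =>
    msg.role === "tool" && i < cutoff
      ? { ...msg, content: `[output of ${msg.toolName} cleared to save tokens]` }
      : msg
  );
}
```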

2. Structured Note-Taking

Give your agents external memory. A simple NOTES.md file or structured todo list enables persistence across context resets. Claude playing Pokémon maintains precise tallies across thousands of game steps—tracking training progress, remembering combat strategies, maintaining maps of explored regions. After context resets, it reads its own notes and continues multi-hour sequences seamlessly.
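A sketch of what that external memory can look like in practice (the NOTES.md format here is illustrative):

```typescript
import { readFileSync, appendFileSync, existsSync } from "node:fs";

const NOTES_PATH = "NOTES.md";

// Record a decision or fact that must survive a context reset.
function appendNote(section: string, note: string): void {
  appendFileSync(NOTES_PATH, `\n## ${section}\n- ${note}\n`);
}

// On a fresh context, this becomes the first thing the agent reads.
function loadNotes(): string {
  return existsSync(NOTES_PATH) ? readFileSync(NOTES_PATH, "utf8") : "";
}
```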

3. Multi-Agent Architectures

Complex tasks benefit from specialized sub-agents with clean contexts. The lead agent maintains high-level coordination while sub-agents handle focused work. Each sub-agent might use tens of thousands of tokens exploring solutions but returns only condensed summaries (1,000-2,000 tokens). This separation of concerns prevents context pollution while enabling deep technical work.
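The contract between lead agent and sub-agent can be as simple as this sketch, where runSubAgent stands in for a hypothetical helper that runs an LLM call in its own clean context:

```typescript
// The sub-agent may burn tens of thousands of tokens exploring, but the
// lead agent only ever sees the condensed summary it returns.
type RunSubAgent = (opts: {
  systemPrompt: string;
  task: string;
  responseInstruction: string;
}) => Promise<string>;

async function investigate(
  runSubAgent: RunSubAgent,
  question: string
): Promise<string> {
  return runSubAgent({
    systemPrompt: "You are a research sub-agent. Explore deeply, then condense.",
    task: question,
    // Cap what flows back up to the coordinator's context.
    responseInstruction: "Return a summary under 1,500 tokens.",
  });
}
```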

Tools That Work

Every tool in your agent's arsenal must earn its place in the context window.

Avoid tool proliferation. I'm guilty of this myself when building my own agents; the temptation to hand the model everything and "let it figure things out" is all too real.

If you (the prompt engineer) can't definitively say which tool to use in a given situation, neither can your agent. Each tool should be:

  • Self-contained: Complete functionality without dependencies
  • Unambiguous: Clear, non-overlapping purposes
  • Token-efficient: Minimal descriptions, focused parameters
  • Error-robust: Graceful handling of edge cases

The goal isn't comprehensive coverage—it's optimal coverage. Five well-designed tools outperform twenty overlapping ones.
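For a sense of what "earning its place" looks like, here's a sketch of a single well-scoped tool definition using a zod schema. The shape is illustrative, not tied to any particular framework:

```typescript
import { z } from "zod";

// One self-contained, unambiguous tool. The description is terse and the
// parameters are focused, because every token here is paid on every request.
const searchTickets = {
  name: "search_tickets",
  description: "Search support tickets by keyword. Returns id, title, status.",
  parameters: z.object({
    query: z.string().describe("Keywords to match against ticket titles"),
    limit: z.number().int().max(20).default(5),
  }),
};
```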

What's Next

Start implementing context engineering today:

  1. Audit existing prompts - Cut 30% of words without losing meaning
  2. Implement just-in-time retrieval - Store references, not content
  3. Add structured note-taking - A simple markdown file transforms capabilities

The Bottom Line

Context engineering represents a fundamental shift in how we build AI systems. As models become more capable, the limiting factor isn't intelligence—it's attention management.

Treat every token as precious. Engineer information flow, not just instructions. Build agents that know when to remember and when to forget.

Smarter context is now a necessity.

How Inkeep Helps You Master Context Engineering

At Inkeep, we've built our AI agent platform with context engineering at its core. Our agents use dynamic context fetchers and request context to retrieve information precisely when needed.

Dynamic Context Management: Our Context Fetchers embed real-time data from external APIs into agent prompts, dynamically retrieving fresh data for each conversation rather than hardcoding information. Agents can fetch data on initialization or per invocation, with built-in data transformation using JSONPath notation.
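As a generic illustration of that idea (this is not Inkeep's actual API; it's a sketch using the jsonpath-plus package, with a hypothetical endpoint), a context fetcher boils a fresh API response down to a few prompt-ready facts:

```typescript
import { JSONPath } from "jsonpath-plus";

// Fetch fresh data at conversation start, extract only the fields the
// prompt needs via JSONPath, and interpolate them into the agent prompt.
async function buildUserContext(userId: string): Promise<string> {
  const res = await fetch(`https://api.example.com/users/${userId}`);
  const json = await res.json();
  const plan = JSONPath({ path: "$.subscription.plan", json })[0];
  const openTickets = JSONPath({ path: "$.tickets[?(@.status=='open')].id", json });
  return `The user is on the ${plan} plan with ${openTickets.length} open tickets.`;
}
```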

Multi-Agent Architecture: Deploy specialized agents within graphs that transfer control or delegate subtasks to each other. Each agent maintains focused contexts for specific tasks while collaborating seamlessly through our graph system.

Request Context Integration: Pass dynamic context via HTTP headers for personalized interactions. Context values are validated, cached per conversation, and made available throughout your agent system for context fetchers, external tools, and agent prompts.

Structured Memory Through Artifacts: Agents can save information from tool call results as artifacts, making it available to other agents and users. This enables persistent memory and knowledge sharing across agent interactions.

Whether you're building customer support automation, internal knowledge assistants, or complex multi-agent systems, Inkeep provides the infrastructure to implement dynamic context management with features like context fetchers, request context validation, and multi-agent coordination.

Frequently Asked Questions

What is context rot?

Context rot is the degradation of model accuracy as context windows fill up. As more tokens are added, the transformer architecture struggles to track the relationships between them, degrading both information retrieval accuracy and long-range reasoning.

How does just-in-time context retrieval work?

Instead of loading all data upfront, maintain lightweight references (file paths, database queries, API endpoints) and dynamically load information at runtime. Use targeted queries and sampling commands, and keep only the most relevant information in working memory.

What is the ideal context length?

There's no single ideal length, but the goal is to find the smallest possible set of high-signal tokens that maximize the likelihood of desired outcomes. Every unnecessary word, redundant tool description, or piece of stale data degrades performance.

When should I use a multi-agent architecture?

Complex tasks benefit from specialized sub-agents when a single context would become too polluted. The lead agent maintains high-level coordination while sub-agents handle focused work in clean contexts, returning only condensed summaries.

What makes a good agent tool?

Each tool should be self-contained, unambiguous, token-efficient, and error-robust. Five well-designed tools outperform twenty overlapping ones. If you can't definitively say which tool to use in a given situation, neither can your agent.


See Inkeep Agents in action for your specific use case.