Context engineering: memory, compaction, and tool clearing

Mar 2026 • Tools • Agent Patterns

Compare context engineering strategies for long-running agents and learn when each applies, what it costs, and how they compose.
This cookbook teaches context engineering strategies for long-running AI agents, focusing on three key techniques: compaction (summarizing context), tool-result clearing (removing re-fetchable tool outputs), and memory (persistent external storage). The guide addresses context rot, the degradation of model performance as context windows grow, and provides practical implementations using Claude's API. Through a research agent example, it demonstrates how to combine these strategies to manage token growth, maintain conversation continuity, and persist knowledge across sessions.
Key Points
- Context rot occurs as token count increases, reducing the model's ability to recall information accurately well before hard token limits are hit
- Compaction distills context into high-fidelity summaries, allowing agents to continue long conversations with minimal performance degradation
- Tool-result clearing removes old, re-fetchable tool outputs (file reads, API responses) while keeping the call records, reducing context bloat
- Memory implements structured note-taking via persistent external storage, enabling agents to track progress across tasks and sessions without keeping everything in active context
- All three strategies have first-party API support: server-side compaction, context editing (tool-result clearing), and the memory tool
- Map workload characteristics to the right primitive: clearing for large re-fetchable results, compaction for long conversations, memory for cross-session persistence
- Claude Code uses multiple strategies in production: compaction for conversations and dual memory systems for cross-session persistence
- Context is a finite resource with diminishing marginal returns; the goal is finding the smallest set of high-signal tokens that maximizes the desired outcome
- Two complementary strategies round out the toolkit: subagents isolate work in separate contexts, and programmatic tool calling keeps large results out of the window entirely
- Test clearing configs and compaction prompts against your workload's actual tool-use patterns to diagnose which context problem needs solving
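Tool-result clearing, the second point above, can be sketched without any server-side support: walk the message history and replace stale `tool_result` blocks with a short placeholder, keeping the `tool_use` call record so the model still sees what it fetched and can re-fetch if needed. A minimal illustration (the message shape follows the Messages API; the `keep_last` threshold and placeholder wording are assumptions, not the first-party context-editing behavior):

```python
# Sketch: clear old, re-fetchable tool results from a message history.
# The most recent `keep_last` tool results stay intact; earlier ones are
# replaced with a placeholder so the call record survives but the bulk is gone.

PLACEHOLDER = "[tool result cleared; re-run the tool if this content is needed]"

def clear_tool_results(messages: list[dict], keep_last: int = 3) -> list[dict]:
    # Indices of user messages that carry tool_result blocks, oldest first.
    result_idxs = [
        i for i, m in enumerate(messages)
        if m["role"] == "user"
        and isinstance(m["content"], list)
        and any(b.get("type") == "tool_result" for b in m["content"])
    ]
    to_clear = set(result_idxs[:-keep_last] if keep_last else result_idxs)
    cleared = []
    for i, m in enumerate(messages):
        if i in to_clear:
            new_blocks = [
                {**b, "content": PLACEHOLDER} if b.get("type") == "tool_result" else b
                for b in m["content"]
            ]
            cleared.append({**m, "content": new_blocks})
        else:
            cleared.append(m)
    return cleared
```

Returning a new list rather than mutating in place makes it easy to compare token counts before and after clearing.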
Artifacts (4)
Environment Setup (config)

```
ANTHROPIC_API_KEY=your-key-here
```

Python Dependencies Installation (bash command)

```bash
pip install anthropic python-dotenv matplotlib
```

Context Engineering Setup Script (python)

```python
import json
import os
import tempfile
from collections import namedtuple
from pathlib import Path

import anthropic
import matplotlib.pyplot as plt
from dotenv import load_dotenv

load_dotenv()
if not os.environ.get("ANTHROPIC_API_KEY"):
    raise ValueError("ANTHROPIC_API_KEY not set. Add it to a .env file or export it.")

CORPUS_PATH = Path("research_corpus.py")
assert CORPUS_PATH.exists(), (
    f"research_corpus.py not found in {Path.cwd()}. It should be alongside this notebook."
)

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-6"
print(f"anthropic SDK {anthropic.__version__}, model {MODEL}")

# Token counting utility (cached so repeated lookups don't re-hit the API)
_token_cache: dict[str, int] = {}

def count_tokens(text: str) -> int:
    if text not in _token_cache:
        _token_cache[text] = client.messages.count_tokens(
            model=MODEL,
            messages=[{"role": "user", "content": text}],
        ).input_tokens
    return _token_cache[text]
```
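Compaction can be wrapped around a client and token counter like the ones above. The sketch below triggers when the history grows past a token budget and replaces the oldest turns with a single summary turn. The budget, the `keep_recent` cutoff, and the summary framing are assumptions; a real implementation would pass a `summarize` function that calls `client.messages.create` with a summarization prompt:

```python
# Sketch: compact a conversation once it exceeds a token budget.
# `token_count` and `summarize` are injected so the logic stays testable;
# in practice summarize() would be a Claude call producing a high-fidelity summary.
from typing import Callable

def compact(
    messages: list[dict],
    token_count: Callable[[str], int],
    summarize: Callable[[list[dict]], str],
    budget: int = 100_000,   # assumption: tune to your context window
    keep_recent: int = 4,    # assumption: recent turns kept verbatim
) -> list[dict]:
    total = sum(token_count(str(m["content"])) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(old)
    return [
        {"role": "user", "content": f"[Summary of earlier conversation]\n{summary}"}
    ] + recent
```

Keeping the most recent turns verbatim preserves local coherence while the summary carries the long tail of the conversation.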
Corpus Analysis Script (python)

```python
from research_corpus import COMPACTION_PROBES, CORPUS

print(f"CORPUS is a dict of {len(CORPUS)} synthetic documents held in Python memory.")
print("When the agent calls read_file, the content is served from this dict and")
print("lands directly in the agent's context window — no disk I/O involved.\n")

_total_tokens = 0
for path, content in CORPUS.items():
    n_tok = count_tokens(content)
    _total_tokens += n_tok
    display_name = path.removeprefix("/research/")
    print(f"{display_name:<26} ~ {n_tok:>6,} tokens")

print(f"\nTotal corpus: ~ {_total_tokens:,} tokens")
assert _total_tokens > 250_000, (
    f"Corpus is only {_total_tokens:,} tokens; expected >250K. "
    "Restart the kernel and re-run, or verify research_corpus.py is current."
)
```