Implementing Compaction
Lecture 7.3
- Detecting the threshold: when to trigger compaction (e.g., 80% of context budget)
- The compaction prompt as the central engineering artifact
- Wiring compaction into the agent loop at the same hook point as truncation
- The hybrid pattern: summary + last K verbatim messages for safety
- Use a cheap model (Haiku) for summarization; reserve the main model for the work
- Edge cases: oversized summaries (iterative compression), critical detail loss (hybrid), compaction loops (minimum interval)
- Cost analysis: one extra Haiku call vs. tokens saved on every subsequent main-model call
Additional Resources