Token-Efficient Tool Design

Module 4, Lecture 4.3 | Section 2: Working with LLMs in Practice

Tool results account for 50–70% of tokens in a typical agent session — and unlike system prompts or user messages, tool design is entirely under the engineer's control. This lecture shifts from reactive context management to proactive prevention: how to design tools that return the minimum the model needs, rather than everything available. The core pattern is progressive disclosure — metadata first, specific content on demand — illustrated through a concrete before/after redesign of a naive read_file tool. Pagination and summarized results extend the same principle to search tools and command output.
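The before/after redesign described above can be sketched roughly as follows. This is a minimal illustration, not the lecture's actual implementation: the function names (read_file_naive, file_info, read_lines, make_demo_file), the parameters (preview_lines, start, limit), and the next_start cursor field are all hypothetical names chosen for this example. The naive tool returns the whole file; the redesigned pair returns metadata first and then only the requested line window, which also serves as simple pagination.

```python
import os
import tempfile


def read_file_naive(path: str) -> str:
    """Naive design: the entire file lands in the model's context."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def file_info(path: str, preview_lines: int = 3) -> dict:
    """Step 1 of progressive disclosure: metadata plus a short preview.

    (Hypothetical tool; field names are assumptions for illustration.)
    """
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    return {
        "path": path,
        "bytes": os.path.getsize(path),
        "line_count": len(lines),
        "preview": lines[:preview_lines],
    }


def read_lines(path: str, start: int = 1, limit: int = 50) -> dict:
    """Step 2: return only a 1-indexed window of lines, with a cursor.

    next_start is None when the window reaches the end of the file.
    """
    if start < 1 or limit < 1:
        raise ValueError("start and limit must be >= 1")
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    window = lines[start - 1 : start - 1 + limit]
    end = start + len(window) - 1
    next_start = end + 1 if end < len(lines) else None
    return {"start": start, "end": end, "lines": window, "next_start": next_start}


def make_demo_file(n_lines: int) -> str:
    """Write a throwaway file with n_lines numbered lines; return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as f:
        for i in range(1, n_lines + 1):
            f.write(f"line {i}\n")
    return path
```

In this sketch the model would first call file_info to learn the file's size and shape, then request specific ranges with read_lines and follow next_start only if it needs more. The same metadata-then-window shape could plausibly be applied to search results or command output, though the right page size and preview length depend on the tool and the task.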

Read the full lecture narrative

Additional Resources

Lecture slides
Tool Use with Claude — Anthropic Docs — How to define and use tools with the Claude API, including token cost breakdowns for tool definitions and the tool result lifecycle.
Token Counting — Anthropic Docs — Pre-call token estimation for messages that include tool definitions and tool results.
Messages API Reference — Anthropic Docs — Full reference for the tool_use and tool_result content block schemas that carry tool results in context.
Effective Context Engineering for AI Agents — Anthropic Engineering Blog — Anthropic's own post on context as a finite resource, with recommendations for token-efficient tool design, progressive disclosure, and just-in-time retrieval.
Function Calling — OpenAI API — OpenAI's counterpart documentation, with a "Best Practices for Defining Functions" section showing that token-aware tool design is a cross-platform concern.
Solving Context Window Overflow in AI Agents (arXiv, 2025) — Research paper proposing external storage and memory pointers for large tool results, achieving ~7x token reduction — academic grounding for the progressive disclosure approach.