Module 4, Lecture 4.1 | Section 2: Working with LLMs in Practice
Context windows have limits, but quality degrades before you hit them. This lecture makes context growth visible: how to count tokens per API call, where tokens actually accumulate during an agent task (spoiler: tool results dominate), and how to set a budget threshold that lets you act before the window fills. A live code demo also shows how a simple five-turn conversation can accumulate thousands of input tokens in minutes.
The usage object is returned on every API response, with input_tokens and output_tokens fields.
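As a rough illustration of per-call counting and a budget threshold, the sketch below records the input_tokens and output_tokens of each call and flags the first call whose input size crosses a chosen fraction of the window. It makes no real API calls: `TokenBudget`, the window size, the threshold, and the per-turn numbers are all hypothetical values chosen for illustration, not measurements. In a real client you would pass in the values read from each response's usage object.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Hypothetical helper: tracks per-call usage against a context budget."""
    context_window: int = 200_000  # assumed window size; check your model's limit
    threshold: float = 0.6         # assumed fraction of the window at which to act
    history: list = field(default_factory=list)

    def record(self, usage: dict) -> bool:
        """Store one call's usage; return True once its input size crosses the threshold."""
        self.history.append((usage["input_tokens"], usage["output_tokens"]))
        return usage["input_tokens"] >= self.threshold * self.context_window

    def total_input(self) -> int:
        """Sum of input tokens across all recorded calls."""
        return sum(inp for inp, _ in self.history)


# Made-up usage values for five turns. Input grows each turn because the
# whole conversation so far is sent again with every call.
simulated_usage = [
    {"input_tokens": 150, "output_tokens": 90},
    {"input_tokens": 420, "output_tokens": 110},
    {"input_tokens": 760, "output_tokens": 130},
    {"input_tokens": 1100, "output_tokens": 140},
    {"input_tokens": 1500, "output_tokens": 160},
]

# A deliberately small window so the threshold is reached within five turns.
budget = TokenBudget(context_window=2_000, threshold=0.5)
flags = [budget.record(usage) for usage in simulated_usage]

for turn, (usage, over) in enumerate(zip(simulated_usage, flags), start=1):
    status = "over budget" if over else "ok"
    print(f"turn {turn}: input={usage['input_tokens']} output={usage['output_tokens']} {status}")
print("total input tokens:", budget.total_input())
```

The check uses the latest call's input size rather than the running total, since the latest input size is what occupies the window on that call, while the running total reflects how many input tokens were processed overall.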