class: center, middle, inverse
count: false

# The Agent Loop

???

~20 minutes. Live coding lecture. Students build the agent loop from scratch with stub tools. Focus: two-loop architecture, stop_reason branching, messages array growth.

---

# Two Loops, One Agent

An agent handles two different kinds of work:

1. **Conversation**: getting input from the user, surfacing the final reply
2. **Tool execution**: calling tools, feeding results back, letting the model decide what's next

These operate at different timescales. A single user message might trigger five tool calls internally. A single loop can't express both.

.split-left[
**Outer loop**: one iteration per user message

```
get user input
append to messages
run inner loop
print reply
repeat
```
]

.split-right[
**Inner loop**: one iteration per API call

```
call API
check stop_reason
if tool_use → execute, loop
if end_turn → surface reply, break
```
]
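
The two outlines above can be sketched as runnable Python. This is a minimal sketch, not the lecture's `agent.py`: `make_fake_model`, `run_inner_loop`, and `run_outer_loop` are hypothetical names, the model is replaced by a scripted stub, and tool execution is a placeholder string.

```python
def make_fake_model(script):
    """Hypothetical stand-in for the API: replays scripted (stop_reason, text) pairs."""
    steps = iter(script)

    def call_model(messages):
        stop_reason, text = next(steps)
        return {"stop_reason": stop_reason, "text": text}

    return call_model


def run_inner_loop(call_model, messages):
    """Inner loop: one iteration per API call."""
    calls = 0
    while True:
        response = call_model(messages)
        calls += 1
        messages.append({"role": "assistant", "content": response["text"]})
        if response["stop_reason"] == "tool_use":
            # Placeholder: a real agent executes the requested tool here.
            messages.append({"role": "user", "content": "stub tool result"})
            continue
        return response["text"], calls


def run_outer_loop(call_model, user_inputs):
    """Outer loop: one iteration per user message."""
    messages, transcript = [], []
    for user_input in user_inputs:  # stands in for reading from the terminal
        messages.append({"role": "user", "content": user_input})
        reply, calls = run_inner_loop(call_model, messages)
        transcript.append((reply, calls))
    return transcript


fake = make_fake_model([
    ("tool_use", ""), ("tool_use", ""), ("end_turn", "Done."),  # first user message
    ("end_turn", "Hi!"),                                         # second user message
])
transcript = run_outer_loop(fake, ["Create hello.py", "Say hi"])
print(transcript)  # [('Done.', 3), ('Hi!', 1)]
```

The first user message costs three API calls and the second costs one, yet the outer loop advances exactly once for each.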
???

90 seconds. The key insight: these are separate concerns. The user sees one exchange; the model may have done five rounds of tool calling internally.

---

# stop_reason Is the Entire Control Flow

.split-left[
.center[
]
]

.split-right[
Two states. One branch. That's the entire agent.

- **`"tool_use"`**: the model needs more information. Execute the requested tool, send the result back.
- **`"end_turn"`**: the model is done. Surface the text reply, return to the outer loop.

The API defines other `stop_reason` values (for example `"max_tokens"`), but these two drive the loop.

.callout[Agents built on the Anthropic API, from 200-line scripts to Claude Code, share this core branching structure.]
]
???

90 seconds. Emphasize simplicity. The complexity lives in the tools, prompt, and context management, not in the loop itself.

Image prompt for `agent-loop-state-machine.png`: "A state machine diagram with two states. Clean, minimal flat design. Top: rounded rectangle labeled 'Call API with messages array'. Arrow down to a diamond decision node labeled 'stop_reason?'. Two branches from the diamond: Left branch labeled 'end_turn' goes to a rounded rectangle 'Surface text reply → break to outer loop'. Right branch labeled 'tool_use' goes to a rounded rectangle 'Execute tools, append tool_result as role: user'. An arrow loops back from this rectangle up to the 'Call API' rectangle. Colors: teal/blue tones, white background, sans-serif labels. No decorative elements."

---
class: center, middle

# Let's revisit the agent loop code itself

???

Open agent.py and walk through the full implementation live.

---

# Five Lines That Matter

1. **`messages.append({"role": "assistant", "content": response.content})`**
   The *full* content list goes in, tool_use blocks and all. The model needs to see its own requests on the next iteration.

2. **`if response.stop_reason == "tool_use"`**
   The only branch. Everything else follows from this.

3. **`"tool_use_id": block.id`**
   Ties each result to its request. The model may issue multiple tool calls per response.

4. **`messages.append({"role": "user", "content": tool_results})`**
   Tool results enter as **role: "user"**. They come from outside the model (the agent code ran them), so they re-enter from the user side.

5. **`break`**
   The only exit. The inner loop runs until `end_turn`.

???

2 minutes. These five lines are the entire agent mechanism. Everything else is setup.
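---

# The Five Lines in Context

A minimal sketch of where the five lines sit inside the inner loop. It is not the lecture's `agent.py`: `run_inner_loop`, `call_api`, and the `TOOLS` stub are hypothetical, and `SimpleNamespace` objects stand in for the SDK's response and content blocks (assumed to expose `stop_reason`, `content`, `type`, `id`, `name`, `input`, and `text`). The scripted response deliberately carries two `tool_use` blocks to show that every block is collected.

```python
from types import SimpleNamespace as NS

# Hypothetical stub tool, keyed by the name the model requests.
TOOLS = {"read_file": lambda filename: f"<contents of {filename}>"}


def run_inner_loop(call_api, messages):
    while True:
        response = call_api(messages)
        # 1. The full content list goes back in, tool_use blocks included.
        messages.append({"role": "assistant", "content": response.content})
        # 2. The only branch.
        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue
                output = TOOLS[block.name](**block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,  # 3. Ties the result to its request.
                    "content": output,
                })
            # 4. Tool results re-enter as a single user turn.
            messages.append({"role": "user", "content": tool_results})
        else:
            break  # 5. The only exit.
    return "".join(b.text for b in response.content if b.type == "text")


# Scripted stand-in for the API: one tool round, then a final text reply.
scripted = iter([
    NS(stop_reason="tool_use", content=[
        NS(type="tool_use", id="tu_004", name="read_file", input={"filename": "utils.py"}),
        NS(type="tool_use", id="tu_005", name="read_file", input={"filename": "tests.py"}),
    ]),
    NS(stop_reason="end_turn", content=[NS(type="text", text="Done.")]),
])
messages = [{"role": "user", "content": "Summarize utils.py and tests.py"}]
reply = run_inner_loop(lambda msgs: next(scripted), messages)
print(reply, [m["role"] for m in messages])
```

With a real client, `call_api` would wrap the actual API request; nothing else in the loop would need to change under these assumptions.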
---

# Tracing the Messages Array: Single Tool Call

"Create hello.py with a hello world function"

```
[0] role: user       "Create hello.py with a hello world function"
[1] role: assistant  [tool_use: edit_file(path="hello.py", old_str="", ...)]
[2] role: user       [tool_result: "Created hello.py"]
[3] role: assistant  [text: "Done. Created hello.py."]
```

Inner loop ran **twice**: tool request → final reply.

The user sees only `[0]` and `[3]`. Entries `[1]` and `[2]` are invisible: they are the conversation between the agent code and the model.

???

60 seconds. Trace through the entries. Make the visible/invisible distinction clear.

---

# Tracing: Multi-Tool Exchange

"Add a main block to hello.py" (requires reading first):

```
[0] role: user       "Add a main block to hello.py"
[1] role: assistant  [tool_use: read_file(filename="hello.py")]
[2] role: user       [tool_result: "def hello():\n    print('Hello!')"]
[3] role: assistant  [tool_use: edit_file(path="hello.py", ...)]
[4] role: user       [tool_result: "Edited hello.py"]
[5] role: assistant  [text: "Done. Added a main block."]
```

Inner loop ran **three times**: read → edit → reply.

The model on iteration 3 can see the file contents (from `[2]`) and the edit confirmation (from `[4]`).

???

60 seconds. The read-before-edit pattern from 5.3, now visible in the messages array.

---

# Messages Are the Memory

.split-left[
Each API call sends the **complete** messages array. The model has no persistent memory; it reads the full array fresh every time.

Every tool result stays in the array for every subsequent API call:

- A `read_file` returning 2,000 tokens → those tokens are in *every future call*
- Context grows monotonically
- The cost per API call increases with every iteration

.warning[This is the context growth problem from Module 4. Lecture 6.4 instruments it; Section 4 solves it.]
]

.split-right[
.center[
] ]
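
The growth can be made concrete with a little arithmetic. The sketch below uses made-up numbers (a 1,000-token fixed overhead, a 50-token first message, and 2,000 tokens added per iteration, echoing the `read_file` example); `input_tokens_per_call` is a hypothetical helper, not part of any SDK.

```python
def input_tokens_per_call(fixed_overhead, first_message, added_per_iteration, n_calls):
    """Input size of each API call when nothing is ever removed from messages."""
    sizes = []
    messages_tokens = first_message
    for _ in range(n_calls):
        # Every call resends the fixed overhead plus the whole messages array.
        sizes.append(fixed_overhead + messages_tokens)
        # The tool_use request and its tool_result are appended before the next call.
        messages_tokens += added_per_iteration
    return sizes


sizes = input_tokens_per_call(
    fixed_overhead=1000, first_message=50, added_per_iteration=2000, n_calls=5
)
print(sizes)       # [1050, 3050, 5050, 7050, 9050]
print(sum(sizes))  # 25250
```

Under these assumptions each call grows linearly, so the total input tokens across the exchange grow roughly quadratically with the number of iterations.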
???

90 seconds. Connect to Module 4. This is the bridge: the loop creates the problem, context management solves it.

Image prompt for `context-growth-sketch.png`: "A simple bar chart showing 7 vertical bars, labeled 'API Call 1' through 'API Call 7'. Each bar is taller than the previous one, showing monotonic growth. The bars are color-coded with two segments: a small constant blue segment at the bottom (labeled 'System prompt + schemas') and a growing teal segment on top (labeled 'Messages array'). The y-axis is labeled 'Input tokens'. Clean, minimal style, no grid lines, sans-serif font."

---

# Parallel Tool Calls

A single API response can contain multiple `tool_use` blocks:

```
[1] role: assistant  [tool_use: read_file("utils.py"),
                      tool_use: read_file("tests.py")]
[2] role: user       [tool_result id=tu_004: "...",
                      tool_result id=tu_005: "..."]
```

The `for block in response.content` loop handles this naturally: it collects every `tool_use` block, not just the first. Both results re-enter as a single user turn.

The `tool_use_id` ties each result to its request.

???

60 seconds. Worth mentioning even though it's uncommon with three tools. Becomes common once the agent has more capabilities.

---

# Key Takeaways

1. **Two loops**: outer manages conversation, inner manages tool execution
2. **`stop_reason`**: the only decision point in the inner loop
3. **Messages array**: shared state that grows monotonically; the model reads it fresh each call
4. **Tool results as `role: "user"`**: external information re-enters from the user side
5. **The loop is trivially simple**: the complexity lives elsewhere