How LLMs Actually Work

Module 2, Lecture 2.1 | Introduction to Agentic Systems

This lecture builds the mental model every agent developer needs: how LLMs process input, generate output, and why certain design decisions matter. It covers neural networks and the transformer architecture at a high level, explains next-token prediction as the single fundamental operation of all LLMs, introduces attention as the mechanism that lets models focus on relevant parts of their input, and frames the context window as the agent developer's primary design space. The core takeaway: LLMs are stateless functions, and the agent engineer's job is to assemble the right context on every call.
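The "stateless function" framing above can be sketched in a few lines. This is a toy illustration, not a real API: `toy_llm` and `run_turn` are hypothetical names, and the model is a deterministic stand-in. What matters is the shape of the loop — the caller re-sends the entire history on every call, because the model itself remembers nothing between calls.

```python
def toy_llm(context: list[dict]) -> str:
    """Stateless stand-in for an LLM call: the output depends only on
    the context passed in, never on any prior call."""
    return f"reply to {len(context)} messages"

def run_turn(history: list[dict], user_msg: str) -> list[dict]:
    """One agent turn: append the user input, send the FULL history to
    the model, append its reply. Returns the new history."""
    history = history + [{"role": "user", "content": user_msg}]
    reply = toy_llm(history)  # entire context reassembled every call
    return history + [{"role": "assistant", "content": reply}]

history: list[dict] = []
history = run_turn(history, "hello")
history = run_turn(history, "and again")

# Statelessness: the same context always produces the same output.
assert toy_llm(history[:1]) == toy_llm(history[:1])
print(len(history))  # 4 messages: two user turns, two replies
```

The design consequence is the lecture's core takeaway: since nothing persists inside the model, everything the agent "knows" on a given call is whatever the developer chose to put into `history`.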

Additional Resources