How LLMs Are Trained

Module 2, Lecture 2.2 | Introduction to Agentic Systems

This lecture explains the three-stage training pipeline behind commercial LLMs — pre-training, instruction tuning, and RLHF — and shows how each stage produces specific, predictable behaviors. Pre-training on internet text explains dramatic prose and over-engineered code. Instruction tuning introduces verbosity and agreeableness. RLHF creates sycophancy, hedging, and confident errors. The lecture introduces a "training data detective" framework: when an LLM does something surprising, ask what in the training data would produce that behavior. This is a practical engineering skill for predicting LLM strengths and weaknesses, designing effective context, and debugging agent behavior.
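The three stages can be pictured as a sequence of transformations, each layering new tendencies onto the model produced by the previous one. The sketch below is purely illustrative — the function names and the dictionary "model" are placeholders for the lecture's narrative, not real training code or any actual API:

```python
# Toy sketch of the three-stage pipeline described above.
# Each "stage" simply records the behaviors the lecture attributes to it.

def pretrain(model):
    # Stage 1: next-token prediction on internet text.
    model["behaviors"] += ["dramatic prose", "over-engineered code"]
    return model

def instruction_tune(model):
    # Stage 2: supervised fine-tuning on instruction/response pairs.
    model["behaviors"] += ["verbosity", "agreeableness"]
    return model

def rlhf(model):
    # Stage 3: reinforcement learning from human feedback.
    model["behaviors"] += ["sycophancy", "hedging", "confident errors"]
    return model

def train():
    # The stages run in order; later stages build on earlier ones.
    return rlhf(instruction_tune(pretrain({"behaviors": []})))

llm = train()
print(llm["behaviors"])
```

The point of the composition is the lecture's thesis in miniature: a deployed model's quirks are the cumulative residue of all three stages, so debugging behavior means asking which stage's data would have produced it.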

Read the full lecture narrative

Additional Resources