Agent Engineering

Design, build, and deploy AI-powered software agents

This course covers the design, implementation, and deployment of AI-powered software agents. Starting from raw API calls, you'll progress through prompt engineering, context management, RAG, memory systems, and multi-agent architectures — building each component from scratch before reaching for frameworks.

This course is completely free. Each lesson (lecture) contains video, slides, a written lecture narrative (which, taken together, begin to approximate a textbook), and supplemental resources. There are also labs to work through. It's an adaptation of a course I teach at Ramapo College.

If you are looking for more help, whether for yourself or your team, please feel free to contact me at [email protected]

Section 1: Introduction to Agentic Systems

Module 1: Course Overview and The Agent Paradigm

Lecture 1.1: What is an Agent?
Definition: Software that perceives, reasons, and acts autonomously
Lecture 1.2: Human-Agent Engineering
"Human-agent engineering, not vibe coding"
Lecture 1.3: Course Philosophy and Roadmap
Core principle: Build it yourself first, then reach for frameworks

Module 2: Foundations - How LLMs Work

Lecture 2.1: How LLMs Actually Work
Neural networks and transformers (5-min flyover, refer to Course Resources for depth)
Lecture 2.2: How LLMs Are Trained
Three-stage pipeline: pre-training, instruction tuning, RLHF
Lecture 2.3: From Language Models to Agents
What LLMs can't do: no current info, no world interaction, unreliable math, no self-verification, no memory

Lab 1: Environment Setup and First API Calls

Section 2: Working with LLMs in Practice

Module 3: The LLM API and Generation

Lecture 3.1: Working with the API
Anatomy of an API call: messages, model, parameters, response
Lecture 3.2: The Model Landscape
Providers, brands, and models: the three-layer hierarchy (Anthropic → Claude → Sonnet)
Lecture 3.3: Controlling Generation
Temperature: controlling randomness (0.0 = deterministic, 1.0 = creative)
Lecture 3.4: In-Context Learning and the Limits of Prompting
In-context learning: the model "learns" from examples in the prompt without weight updates
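
For concreteness, here is a minimal sketch of the API call anatomy from Lecture 3.1 using the Anthropic Python SDK, with `temperature` standing in for the generation controls of Lecture 3.3. The model ID is a placeholder; substitute whatever model is current.

```python
# Minimal Messages API call: model, parameters, system prompt, messages, response.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model ID
    max_tokens=512,
    temperature=0.0,                   # low temperature = near-deterministic output
    system="You are a concise coding assistant.",
    messages=[{"role": "user", "content": "Explain what a Python decorator is."}],
)

print(response.content[0].text)  # the assistant's reply
print(response.usage)            # input/output token counts
```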

Module 4: Context Management in Practice

Lecture 4.1: Measuring and Managing Context
Token counting in practice: tracking usage per API call
Lecture 4.2: Context Management Strategies
Strategy 1: Sliding window — keep last N messages, drop oldest
Lecture 4.3: Token-Efficient Tool Design
Every tool result enters the context — design tools with this in mind
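
A minimal sketch of the sliding-window strategy from Lecture 4.2 and the rough per-call token tracking from Lecture 4.1, assuming each message's `content` is a plain string:

```python
def sliding_window(messages, max_messages=20):
    """Keep only the most recent messages; the oldest are dropped."""
    return messages[-max_messages:]

def rough_token_count(messages):
    """Crude estimate (~4 characters per token) for tracking context growth per call."""
    chars = sum(len(m["content"]) for m in messages)
    return chars // 4
```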

Lab 2: API Exploration and Context Experiments

Section 3: Prompt and Context Engineering

Module 5: Prompt Engineering for Agents

Lecture 5.1: Principles and Anti-Patterns
Clarity and specificity: say exactly what you want — vague prompts produce vague results
Lecture 5.2: Prompt Structure and Techniques
XML tags for clear delineation, e.g. `<instructions>`, `<context>`, `<examples>`
Lecture 5.3: Structured Outputs for Tool Calling
Why agents need structured output: parsing tool calls from free-form text is fragile
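
A minimal sketch of why structured output matters (Lecture 5.3), assuming the model has been instructed to reply with a JSON object of the form `{"tool": ..., "arguments": {...}}` (a convention chosen here for illustration):

```python
import json

def parse_tool_call(text):
    """Parse a JSON tool call; fail loudly rather than guessing from free-form text."""
    try:
        call = json.loads(text)
        return call["tool"], call.get("arguments", {})
    except (json.JSONDecodeError, KeyError) as err:
        raise ValueError(f"Model did not return a valid tool call: {err}")
```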

Module 6: Building the Agent System Prompt

Lecture 6.1: System Prompt Architecture
Review: what system prompts do (identity, tools, behavior, constraints — from Lecture 2.3)
Lecture 6.2: Building Our Coding Agent System Prompt
Defining the agent's identity and role
Lecture 6.3: Testing and Iterating on Prompts
Prompts are code: version them, test them, review them
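
A sketch of the structure Lecture 6.1 describes (identity, tools, behavior, constraints), using the XML-tag style from Module 5; the wording is illustrative, not the course's actual prompt:

```python
SYSTEM_PROMPT = """You are a coding agent that works on a local Python project.

<tools>
{tool_descriptions}
</tools>

<behavior>
Work step by step. Use at most one tool per turn, then wait for its result.
</behavior>

<constraints>
Never modify files outside the project directory.
</constraints>"""

def build_system_prompt(tool_descriptions: str) -> str:
    """Prompts are code: keep the template in version control and fill it at runtime."""
    return SYSTEM_PROMPT.format(tool_descriptions=tool_descriptions)
```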

Lab 3: Prompt Engineering Workshop

Section 4: Building a Coding Agent from Scratch

Module 7: Implementing the Agent Loop and Tools

Lecture 7.1: From Concept to Code
Quick recap: the agent loop (from Lecture 2.3) and the system prompt (from Module 6)
Lecture 7.2: Implementing the Three Tools
`read_file(filename)` → returns file contents with error handling
Lecture 7.3: Tool Registry and Response Parsing
The tool registry pattern: name → function mapping, auto-generated descriptions
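
A minimal sketch of the `read_file` tool and the registry pattern from Lectures 7.2 and 7.3; returning an error string (rather than raising) lets the model see the failure and react to it:

```python
def read_file(filename: str) -> str:
    """Return a file's contents, or an error string the model can react to."""
    try:
        with open(filename, "r", encoding="utf-8") as f:
            return f.read()
    except OSError as err:
        return f"ERROR: could not read {filename}: {err}"

# name -> function mapping; tool descriptions for the system prompt can be generated from docstrings
TOOL_REGISTRY = {"read_file": read_file}

def dispatch(tool_name: str, arguments: dict) -> str:
    """Look up the requested tool and execute it with the parsed arguments."""
    if tool_name not in TOOL_REGISTRY:
        return f"ERROR: unknown tool {tool_name}"
    return TOOL_REGISTRY[tool_name](**arguments)
```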

Module 8: Running and Extending the Agent

Lecture 8.1: Live Implementation Walkthrough
Live coding: imports, client setup, tool implementations
Lecture 8.2: Multi-Step Tasks and Emergent Behavior
Demo: "Read hello.py and add a multiply function" — requires read, then edit
Lecture 8.3: From 200 Lines to Production
What our agent can't do yet — and what production agents add
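
A minimal sketch of the loop walked through in Lecture 8.1, reusing the `parse_tool_call` and `dispatch` helpers sketched in earlier modules; the model ID is again a placeholder:

```python
def agent_loop(task: str, client, system_prompt: str, max_steps: int = 10) -> str:
    """Call the model, execute any requested tool, feed the result back, repeat."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",  # placeholder model ID
            max_tokens=1024,
            system=system_prompt,
            messages=messages,
        )
        text = response.content[0].text
        messages.append({"role": "assistant", "content": text})
        try:
            tool_name, arguments = parse_tool_call(text)
        except ValueError:
            return text  # no tool call: treat the reply as the final answer
        result = dispatch(tool_name, arguments)
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Stopped after reaching max_steps."
```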

Lab 4: Build Your Coding Agent (CENTRAL LAB)

Section 5: Context Engineering Deep Dive

Module 9: Implementing Context Management

Lecture 9.1: Implementing Sliding Window and Selective Preservation
Taking the strategies from Module 4 and writing real code
Lecture 9.2: Implementing Compaction
The compaction prompt: what to tell the LLM to preserve vs. discard
Lecture 9.3: Progressive Disclosure
Don't load everything upfront — let agents discover context through exploration
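
A minimal sketch of compaction (Lecture 9.2); `call_llm` stands in for whatever completion helper your agent already has, and the prompt spells out what to preserve versus discard:

```python
COMPACTION_PROMPT = (
    "Summarize the conversation so far. Preserve the user's goal, decisions made, "
    "and any file names or code that was written or edited. Discard greetings, "
    "dead ends, and raw tool output that is no longer needed."
)

def compact(messages, call_llm, keep_recent=6):
    """Replace the older part of the history with an LLM-written summary."""
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = call_llm([{"role": "user", "content": COMPACTION_PROMPT + "\n\n" + str(old)}])
    return [{"role": "user", "content": f"Summary of earlier conversation: {summary}"}] + recent
```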

Module 10: Advanced Context Engineering

Lecture 10.1: Tool Result Management
Tool results are the biggest context consumers in agent loops
Lecture 10.2: Context-Aware System Prompts
Static system prompts waste tokens on instructions that aren't relevant right now
Lecture 10.3: Putting It All Together
Combining strategies: sliding window + compaction + progressive disclosure + tool result management
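
A minimal sketch of tool result management (Lecture 10.1): cap how much of a result enters the context, keeping the head and the tail, where errors usually appear:

```python
def truncate_result(result: str, max_chars: int = 4000) -> str:
    """Keep the beginning and end of an oversized tool result, drop the middle."""
    if len(result) <= max_chars:
        return result
    half = max_chars // 2
    return result[:half] + "\n...[truncated]...\n" + result[-half:]
```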

Lab 5: Context Optimization for Your Coding Agent

Section 6: RAG - Building from Scratch

Module 11: RAG Fundamentals - DIY Approach

Lecture 11.1: RAG as Context Engineering
Problem: You can't fit all knowledge in the context window
Lecture 11.2: Document Chunking - DIY
Why chunk? Large documents don't fit; retrieval needs granularity
Lecture 11.3: Embeddings and Similarity - DIY
What is an embedding? Vector representation of meaning
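
A minimal sketch of Lectures 11.2 and 11.3: fixed-size chunking with overlap, and cosine similarity between embedding vectors (the vectors themselves would come from an embedding API or a local model):

```python
import math

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks with a little overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of meaning, approximated as the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```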

Module 12: Building a Complete RAG System

Lecture 12.1: The Complete RAG Pipeline
Ingestion phase: load → chunk → embed → store
Lecture 12.2: Integrating RAG with Your Agent
Creating a `search_docs(query)` tool
Lecture 12.3: Now Introduce Vector Databases
You've built it yourself; now understand what libraries add
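
A minimal sketch of the pipeline from Lectures 12.1 and 12.2, reusing the `chunk` and `cosine_similarity` helpers from Module 11; `embed` is a placeholder for your embedding call, and the "vector database" is just a Python list:

```python
STORE: list[tuple[list[float], str]] = []  # (embedding, chunk text) pairs

def ingest(document: str, embed) -> None:
    """Ingestion phase: load -> chunk -> embed -> store."""
    for piece in chunk(document):
        STORE.append((embed(piece), piece))

def search_docs(query: str, embed, k: int = 3) -> str:
    """Retrieval phase, exposed to the agent as a `search_docs(query)` tool."""
    query_vec = embed(query)
    ranked = sorted(STORE, key=lambda item: cosine_similarity(query_vec, item[0]), reverse=True)
    return "\n---\n".join(text for _, text in ranked[:k])
```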

Lab 6: Build RAG from Scratch, Then Use a Library

Section 7: Memory Systems - DIY First

Module 13: Conversation Memory - Building from Scratch

Lecture 13.1: The Memory Problem
Agents have no memory between sessions (stateless)
Lecture 13.2: Conversation Summarization - DIY
Strategy: Periodically summarize and compress
Lecture 13.3: Structured Note-Taking
Pattern: Agent maintains a NOTES.md or TODO.md file
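
A minimal sketch of the note-taking pattern from Lecture 13.3, assuming a NOTES.md file in the agent's working directory; exposed as tools, these two functions give the agent memory that survives the session:

```python
from pathlib import Path

NOTES_PATH = Path("NOTES.md")

def append_note(note: str) -> str:
    """Persist a single note the agent wants to remember."""
    with NOTES_PATH.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")
    return "note saved"

def load_notes() -> str:
    """Read the notes back at the start of the next session."""
    return NOTES_PATH.read_text(encoding="utf-8") if NOTES_PATH.exists() else ""
```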

Module 14: Long-Term Memory and Memory Frameworks

Lecture 14.1: Episodic Memory - DIY
Episodic: what happened, when, in what context
Lecture 14.2: Semantic Memory and Retrieval
Semantic: facts about the world (vs. personal experiences)
Lecture 14.3: Memory Frameworks - Now Introduce mem0
You've built it yourself; now understand what mem0 adds

Lab 7: Add Memory to Your Coding Agent

Section 8: Agent Skills Architecture

Module 15: Understanding Agent Skills

Lecture 15.1: The Problem Skills Solve
Agents need domain expertise, but context is limited
Lecture 15.2: Progressive Disclosure in Practice
Level 1: Metadata (~50 tokens) - name + description at startup
Lecture 15.3: Skills with Executable Code
Some tasks are better done by code than by the LLM
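
A minimal sketch of progressive disclosure for skills (Lecture 15.2), assuming the `/skills/{skill-name}/SKILL.md` layout introduced in Module 16: only a name and a one-line description are loaded at startup, and the full skill body is read only when the agent actually invokes it:

```python
from pathlib import Path

SKILLS_DIR = Path("skills")

def discover_skills() -> dict[str, str]:
    """Scan the skills directory and return cheap metadata: {skill name: first line}."""
    catalog = {}
    for skill_file in SKILLS_DIR.glob("*/SKILL.md"):
        lines = skill_file.read_text(encoding="utf-8").splitlines()
        catalog[skill_file.parent.name] = lines[0] if lines else ""
    return catalog

def load_skill(name: str) -> str:
    """Load the full skill instructions only when the agent needs them."""
    return (SKILLS_DIR / name / "SKILL.md").read_text(encoding="utf-8")
```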

Module 16: Building Your Own Skills System

Lecture 16.1: Skills System Architecture
Skill discovery: scan directories at startup
Lecture 16.2: Implementation Walkthrough
Skills directory structure: `/skills/{skill-name}/SKILL.md`
Lecture 16.3: Creating Skills for Your Agent
Example skills for a coding agent:

Lab 8: Implement Skills for Your Coding Agent

Section 9: Building Your Own Agent Framework

Module 17: Framework Design Patterns

Lecture 17.1: What We've Built - Taking Stock
Section 4: Basic agent loop with tools
Lecture 17.2: Framework Architecture
Core abstractions: Agent, Tool, Memory, Skill
Lecture 17.3: Design Decisions
State management: conversation, working memory, long-term

Module 18: Framework Implementation

Lecture 18.1: Core Agent Class
Constructor: model, tools, memory, skills, config
Lecture 18.2: Tool and Memory Abstractions
Base Tool class: name, description, schema, execute()
Lecture 18.3: Using Your Framework
Instantiating an agent with configuration
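
A minimal sketch of the abstractions named in Lectures 18.1 and 18.2; the field names follow the lecture outline, but everything else here is illustrative:

```python
from dataclasses import dataclass, field

class Tool:
    """Base class: name, description, schema, and an execute() to override."""
    name: str = ""
    description: str = ""
    schema: dict = {}

    def execute(self, **kwargs) -> str:
        raise NotImplementedError

@dataclass
class Agent:
    model: str
    tools: list[Tool] = field(default_factory=list)
    memory: object | None = None
    skills: dict = field(default_factory=dict)
    config: dict = field(default_factory=dict)

    def run(self, task: str) -> str:
        ...  # the agent loop from Section 4, generalized over tools/memory/skills

# Usage (Lecture 18.3): agent = Agent(model="claude-sonnet-4-20250514", config={"max_steps": 10})
```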

Lab 9: Build Your Agent Framework

Section 10: Existing Frameworks - Critical Evaluation

Module 19: LangChain Deep Dive

Lecture 19.1: LangChain Overview
History and evolution of LangChain
Lecture 19.2: Mapping to Your Framework
Your Agent ↔ LangChain Agent/AgentExecutor
Lecture 19.3: What LangChain Adds
Adds: pre-built integrations, LangSmith observability, community

Module 20: CrewAI and Alternative Frameworks

Lecture 20.1: CrewAI - Role-Based Multi-Agent
Philosophy: agents as team members with roles
Lecture 20.2: LangGraph and Other Options
LangGraph: agents as graphs with state
Lecture 20.3: Framework Selection Criteria
Task complexity: simple → your framework, complex → LangGraph

Lab 10: Port Your Agent to LangChain

Section 11: Multi-Agent Patterns and Autonomy

Module 21: Multi-Agent Architectures

Lecture 21.1: When Multiple Agents Outperform One
The problem: single agents hit context and complexity limits
Lecture 21.2: Multi-Agent Patterns
Lead agent + specialized sub-agents pattern
Lecture 21.3: Communication and Coordination
Message passing between agents
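
A minimal sketch of message passing in the lead-agent / sub-agent pattern (Lectures 21.2 and 21.3), assuming an `Agent` class with a `run()` method like the one from Section 9; each exchange is a plain dict so it can be logged and inspected:

```python
def delegate(lead_name: str, sub_agent, sub_name: str, subtask: str) -> dict:
    """Lead agent hands a subtask to a specialist and records both sides of the exchange."""
    request = {"from": lead_name, "to": sub_name, "content": subtask}
    reply = {"from": sub_name, "to": lead_name, "content": sub_agent.run(subtask)}
    return {"request": request, "response": reply}
```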

Module 22: Autonomy Management and Human-Agent Collaboration

Lecture 22.1: The Autonomy Spectrum
Levels: copilot → assistant → supervised autonomous → fully autonomous
Lecture 22.2: Human-Agent Collaboration Patterns
"Human-agent engineering"---developer as manager of AI "interns"
Lecture 22.3: Designing for Trust
Transparency: agent explains its reasoning

Lab 11: Build a Multi-Agent System

Section 12: Guardrails and Safety

Module 23: Building Guardrails from Scratch

Lecture 23.1: Why Guardrails Matter
Agents take real actions with real consequences
Lecture 23.2: Input Guardrails - DIY
Prompt injection detection: pattern matching, classification
Lecture 23.3: Output and Action Guardrails - DIY
Output filtering: detect harmful, inappropriate content
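
A minimal sketch of a pattern-matching input guardrail (Lecture 23.2); lists like this are easy to evade, which is why the lecture pairs them with classification, but they show the shape of the check:

```python
import re

INJECTION_PATTERNS = [
    r"ignore (all |your )?previous instructions",
    r"disregard the system prompt",
    r"reveal your system prompt",
]

def flag_injection(user_input: str) -> bool:
    """Return True if the input matches a known prompt-injection pattern."""
    return any(re.search(p, user_input, re.IGNORECASE) for p in INJECTION_PATTERNS)
```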

Module 24: Ethics and Adversarial Testing

Lecture 24.1: Ethical Frameworks for Agents
Privacy: what data should agents access and store?
Lecture 24.2: Adversarial Testing
Red teaming: deliberately trying to break your agent
Lecture 24.3: Safety Checklist and Existing Solutions
Production safety checklist:

Lab 12: Implement Guardrails for Your Agent

Section 13: Practical Agent Applications

Module 25: Chatbot and Assistant Patterns

Lecture 25.1: Conversational Agent Design
Beyond Q&A: maintaining conversation state and flow
Lecture 25.2: Domain-Specific Assistants
Tutoring agents: Socratic method, adaptive difficulty
Lecture 25.3: Retrieval-Augmented Chatbots
Combining conversation with knowledge retrieval

Module 26: Coding and Research Assistants

Lecture 26.1: Code Review Agents
Reading and understanding code (using your read_file tool)
Lecture 26.2: Research Agents
Multi-step research: search → read → summarize → synthesize
Lecture 26.3: Pipeline Agents
Combining patterns: research + write + review

Lab 13: Build a Practical Application

Section 14: Advanced Topics - MCP and Multi-Modal

Module 27: Model Context Protocol (MCP)

Lecture 27.1: MCP as Tool Standard
What is MCP? Standardized tool integration protocol
Lecture 27.2: MCP vs. Your Tool System
Your tools: custom, lean, fully controlled
Lecture 27.3: Building with MCP
Connecting to existing MCP servers

Module 28: Multi-Modal Agents

Lecture 28.1: Vision-Language Agents
Models with vision: GPT-4V, Claude 3 with vision
Lecture 28.2: Computer Use and Beyond
Anthropic's computer use capability
Lecture 28.3: Research Frontiers
World models and planning

Lab 14: Final Project Work Session

Section 15: Final Projects and Course Conclusion

Module 29: Final Project Presentations (Part 1)

Module 30: Final Project Presentations (Part 2) and Course Conclusion