Agent Engineering

This course covers the design, implementation, and deployment of AI-powered software agents. Starting from raw API calls, you'll progress through prompt engineering, context management, RAG, memory systems, and multi-agent architectures — building each component from scratch before reaching for frameworks.

This course is completely free. Each lesson (lecture) contains video, slides, a lecture narrative (which altogether starts to approximate a textbook), and supplemental resources. There are labs to work on as well. It's an adaptation of a course I teach at Ramapo College.

If you are looking for more help, either for yourself or your team - please feel free to contact me at [email protected]

Module 1: Course Overview and The Agent Paradigm

Lecture 1.1: What is an Agent?

Definition: Software that perceives, reasons, and acts autonomously

Lecture 1.2: Human-Agent Engineering

"Human-agent engineering, not vibe coding"

Lecture 1.3: Course Philosophy and Roadmap

Core principle: Build it yourself first, then reach for frameworks

Lab 1: Environment Setup and First API Calls

Module 2: Foundations - How LLMs Work

Lecture 2.1: How LLMs Actually Work

Neural networks and transformers (5-min flyover, refer to Course Resources for depth)

Lecture 2.2: How LLMs Are Trained

Three-stage pipeline: pre-training, instruction tuning, RLHF

Lecture 2.3: From Language Models to Agents

What LLMs can't do: no current info, no world interaction, unreliable math, no self-verification, no memory

Lab 2: API Exploration and Context Experiments

Module 3: The LLM API and Generation

Lecture 3.1: Working with the API

Anatomy of an API call: messages, model, parameters, response

Lecture 3.2: The Model Landscape

Providers, brands, and models: the three-layer hierarchy (Anthropic → Claude → Sonnet)

Lecture 3.3: Controlling Generation

Temperature: controlling randomness (0.0 = deterministic, 1.0 = creative)

Lecture 3.4: In-Context Learning and the Limits of Prompting

In-context learning: the model "learns" from examples in the prompt without weight updates

Module 4: Context Management in Practice

Lecture 4.1: Measuring and Managing Context

Token counting in practice: tracking usage per API call

Lecture 4.2: Context Management Strategies

Strategy 1: Sliding window — keep last N messages, drop oldest

Lecture 4.3: Token-Efficient Tool Design

Every tool result enters the context — design tools with this in mind

Module 5: Agent Prompting

Lecture 5.1: Prompting Principles and Techniques

Clarity and specificity: say exactly what you want — vague prompts produce vague results

Lecture 5.2: System Prompt Architecture

What system prompts do: identity, tools, behavior, constraints (from Lecture 2.3) — going deeper

Lecture 5.3: Building the Coding Agent System Prompt

Defining the agent's identity and role

Lab 3: The Booking Agent

Module 6: Building the Coding Agent

Lecture 6.1: The Agent Loop

Deep dive into the two-loop architecture introduced in Lecture 5.3

Lecture 6.2: Implementing Tools and Safety

Replacing stubs with real filesystem tools: list_files, read_file, edit_file

Lecture 6.3: Tool Registry and Schema Generation

The if/elif dispatch problem: doesn't scale, duplicates information

Lecture 6.4: Multi-Step Behavior, Instrumentation, and Streaming

Demo: multi-step tasks the agent chains autonomously (read → edit → verify)

Lab 4: Extend Your Coding Agent

Module 7: Implementing Context Management

Lecture 7.1: Implementing Sliding Window

Taking the sliding window strategy from Module 4 and writing real code

Lecture 7.2: Implementing Selective Preservation

When sliding window is too lossy: keep messages that matter regardless of age

Lecture 7.3: Implementing Compaction

Detecting the threshold: when to trigger compaction (e.g., 80% of context budget)

Lab 5: Context Optimization for Your Coding Agent

Module 8: Retrieval-Augmented Generation (RAG)

Lecture 8.1: RAG as Context Engineering

The problem: context windows are finite, knowledge isn't

Lecture 8.2: Embeddings, Similarity, and Vector Storage

The most common RAG implementation, treated as one specific approach

Lecture 8.3: Retrieval Beyond Vectors

RAG = Retrieval-Augmented Generation; vector search is one retrieval mechanism among many

Lab 6: Build a Web Search Tool

Module 9: Memory Systems

Lecture 9.1: The Memory Problem and Conversation Memory

Agents have no memory between sessions (stateless)

Lecture 9.2: Episodic Memory - DIY

Episodic: what happened, when, in what context

Lecture 9.3: Memory Consolidation — AutoDream and Durable Memory

The accumulation problem: memory quality degrades over time without curation (contradictions, stale references, duplicate entries, relative timestamps that lose meaning)

Lab 7: Add Memory to Your Coding Agent

Module 10: Understanding Agent Skills

Lecture 10.1: The Problem Skills Solve

Agents need domain expertise, but context is limited

Lecture 10.2: Progressive Disclosure in Practice

Level 1: Metadata (~50 tokens) - name + description at startup

Module 11: Building Your Own Skills System

Lecture 11.1: Skills System Implementation

Skill discovery: scan directories at startup

Lecture 11.2: Creating Skills for Your Agent

Example skills for a coding agent:

Lab 8: Implement Skills for Your Coding Agent

Module 12: Framework Design and LangChain

Lecture 12.1: Framework Architecture

Taking stock: agent loop, context management, RAG, memory, skills — what we've built

Lecture 12.2: Framework Design Decisions

State management: conversation, working memory, long-term

Lecture 12.3: LangChain: Overview, Mapping, and Trade-offs

History and evolution of LangChain

Lab 9: Build Your Agent in LangChain

Module 13: Multi-Agent Architectures and Programmatic Tool Calling

Lecture 13.1: Multi-Agent Architectures

The problem: single agents hit context and complexity limits

Lecture 13.2: Programmatic Tool Calling

The problem: tool results flood the context window, especially with bulk data

Lab 10: Build a Multi-Agent System

Module 14: Guardrails and Safety

Lecture 14.1: Why Guardrails Matter

Agents take real actions with real consequences

Lecture 14.2: Input Guardrails - DIY

Prompt injection detection: pattern matching, classification

Lecture 14.3: Output, Action Guardrails, and Safety Checklist

Output filtering: detect harmful, inappropriate content

Lab 11: Implement Guardrails for Your Agent

Module 15: Model Context Protocol (MCP)

Lecture 15.1: MCP: The Tool Standard and How It Compares

What is MCP? Standardized tool integration protocol

Lab 12: Build a Practical Application