Tool Registry and Schema Generation

The agent built in Lectures 6.1 and 6.2 works correctly with three tools and an if/elif dispatch. The control flow is sound, the tools are safe, and the error handling is informative. What it is not is maintainable. As soon as a real agent grows past three tools, the dispatch function and the hand-written API schemas become a source of duplication, drift, and silent bugs. This lecture is a detour from agent behavior into software engineering: how to organize tool code so that adding a new tool means writing a single Python function and nothing else.

The fix is a tool registry — a dictionary mapping names to functions — combined with automatic schema generation that derives the API schema directly from the function's signature and docstring. Together these two changes eliminate every place where tool information was duplicated, replace the if/elif chain with a single dictionary lookup, and create a centralized dispatch point that becomes the natural home for validation, logging, and error wrapping.

The Scaling Problem

The dispatch function from Lecture 6.2 was a chain of if/elif branches:

def dispatch_tool(name, inputs):
    try:
        if name == "list_files":
            return list_files(**inputs)
        elif name == "read_file":
            return read_file(**inputs)
        elif name == "edit_file":
            return edit_file(**inputs)
        else:
            return f"Error: unknown tool: {name}"
    except ValueError as e:
        return str(e)
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"

This works for three tools. It will not work for thirty. There are three concrete problems:

  1. Information is duplicated. Every tool name appears in three places — the function definition, the hand-written API schema, and the if/elif chain. Adding a tool means editing all three, and forgetting one means the tool silently doesn't work.
  2. The schemas are maintained by hand. Each tool's name, description, and input_schema is a hand-written dict that drifts from the actual function signature whenever the function changes.
  3. There is no single source of truth. The function's docstring, the API schema's description, and the parameter names in the schema can all disagree with one another. When they do, which one is correct?

The deeper issue is that one part of the code is acting as documentation for another part of the code. Whenever this happens, you are guaranteeing that the two will eventually fall out of sync. The solution is to make the function itself the only place where tool information lives, and to derive everything else from it programmatically.

Python Decorators in Sixty Seconds

The mechanism that makes this clean is the Python decorator — the @ symbol that appears above a function definition. Many students in this course are new to Python, and decorators are foundational background that will reappear when we get to LangChain (@tool, @chain) and other agent frameworks. A working mental model now will save confusion later.

The starting point is a Python feature you may not have used in other languages: functions are first-class values. A function in Python is just an object, like a string or a list. You can store one in a variable, pass it as an argument to another function, return it from a function, and put it in a dictionary. None of this is special syntax — it follows directly from "everything is an object."

def greet(name):
    return f"Hello, {name}"

# A function is a value
say_hi = greet           # 'say_hi' now refers to the same function
say_hi("Ada")            # → "Hello, Ada"

# It can go into a dict
funcs = {"greet": greet}
funcs["greet"]("Ada")    # → "Hello, Ada"

A decorator is just a function that takes a function as its argument and returns a function. Nothing more. The decorator syntax:

@register_tool
def list_files(path: str):
    ...

is exactly equivalent to:

def list_files(path: str):
    ...

list_files = register_tool(list_files)

The decorator runs once, at import time, when Python is reading the module top-to-bottom. It receives the original function as its input, does something with it (in our case, stores it in a dictionary), and returns a function (in our case, the original, unchanged).

If you have used Flask or FastAPI, you have already seen this pattern. @app.route("/path") registers a URL handler the same way @register_tool registers a tool handler. The only thing the decorator pattern adds to plain Python is a tidier syntax for "do something with this function as soon as it's defined." If you wrote list_files = register_tool(list_files) after every tool definition, the program would behave identically — the decorator just keeps the registration declaration next to the function it applies to.

Decorators that take their own arguments (@register_tool(name="search")) are slightly more involved — they are functions that return a decorator — but we will not need that form here. The plain @register_tool is enough.
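
For reference only, the argument-taking form can be sketched as follows. This is an illustrative variant, not the decorator the lecture uses: the func=None pattern, the name keyword, and the search_codebase tool are all assumptions made for the example. The point is that one function can serve both as a plain decorator and as a function that returns a decorator.

```python
TOOL_REGISTRY = {}


def register_tool(func=None, *, name=None):
    """Register a tool, optionally under a custom name (illustrative variant).

    Supports both @register_tool and @register_tool(name="search").
    """
    def decorator(f):
        # Fall back to the function's own name when no custom name is given.
        TOOL_REGISTRY[name or f.__name__] = f
        return f

    if func is not None:
        # Used bare: Python passed the decorated function in directly.
        return decorator(func)
    # Used with arguments: return the real decorator for Python to apply.
    return decorator


@register_tool
def read_file(filename: str):
    return f"<contents of {filename}>"


@register_tool(name="search")
def search_codebase(query: str):  # hypothetical tool, for illustration
    return f"<matches for {query}>"


print(list(TOOL_REGISTRY))  # ['read_file', 'search']
```

If a custom name is ever used this way, anything that reports the tool's name to the model should use the registry key rather than func.__name__, so that the advertised name and the dispatch key agree.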

The Registry Pattern

With decorators in hand, the registry is just a dictionary and a two-line decorator function:

TOOL_REGISTRY = {}

def register_tool(func):
    """Register a function as a tool."""
    TOOL_REGISTRY[func.__name__] = func
    return func

Every Python function exposes a __name__ attribute that holds the string name of the function. The decorator uses this name as the dictionary key and stores the function itself as the value. The function is returned unchanged, so calling it works exactly as before — the only side effect is that it has been added to the registry.

Decorators run at import time, not at call time. By the time main() executes, every decorated tool has already been added to TOOL_REGISTRY:

TOOL_REGISTRY = {}                       # 1. Empty dict created

def register_tool(func):
    TOOL_REGISTRY[func.__name__] = func  # called once per @ below
    return func

@register_tool                           # 2. runs immediately
def list_files(path: str): ...           # registry: {'list_files': ...}

@register_tool                           # 3. runs immediately
def read_file(filename: str): ...        # registry: {..., 'read_file': ...}

@register_tool                           # 4. runs immediately
def edit_file(path: str, ...): ...       # registry: {..., 'edit_file': ...}

No explicit registration step is needed. As Python executes the module from top to bottom, each decorator runs as soon as its function is defined, and the dictionary fills in one entry at a time. By the time the agent makes its first API call, the registry already contains every tool.

The New Dispatch

With the registry in place, dispatch collapses to a single dictionary lookup:

def dispatch_tool(name, inputs):
    if name not in TOOL_REGISTRY:
        available = ", ".join(TOOL_REGISTRY.keys())
        return f"Error: unknown tool '{name}'. Available: {available}"
    try:
        return TOOL_REGISTRY[name](**inputs)
    except ValueError as e:
        return str(e)
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"

There is no if/elif. The registry is the dispatch. Adding a new tool requires no changes to this function. The lookup TOOL_REGISTRY[name] returns the function object, and (**inputs) calls it with the model's arguments. The behavior is identical to the previous dispatch — just without the duplication.

The unknown-tool error includes the list of available tools by name. This is deliberate: the error string is communication with the model, not just a log line. If the model hallucinates a tool name, listing the real ones gives it enough information to self-correct on the next iteration. Vague errors waste tool calls; specific errors enable retries.
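
A small runnable sketch shows the three outcomes the model can see. The tool bodies here are stubs invented for the example (they do not touch the disk), and open_file is a deliberately wrong name standing in for a hallucinated tool.

```python
TOOL_REGISTRY = {}


def register_tool(func):
    TOOL_REGISTRY[func.__name__] = func
    return func


@register_tool
def read_file(filename: str):
    # Stub body so the example runs anywhere; a real tool would read the file.
    if not filename:
        raise ValueError("Error: filename must not be empty")
    return f"<contents of {filename}>"


@register_tool
def list_files(path: str):
    return f"<files under {path}>"


def dispatch_tool(name, inputs):
    if name not in TOOL_REGISTRY:
        available = ", ".join(TOOL_REGISTRY.keys())
        return f"Error: unknown tool '{name}'. Available: {available}"
    try:
        return TOOL_REGISTRY[name](**inputs)
    except ValueError as e:
        return str(e)
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"


ok = dispatch_tool("read_file", {"filename": "notes.txt"})
unknown = dispatch_tool("open_file", {"filename": "notes.txt"})
empty = dispatch_tool("read_file", {"filename": ""})

print(ok)       # <contents of notes.txt>
print(unknown)  # Error: unknown tool 'open_file'. Available: read_file, list_files
print(empty)    # Error: filename must not be empty
```

In every case the caller gets a string back, which is what lets the agent loop hand the result straight to the model.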

Auto-Generating Tool Schemas

The Anthropic API requires every tool to be declared with a JSON object: a name, a description, and an input_schema describing the arguments and which ones are required.

{
  "name": "read_file",
  "description": "Read the contents of a file.",
  "input_schema": {
    "type": "object",
    "properties": {
      "filename": {
        "type": "string",
        "description": "Path to the file."
      }
    },
    "required": ["filename"]
  }
}

Every field of this schema corresponds to something the Python function already declares:

Schema field    Python source
name            func.__name__
description     first line of func.__doc__
properties      inspect.signature(func)
required        parameters without defaults

If we structure the function carefully — typed parameters, a Google-style docstring with an Args: section — we can derive the entire schema by reflection. The function becomes the single source of truth, and the schema is regenerated every time it is needed.

Docstrings as Two-Part Documentation

A docstring in Python is the string literal that appears as the first statement inside a function, class, or module body. By convention it is written with triple quotes ("""...""") so it can span multiple lines. Python stores it on the function itself as func.__doc__, where any other code can read it back at runtime.

def read_file(filename: str):
    """Read the contents of a file.

    Args:
        filename: Path to the file.
    """
    ...

print(read_file.__doc__)   # the full docstring as a string

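This is just a regular Python feature; nothing about it is specific to tool building. What makes it useful here is that we can structure the text so it carries two pieces of information that map directly to two parts of the schema: the summary line (the first line of the docstring) becomes the tool's description, and each entry in the Args: section becomes the description of the corresponding parameter.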

The format we use is Google-style docstrings — a widely adopted convention that distinguishes section headers (Args:, Returns:, Raises:) and uses name: description for each item. The same convention is used by Sphinx (with the napoleon extension), most large open-source Python codebases, and many editor tools. Students writing tools just follow it; the schema generator parses it. The docstring is doing double duty — it is normal Python documentation that any reader can understand, and it is the source of the API descriptions the LLM will see.

The parser we'll write handles the simple case. A production framework would use docstring_parser or a similar library to handle every format variation correctly. The point is the principle: structured comments are extractable. As long as you write your docstrings consistently, code can read them.

A Short Tour of the inspect Module

Python ships with a standard-library module called inspect whose job is to let code examine other code at runtime. This is what makes auto-generated schemas possible — without inspect, we would have no way to ask a function what its parameters are.

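The two pieces we use are inspect.signature(func), which returns a Signature object whose parameters mapping holds each parameter's name, annotation, and default, and inspect.Parameter.empty, the sentinel value that marks a missing annotation or a missing default.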

A short demonstration:

import inspect

def read_file(filename: str, encoding: str = "utf-8"):
    ...

sig = inspect.signature(read_file)

for name, param in sig.parameters.items():
    print(name, param.annotation, param.default)
# filename <class 'str'> <class 'inspect._empty'>
# encoding <class 'str'> utf-8

encoding has a default ("utf-8"), so it is optional. filename does not, so it is required. This is exactly the distinction the JSON Schema's required list cares about, which is why the generator checks whether param.default is inspect.Parameter.empty to decide whether a parameter goes into the required list.

Python type annotations like filename: str are also runtime-readable through param.annotation. They are not enforced at runtime, since Python itself does not check types, but inspect lets us read them and decide what to do with them. This is what makes python_type_to_json possible: we read the annotation as a Python type object and look it up in our mapping. With these pieces in hand, the schema generator is a single loop over the function's parameters:

import inspect

def generate_tool_schema(func):
    sig = inspect.signature(func)
    properties = {}
    required = []

    for param_name, param in sig.parameters.items():
        prop = {"type": python_type_to_json(param.annotation)}
        desc = extract_param_description(func, param_name)
        if desc:
            prop["description"] = desc
        properties[param_name] = prop

        if param.default is inspect.Parameter.empty:
            required.append(param_name)

    return {
        "name": func.__name__,
        "description": (func.__doc__ or "").strip().split("\n")[0],
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required
        }
    }

The loop visits each parameter, reads its type annotation, parses its description out of the docstring, and decides whether it is required based on whether it has a default value. The result is a fully formed schema dict with no manual maintenance.

A small helper maps Python types to JSON Schema types:

def python_type_to_json(python_type):
    type_map = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
    }
    return type_map.get(python_type, "string")

This is intentionally simple. JSON Schema supports arrays, nested objects, enums, and other complex types — production frameworks like pydantic handle all of them — but for the tools an agent typically needs, string, integer, number, and boolean cover the vast majority of cases. If a tool needs something more elaborate, the auto-generated schema can be overridden.

A second helper parses parameter descriptions out of the Args: block of the docstring. This is a lightweight, hand-written parser rather than a full docstring library — sufficient to demonstrate the principle that the function is the single source of truth.
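
A minimal sketch of that helper might look like the following. It assumes the docstring follows the Google-style layout described earlier, with one "name: description" entry per line; multi-line descriptions and other docstring formats are out of scope. The encoding parameter and the Returns: section in the demo function are added only to exercise the parser.

```python
def extract_param_description(func, param_name):
    """Return the description of param_name from a Google-style Args: block.

    Returns an empty string when the docstring or the entry is missing.
    """
    doc = func.__doc__ or ""
    in_args = False
    for raw_line in doc.splitlines():
        line = raw_line.strip()
        if line == "Args:":
            in_args = True
            continue
        if not in_args:
            continue
        # A blank line or another section header (Returns:, Raises:) ends Args.
        if not line or (line.endswith(":") and " " not in line):
            break
        name, sep, desc = line.partition(":")
        # Tolerate the "name (type): description" variant of the convention.
        name = name.split("(")[0].strip()
        if sep and name == param_name:
            return desc.strip()
    return ""


def read_file(filename: str, encoding: str = "utf-8"):
    """Read the contents of a file.

    Args:
        filename: Path to the file.
        encoding: Text encoding used to decode the bytes.

    Returns:
        The file contents as a string.
    """
    ...


print(extract_param_description(read_file, "filename"))  # Path to the file.
print(repr(extract_param_description(read_file, "missing")))  # ''
```

Returning an empty string for a missing entry fits the generator above, which only adds a description key when the helper returns something truthy.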

Putting It Together

A get_tool_schemas function generates the schemas for every registered tool, and the result is passed straight into the API call:

def get_tool_schemas():
    """Generate API-ready schemas for all registered tools."""
    return [generate_tool_schema(func) for func in TOOL_REGISTRY.values()]

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    system=SYSTEM_PROMPT,
    tools=get_tool_schemas(),
    messages=messages
)

Adding a new tool to the agent now means writing a single decorated function with a typed signature and a docstring. No JSON schema to author by hand, no dispatch to update, no registry entry to add. The before/after contrast is the payoff:

Before: adding a tool                      After: adding a tool
1. Write the function                      1. Write the function with @register_tool
2. Write the JSON schema by hand           2. Done
3. Add a branch to the if/elif dispatch
4. Hope all three agree

System Prompt vs. API Schema

A reasonable question at this point is whether the system prompt's tool descriptions could be auto-generated too. The answer is yes in principle but no in practice — they serve different purposes.

           System Prompt                   API Tool Schema
Purpose    Behavioral guidance             Structured invocation
Content    When to use, workflow rules     Name, args, types
Example    "Always read before editing"    {"name": "read_file", ...}

The API schema tells the model how to call a tool — what arguments it takes, what types they are, which are required. The system prompt tells the model when and why to use it — workflow rules, ordering constraints, behavioral guidance like "always read before editing" or "ask for confirmation before deleting." Those rules don't belong in the API schema.

The auto-generation we just built keeps the API schema in sync with the code automatically. The system prompt's tool guidance still has to be updated by hand when behavior changes. This separation is a design feature, not a limitation: keeping the two surfaces apart lets each one say what it is best at saying.

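There is also a practical reason to keep them separate: with Anthropic's prompt caching, the tools array and the system prompt are distinct segments of the cacheable prefix, and each can carry its own cache breakpoint. Tool definitions come first in that prefix, so keeping them stable across calls protects the cache for everything that follows, while the system prompt can be revised without invalidating the cached tools.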

Handling Malformed Tool Calls

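The input_schema steers the model toward well-formed arguments, and in practice type mismatches are rare. Unless the API is explicitly asked to enforce the schema strictly, though, it does not guarantee that the arguments conform, so the dispatch layer should not assume they do. Several failure modes regularly reach our code: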

Failure                    Example                                       Fix
Unknown tool name          Model hallucinates a tool                     List available tools in the error
Semantically wrong args    Empty filename, nonexistent path              Tool-level validation
Extra arguments            Model sends language="python" to read_file    Filter to expected params
Missing optional args      Model omits a parameter with a default        Python handles this naturally

The mental shift required here is that the LLM is essentially writing code that calls our functions, and our job as agent developers is to verify that the code is correct. When you, the programmer, call a function incorrectly, your editor underlines it, the type checker complains, or the program crashes with a stack trace. The LLM gets none of that feedback. It writes a tool call as a JSON blob, sends it through the API, and waits for a result. If we crash, the agent dies. If we return a vague error, the model has nothing to retry with. The dispatch function is where we catch all of this and turn it into something the model can act on.

The Defensive Dispatch

def dispatch_tool(name, inputs):
    if name not in TOOL_REGISTRY:
        available = ", ".join(TOOL_REGISTRY.keys())
        return f"Error: unknown tool '{name}'. Available: {available}"

    func = TOOL_REGISTRY[name]

    # Filter to expected parameters only
    sig = inspect.signature(func)
    valid_params = set(sig.parameters.keys())
    filtered = {k: v for k, v in inputs.items() if k in valid_params}

    try:
        return func(**filtered)
    except TypeError as e:
        return f"Error calling {name}: {e}"
    except ValueError as e:
        return str(e)
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"

Three design choices in this dispatch are worth highlighting:

  1. Filter unexpected arguments rather than crashing. Models occasionally include parameters that aren't part of the function's signature, such as a language field on read_file. Without the filter, func(**inputs) would raise TypeError on the unknown keyword argument. Inspecting the function's actual parameters and dropping the unknown ones is more robust than failing on a hallucinated extra.
  2. Return errors as strings. Every failure path produces a string the model can read. The agent loop is built around tools that return strings; an exception that escapes dispatch crashes the entire agent.
  3. Catch-all exception handler. The agent must never die from a tool error. Specific exceptions are caught for the cases where we have a useful message to give; the final except Exception ensures that nothing slips through and kills the loop.
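
The three choices can be seen at work in a short, self-contained run. The read_file body below is a stub invented for the example, and the two calls imitate a model that sends an extra argument and a model that omits a required one.

```python
import inspect

TOOL_REGISTRY = {}


def register_tool(func):
    TOOL_REGISTRY[func.__name__] = func
    return func


@register_tool
def read_file(filename: str, encoding: str = "utf-8"):
    # Stub body so the example runs anywhere; a real tool would read the file.
    return f"<{filename} decoded as {encoding}>"


def dispatch_tool(name, inputs):
    if name not in TOOL_REGISTRY:
        available = ", ".join(TOOL_REGISTRY.keys())
        return f"Error: unknown tool '{name}'. Available: {available}"

    func = TOOL_REGISTRY[name]

    # Keep only the keyword arguments the function actually declares.
    valid_params = set(inspect.signature(func).parameters.keys())
    filtered = {k: v for k, v in inputs.items() if k in valid_params}

    try:
        return func(**filtered)
    except TypeError as e:
        return f"Error calling {name}: {e}"
    except ValueError as e:
        return str(e)
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"


# Extra argument: 'language' is dropped and the call succeeds.
extra = dispatch_tool("read_file", {"filename": "a.py", "language": "python"})
# Missing required argument: the TypeError comes back as a readable string.
missing = dispatch_tool("read_file", {"encoding": "latin-1"})

print(extra)    # <a.py decoded as utf-8>
print(missing)  # starts with "Error calling read_file:" and names 'filename'
```

Note that filtering is silent: the model is never told that language was ignored. That is a deliberate trade of strictness for robustness, and a stricter dispatch could mention dropped keys in the result instead.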

Self-Correction in Action

Why does any of this matter? Because the model uses error messages to recover. When the model receives an error as a tool result, it typically reads it, adjusts its approach, and retries:

[1] assistant: [tool_use: edit_file(old_str="def add(x, y)", ...)]
[2] user:      [tool_result: "Error: text not found in utils.py"]
[3] assistant: [tool_use: read_file(filename="utils.py")]
[4] user:      [tool_result: "def add(a, b):\n    return a + b"]
[5] assistant: [tool_use: edit_file(old_str="def add(a, b)", ...)]
[6] user:      [tool_result: "Edited utils.py"]
[7] assistant: [text: "Done — updated the add function."]

The first edit failed because the model guessed wrong about the parameter names. The error told it the text wasn't found. It read the file, saw the actual signature, and retried with the correct string. That is three tool calls instead of the two that reading first would have taken, but the task completed correctly without human intervention.

This self-correction behavior is emergent. It is not programmed into the agent. It comes from the way the model was trained: when an LLM sees an error message in its context, it tends to treat it as feedback and try something different. But this only works if the error messages are clear and informative. "Error: text not found in utils.py" is actionable. "Error: operation failed" is not. The principle from Lecture 6.2 holds: error handling is communication with the model. The registry pattern makes that communication centralized and consistent.

The Dispatch as a Control Point

The dispatch function is now the one place every tool call passes through. That makes it the natural home for cross-cutting concerns:

def dispatch_tool(name, inputs):
    print(f"  [tool] {name}({', '.join(f'{k}={v!r}' for k, v in inputs.items())})")

    # ... validation and execution ...

    preview = (result[:120] + "...") if len(result) > 120 else result
    print(f"  [result] {preview}")

    return result

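The sketch above logs two things on every call: the tool name with its arguments, and a preview of the result truncated to 120 characters.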

Simple print calls are sufficient for now. Production agents use structured logging — JSON lines that can be queried and analyzed — but the principle is the same. The next lecture builds on this foundation by adding token counting and context-growth tracking, all hooked into the same dispatch function.
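
For a sense of what the structured version could look like, here is a minimal sketch that wraps a dispatch function and writes one JSON object per line. Everything in it is an assumption made for illustration: the helper names log_event and with_logging, the field names, the stand-in fake_dispatch, and the rule that a result starting with "Error" counts as a failure.

```python
import json
import sys
import time


def log_event(stream, event, **fields):
    """Write one JSON object per line (the JSON Lines convention)."""
    record = {"ts": time.time(), "event": event, **fields}
    # default=repr keeps logging from failing on non-JSON-serializable inputs.
    stream.write(json.dumps(record, default=repr) + "\n")


def with_logging(dispatch, stream=sys.stderr):
    """Wrap a dispatch function so each call emits two log records."""
    def wrapper(name, inputs):
        log_event(stream, "tool_call", tool=name, inputs=inputs)
        start = time.perf_counter()
        result = dispatch(name, inputs)
        text = str(result)
        log_event(
            stream,
            "tool_result",
            tool=name,
            seconds=round(time.perf_counter() - start, 4),
            # Assumption: failures are reported as strings starting with "Error".
            is_error=text.startswith("Error"),
            preview=text[:120],
        )
        return result
    return wrapper


def fake_dispatch(name, inputs):
    """Hypothetical stand-in for the real dispatch_tool."""
    if name != "read_file":
        return f"Error: unknown tool '{name}'"
    return f"<contents of {inputs['filename']}>"


dispatch_tool = with_logging(fake_dispatch)
dispatch_tool("read_file", {"filename": "utils.py"})
dispatch_tool("open_file", {"filename": "utils.py"})
```

Because the wrapper leaves the dispatch function itself untouched, logging can be switched on or redirected to a file without editing any tool code.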

Common Questions

"Why not use Pydantic for schema generation?" You can, and production frameworks do. Libraries like pydantic and instructor generate JSON Schema from Python models with full validation. The inspect-based approach in this lecture exists to show the mechanism — once you understand what is happening under the hood, swapping in a library becomes a choice rather than a mystery. Module 12 returns to this when we look at LangChain.

"What if I want a tool name different from the function name?" Extend the decorator to take an optional name argument: @register_tool(name="search"). Store the custom name in the registry instead of func.__name__. This is a one-line change that doesn't affect anything else.

"Should I auto-generate the system prompt tool descriptions too?" You could, but you'd lose the ability to embed behavioral guidance like "always read before editing." That guidance doesn't belong in the API schema, and keeping the two surfaces separate is a feature.

"How do I test that my schemas are valid?" Call client.messages.create with the schemas and a simple prompt. The Anthropic API validates schemas on every call and returns a clear error immediately if something is malformed.

Key Takeaways

  1. The registry pattern eliminates duplication. One decorated function is the single source of truth for name, description, arguments, and implementation.
  2. Auto-generated schemas stay in sync. inspect.signature plus docstring parsing produces API-ready schemas with no manual JSON to maintain.
  3. Defensive dispatch is non-negotiable. Filter unexpected arguments, catch every exception, and never let the agent crash from a tool error.
  4. Error messages fuel self-correction. Clear errors let the model retry intelligently. Vague errors waste tool calls.
  5. Centralized dispatch is the right place for observability. Every tool call passes through one function — the natural location for logging, validation, and metrics, which Lecture 6.4 builds on.