Lab 6: Build a Web Search Tool

Section 4 Lab | Agent Engineering Duration: ~1.5 hours Prerequisites: Lab 4 (Extended Coding Agent) + Module 8 (Lectures 8.1–8.3)


Overview

Module 8 covered Retrieval-Augmented Generation as a category — the broad pattern of fetching external information and injecting it into the model's context for the current generation. Lecture 8.2 went deep on the vector-based implementation. Lecture 8.3 made the case that RAG is broader than vectors and that web search, database queries, and file reads are all valid retrieval mechanisms.

This lab gives your coding agent its first external-information capability that is not a file read: a web search tool backed by the Brave Search API. After this lab, your agent will be able to answer questions about current information, recent releases, and topics outside its training data.

The point is not to build something elaborate. The tool itself is small. What matters is the conceptual move: extending the agent with a retrieval mechanism that has nothing to do with vectors.


What You'll Produce

  1. Extended agent code (agent.py) — Your coding agent from Lab 4 with a new web_search tool registered alongside the existing tools
  2. Brave Search wrapper (web_search.py) — The function that calls the Brave API and returns results
  3. Reflection (reflection.md) — A short writeup answering the three questions in Part 4

Part 1: Set Up the Brave Search API (~15 minutes)

1.1 Get an API Key

Brave Search offers a free tier suitable for this lab.

  1. Go to https://api.search.brave.com/ and sign up for an account
  2. Subscribe to the Free plan (2,000 queries per month, 1 query per second)
  3. From the dashboard, create an API key. Copy it.

1.2 Store the Key

Add the key to your .env file at the project root:

BRAVE_API_KEY=your-key-here

Make sure .env is in your .gitignore — never commit API keys.

1.3 Quick Sanity Check

Verify the key works with a one-line curl test before integrating it into the agent:

curl -s "https://api.search.brave.com/res/v1/web/search?q=python+packaging" \
     -H "X-Subscription-Token: $BRAVE_API_KEY" \
     -H "Accept: application/json" | head -100

You should see a JSON response with a web.results array. If you get an authentication error, check the key. If you get a rate-limit error, wait a minute and retry.


Part 2: Build the Web Search Tool (~30 minutes)

2.1 The Wrapper Function

Create web_search.py alongside your agent.py. The wrapper handles the HTTP call, error handling, and result formatting.

web_search(query, count=5)

Requirements:

API details:

Output format suggestion:

[1] Python 3.13 Released — What's New
    https://docs.python.org/3.13/whatsnew/3.13.html
    Python 3.13.0 was released on October 7, 2024, with a free-threaded build,
    a new interactive interpreter, and improvements to typing...

[2] Python 3.13 release notes
    https://www.python.org/downloads/release/python-3130/
    Major new features: PEP 703 (Making the Global Interpreter Lock optional),
    PEP 744 (JIT Compilation), improved error messages...

[3] ...

Numbered, with the title on its own line, the URL on the second line, and the snippet wrapped to a reasonable width on subsequent lines. The model handles this format well and can cite specific results.

2.2 Edge Cases

Your tool should handle:

2.3 Standalone Test

Add an if __name__ == "__main__": block so the tool runs standalone:

python3 web_search.py

It should perform one search (e.g., "latest Anthropic Claude release") and print the formatted results. This lets you verify the tool independently before wiring it into the agent.


Part 3: Add the Tool to the Agent (~20 minutes)

3.1 Tool Schema

Define the tool schema for the Anthropic API in agent.py. Add it to your existing TOOLS list:

{
    "name": "web_search",
    "description": (
        "Search the public web for current information using the Brave "
        "Search API. Use this when the user asks about recent events, "
        "current versions of software, news, or any topic where the "
        "answer might not be in your training data. Do not use for "
        "questions about the user's local files — use read_file for that."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query. Phrase as you would in a search engine."
            },
            "count": {
                "type": "integer",
                "description": "Number of results to return (1-10). Default 5.",
                "default": 5
            }
        },
        "required": ["query"]
    }
}

The description matters. The model uses it to decide when to call the tool. Two key phrases: "current information" (so the model picks this for time-sensitive questions) and the explicit contrast with read_file (so the model doesn't reach for web search when the user is asking about local code).

3.2 Dispatch

Add web_search to your tool dispatch (registry from Lecture 6.3 or the if/elif from earlier lectures, whichever your agent uses):

from web_search import web_search

# In your tool registry or dispatch function:
"web_search": web_search,

3.3 System Prompt Update

Add a brief mention of the new capability to your agent's system prompt:

## Available Tools
...
**web_search(query, count=5)**: Search the public web for current
information. Use this when the user asks about recent events, current
versions, news, or anything that might not be in your training data.
Each result includes a title, URL, and snippet.

If your system prompt has a "How to Work" section, add guidance like:

- For questions about current information (today's news, recent
  releases, current best practices), use web_search rather than
  answering from training data.
- When you use web search, mention the source URLs in your reply
  so the user can verify.

The "mention the source URLs" instruction encourages source attribution, which is one of the practical advantages of RAG over training-data-only answers.


Part 4: Test and Reflect (~25 minutes)

4.1 Test Queries

Run the agent with each of the following queries. For each one, observe what the agent does and write it down.

  1. "What is the latest version of Python?" — Time-sensitive. The model's training data may have an out-of-date answer. Does the agent use web_search? Does it cite the source?

  2. "List the files in this directory." — Local. The agent should NOT use web_search here — it should use list_files. This tests whether the tool descriptions correctly steer the model.

  3. "What were the major announcements at the most recent Anthropic event?" — Recent and specific. Should pull from the web. Look at how the agent phrases its query.

  4. "What is the time complexity of binary search?" — General knowledge, well-covered in training data. The agent should answer directly without web search. If it does call web_search, that suggests your tool description is over-broad.

  5. "What's the current weather in Mahwah, New Jersey?" — Tests something Brave may or may not handle well. Snippet results probably show a weather widget summary. Does the agent extract a useful answer from the snippet?

For each query, note:

4.2 Reflection

Write reflection.md answering these three questions. Two-to-three sentences each is enough.

  1. In which of your test queries did the agent correctly choose between web_search and other tools? In which did it choose poorly? What about your tool descriptions or system prompt could you change to improve those decisions?

  2. Lecture 8.3 framed retrieval as a category — vector search, web search, database queries, file reads, and parametric "retrieval" from the model itself are all RAG. After implementing the web search tool, where do you see the conceptual overlap with the other tools your agent already has (read_file, search_file)? What do they have in common as retrieval mechanisms?

  3. What kind of agent application would benefit most from web search as its primary retrieval mechanism, vs. one that would be better served by vector RAG over a private corpus? Give one concrete example of each.


Submission

Submit:


Grading Rubric

Component Weight Criteria
Tool implementation 30% web_search works, handles errors and edge cases (no results, rate limits, missing key, special characters), formats results clearly
Agent integration 25% Tool schema is well-written, dispatch works, system prompt update is clear and steers the model correctly
Test coverage 20% All five test queries attempted and observations recorded
Reflection quality 25% Three answers are specific, draw on actual lab observations, and demonstrate understanding of the broader retrieval framing from Lecture 8.3

Notes and Tips