Build Your First AI Agent — 搭建你的第一个 AI Agent

What Is an AI Agent? — 什么是 AI Agent？

🇨🇳 中文

一个 AI agent 不只是聊天机器人。聊天机器人只能 talk，而 agent 可以 使用工具 — 读你的文件、运行命令、写新代码。它就像有了双手的 assistant。

你给它一个目标，它会自己 decide 要调用哪个工具、什么时候调用、调用多少次，直到把事情做完。

🇬🇧 English

An AI agent is more than a chatbot. A chatbot can only talk. An agent can use tools — read your files, run commands, write new code. It's an assistant with hands.

You give it a goal. It decides which tool to call, when to call it, and how many times — until the job is done.

What You'll Build Today — 今天要做的

A working AI agent in about 70 lines of Python, with three tools: read a file, write a file, and run a shell command. It uses Google's free Gemini API, so it costs nothing to run.

大约 70 行 Python 代码就能跑起来一个真正的 agent，三个工具：读文件、写文件、运行命令。用谷歌免费的 Gemini API，完全不要钱。

Why Python? — 为什么用 Python？

🇨🇳 中文

Python 是 AI 世界的 standard programming language。原因有四个：

语法简单 — 像写英文句子，没有大量符号
免费 — Python 本身免费，所有 AI 库也免费
跨平台 — Windows、Mac、Linux、Chromebook 都能跑
所有 AI 公司都先支持 Python — OpenAI、Google、Anthropic、Meta

🇬🇧 English

Python is the standard language for AI work. Four reasons:

Simple syntax — reads like English, few symbols
Free — Python and every AI library are free
Cross-platform — runs on Windows, Mac, Linux, Chromebook
Every AI company ships Python first — OpenAI, Google, Anthropic, Meta

💡 Versions — 版本

English: Use Python 3.10 or newer. Python 2 is dead. Don't use it. The agent script in this lesson uses features added in 3.10.

中文： 使用 Python 3.10 或更新版本。Python 2 已经停止维护，不要用。本课的脚本使用了 3.10 的新特性。

Installation — 安装步骤

1 Install Python — 安装 Python

🇨🇳 中文

访问 python.org/downloads
点击大黄色按钮 "Download Python 3.x"
运行下载好的 installer
Windows 用户重要！ 安装时一定要勾选 "Add Python to PATH"
安装完成后，打开终端检查：

🇬🇧 English

Go to python.org/downloads
Click the big yellow "Download Python 3.x" button
Run the installer
Windows users — important! Check "Add Python to PATH" on the first install screen
After install, open a terminal and check:

Windows: PowerShell or cmd Mac/Linux: Terminal

# Should print 3.10 or higher / 应该显示 3.10 或更高

python --version

# Or on Mac/Linux / 或者在 Mac/Linux 上

python3 --version

2 Get a Free Gemini API Key — 获取免费的 Gemini API Key

🇨🇳 中文

访问 aistudio.google.com/apikey
用 Google account 登录（注册免费）
点击 "Create API key"（创建 API 密钥）
把生成的密钥 复制下来，保存好
不要把 API key 发到网上！它就像你的密码

🇬🇧 English

Go to aistudio.google.com/apikey
Sign in with a Google account (free to make)
Click "Create API key"
Copy the key and save it somewhere safe
Never post your API key online! It's like a password

💰 Cost — 费用

The free tier of Gemini gives you plenty of room to learn — far more than a class will use in a term. No credit card required.

Gemini 免费 tier 对学习来说完全够用，一个学期都用不完。不需要信用卡。

3 Install the Gemini Library — 安装 Gemini 的库

🇨🇳 中文

用 pip（Python 的 package manager）安装：

🇬🇧 English

Use pip (Python's package manager) to install the library:

pip install google-genai

# Mac/Linux may need: / Mac/Linux 上可能需要：

pip3 install google-genai

4 Set Your API Key as an Environment Variable — 设置环境变量

🇨🇳 中文

这一步告诉脚本你的密钥是什么，而不用把密钥写进代码里（这样更安全）：

🇬🇧 English

This tells the script your key without writing the key into the code (much safer):

Mac / Linux

export GEMINI_API_KEY=your_key_here

Windows (PowerShell)

$env:GEMINI_API_KEY="your_key_here"

Windows (cmd)

set GEMINI_API_KEY=your_key_here

⚠️ This is per-terminal — 这只对当前终端有效

If you close the terminal, you'll need to set the key again next time. To make it permanent, add the line to your shell profile (.bashrc, .zshrc) on Mac/Linux, or use System Environment Variables on Windows.

关闭终端后密钥就没了，下次要重新设。想长期保存，Mac/Linux 加到 .bashrc 或 .zshrc 文件里，Windows 用"系统环境变量"设置。

Download the Agent — 下载 Agent 脚本

🇨🇳 中文

整个 agent 只有一个 Python 文件，大约 70 行。下载后保存到任意 folder。

🇬🇧 English

The whole agent is one Python file, about 70 lines. Download it and save it anywhere you like.

📥 agent.py

One file. Three tools: read_file, write_file, run_shell.
一个文件，三个工具：读文件、写文件、运行命令。

⬇️ Download agent.py / 下载 agent.py

🇨🇳 中文

下载后，在 terminal 中进入文件所在文件夹，运行：

🇬🇧 English

After downloading, open a terminal in that folder and run:

cd path/to/your/folder

python agent.py

🇨🇳 中文

看到 Agent ready. Ask it something. 就说明启动成功了。直接 type 你的问题，按回车。

🇬🇧 English

When you see Agent ready. Ask it something., you're running. Type a question and press Enter.

How the Loop Works — 循环原理

🇨🇳 中文

这就是 agent 的 核心思想。它不是一次性的问答，而是一个 loop，不停地调工具、看结果、再决定下一步：

🇬🇧 English

This is the core idea. An agent isn't a one-shot Q&A — it's a loop that keeps calling tools, seeing the results, and deciding what to do next:

You type a request. 你输入一个请求。
"Read agent.py and add a comment at the top." / "读 agent.py 然后在顶部加一行注释。"

Model thinks. 模型思考。
It decides: "I need to call read_file first." / 它决定先调用 read_file。

Your code runs the tool. 你的代码运行工具。
The Python function runs. The result comes back as a string. / Python 函数运行，结果以字符串形式返回。

Result goes back to the model. 结果再交给模型。
"Here's the file content." / "这是文件内容。"

Model decides again. 模型再次决定。
"Now I need to call write_file with the new content." / "现在我要调用 write_file 写入新内容。"

Repeat until done. 一直循环直到完成。
When the model returns plain text instead of a tool call, the loop ends. / 当模型返回普通文字（不再调用工具）时，循环结束。

🎓 Pedagogical note — 教学注释

The Gemini SDK has a feature called "automatic function calling" that hides this loop from you. We turn it off on purpose (disable=True) so students can see the loop. The loop is the lesson.

Gemini SDK 有个"自动函数调用"功能，会把这个循环隐藏起来。我们故意把它关掉（disable=True），让学生能看见循环。循环本身就是这堂课的重点。

Try These Prompts — 试试这些指令

🇨🇳 中文

启动 agent 后，从简单的开始，逐步增加难度。每个任务都让你看到 agent 在做什么决定：

🇬🇧 English

Once the agent is running, start simple and work up. Each prompt shows you a different decision the agent has to make:

"What files are in this folder?"

这个文件夹里有什么文件？

Tests: run_shell with ls or dir / 测试：用 run_shell 调用 ls 或 dir

"Read agent.py and explain the loop in two sentences."

读 agent.py，用两句话解释那个循环。

Tests: read_file + reasoning / 测试：read_file + 推理

"Make a file called hello.txt that says hello in three languages."

写一个 hello.txt 文件，用三种语言说"你好"。

Tests: write_file / 测试：write_file

"Create a Python file called fizzbuzz.py that prints FizzBuzz from 1 to 30, then run it."

创建一个叫 fizzbuzz.py 的 Python 文件，打印 1 到 30 的 FizzBuzz，然后运行它。

Tests: write_file + run_shell / 测试：写文件 + 运行

"Read fizzbuzz.py, add a comment explaining the modulo trick, and save it back."

读 fizzbuzz.py，加一行注释解释取模运算的技巧，再保存回去。

Tests: full read → modify → write cycle / 测试：完整的"读 → 改 → 写"循环

You will see real decisions. — 你会看到真正的决策。

For each task, the terminal will print every tool call before it runs. Watch the agent think, ask permission, and act.

每个任务里，终端都会在工具运行之前打印它的调用。看着 agent 思考、申请权限、然后动手。

Safety — 安全注意

🛑 Why we ask permission — 为什么要询问

English: The write_file and run_shell tools both ask "allow? [y/N]" before doing anything. This is the most important safety rule when building agents: the model proposes, the human approves. If a model goes wrong, the worst it can do is suggest a bad command — you say no.

中文： write_file 和 run_shell 这两个工具在动手之前都会问 "allow? [y/N]"。这是搭建 agent 时最重要的安全规则：模型提出建议，人来批准。即使模型出错，最多也只是提出一个糟糕的命令 — 你说"不"就行了。

🇨🇳 中文 — 给学生的几条规则

看清楚每个命令再说 y
不要让 agent 操作 系统文件（/etc、C:\Windows）
不要让它运行需要 管理员权限 的命令（sudo、rm -rf）
API key 永远不要 commit 到 GitHub
每一次回话都要重新设环境变量（除非你设成了永久的）

🇬🇧 English — A few rules for students

Read every command before typing y
Don't let the agent touch system files (/etc, C:\Windows)
Don't let it run anything needing admin rights (sudo, rm -rf)
Never commit your API key to GitHub
You'll set the env var fresh each session (unless you made it permanent)

This is how real AI agents work. — 真正的 AI agent 就是这样工作的。

The agents inside Cursor, Claude Code, GitHub Copilot, and ChatGPT all run the same loop you just built. Different tools, more polish — but the same core idea.

Cursor、Claude Code、GitHub Copilot、ChatGPT 这些产品里的 agent，跑的就是 你刚才搭的同一个循环。工具更多、外观更精美，但核心原理一模一样。