What Is an AI Agent? — 什么是 AI Agent?
🇨🇳 中文
一个
你给它一个目标,它会自己
🇬🇧 English
An AI agent is more than a chatbot. A chatbot can only talk. An agent can use tools — read your files, run commands, write new code. It's an assistant with hands.
You give it a goal. It decides which tool to call, when to call it, and how many times — until the job is done.
What You'll Build Today — 今天要做的
A working AI agent in about 70 lines of Python, with three tools: read a file, write a file, and run a shell command. It uses Google's free Gemini API, so it costs nothing to run.
大约 70 行 Python 代码就能跑起来一个真正的 agent,三个工具:读文件、写文件、运行命令。用谷歌免费的 Gemini API,完全不要钱。
Why Python? — 为什么用 Python?
🇨🇳 中文
Python 是 AI 世界的
- 语法简单 — 像写英文句子,没有大量符号
- 免费 — Python 本身免费,所有 AI 库也免费
- 跨平台 — Windows、Mac、Linux、Chromebook 都能跑
- 所有 AI 公司都先支持 Python — OpenAI、Google、Anthropic、Meta
🇬🇧 English
Python is the standard language for AI work. Four reasons:
- Simple syntax — reads like English, few symbols
- Free — Python and every AI library are free
- Cross-platform — runs on Windows, Mac, Linux, Chromebook
- Every AI company ships Python first — OpenAI, Google, Anthropic, Meta
💡 Versions — 版本
English: Use Python 3.10 or newer. Python 2 is dead. Don't use it. The agent script in this lesson uses features added in 3.10.
中文: 使用 Python 3.10 或更新版本。Python 2 已经停止维护,不要用。本课的脚本使用了 3.10 的新特性。
Installation — 安装步骤
1 Install Python — 安装 Python
🇨🇳 中文
- 访问 python.org/downloads
- 点击大黄色按钮 "Download Python 3.x"
- 运行下载好的
installer - Windows 用户重要! 安装时一定要勾选 "Add Python to PATH"
- 安装完成后,打开终端检查:
🇬🇧 English
- Go to python.org/downloads
- Click the big yellow "Download Python 3.x" button
- Run the installer
- Windows users — important! Check "Add Python to PATH" on the first install screen
- After install, open a terminal and check:
python --version
# Or on Mac/Linux / 或者在 Mac/Linux 上
python3 --version
2 Get a Free Gemini API Key — 获取免费的 Gemini API Key
🇨🇳 中文
- 访问 aistudio.google.com/apikey
- 用
Google account 登录(注册免费) - 点击 "Create API key"(创建 API 密钥)
- 把生成的密钥 复制下来,保存好
- 不要把 API key 发到网上!它就像你的密码
🇬🇧 English
- Go to aistudio.google.com/apikey
- Sign in with a Google account (free to make)
- Click "Create API key"
- Copy the key and save it somewhere safe
- Never post your API key online! It's like a password
💰 Cost — 费用
The free tier of Gemini gives you plenty of room to learn — far more than a class will use in a term. No credit card required.
Gemini 免费
3 Install the Gemini Library — 安装 Gemini 的库
🇨🇳 中文
用
🇬🇧 English
Use pip (Python's package manager) to install the library:
# Mac/Linux may need: / Mac/Linux 上可能需要:
pip3 install google-genai
4 Set Your API Key as an Environment Variable — 设置环境变量
🇨🇳 中文
这一步告诉脚本你的密钥是什么,而不用把密钥写进代码里(这样更安全):
🇬🇧 English
This tells the script your key without writing the key into the code (much safer):
⚠️ This is per-terminal — 这只对当前终端有效
If you close the terminal, you'll need to set the key again next time. To make it permanent, add the line to your shell profile (.bashrc, .zshrc) on Mac/Linux, or use System Environment Variables on Windows.
关闭终端后密钥就没了,下次要重新设。想长期保存,Mac/Linux 加到 .bashrc 或 .zshrc 文件里,Windows 用"系统环境变量"设置。
Download the Agent — 下载 Agent 脚本
🇨🇳 中文
整个 agent 只有一个 Python 文件,大约 70 行。下载后保存到任意
🇬🇧 English
The whole agent is one Python file, about 70 lines. Download it and save it anywhere you like.
📥 agent.py
One file. Three tools: read_file, write_file, run_shell.
一个文件,三个工具:读文件、写文件、运行命令。
🇨🇳 中文
下载后,在
🇬🇧 English
After downloading, open a terminal in that folder and run:
python agent.py
🇨🇳 中文
看到 Agent ready. Ask it something. 就说明启动成功了。直接
🇬🇧 English
When you see Agent ready. Ask it something., you're running. Type a question and press Enter.
How the Loop Works — 循环原理
🇨🇳 中文
这就是 agent 的 核心思想。它不是一次性的问答,而是一个
🇬🇧 English
This is the core idea. An agent isn't a one-shot Q&A — it's a loop that keeps calling tools, seeing the results, and deciding what to do next:
"Read agent.py and add a comment at the top." / "读 agent.py 然后在顶部加一行注释。"
It decides: "I need to call
read_file first." / 它决定先调用 read_file。The Python function runs. The result comes back as a string. / Python 函数运行,结果以字符串形式返回。
"Here's the file content." / "这是文件内容。"
"Now I need to call
write_file with the new content." / "现在我要调用 write_file 写入新内容。"When the model returns plain text instead of a tool call, the loop ends. / 当模型返回普通文字(不再调用工具)时,循环结束。
🎓 Pedagogical note — 教学注释
The Gemini SDK has a feature called "automatic function calling" that hides this loop from you. We turn it off on purpose (disable=True) so students can see the loop. The loop is the lesson.
Gemini SDK 有个"自动函数调用"功能,会把这个循环 隐藏 起来。我们 故意 把它关掉(disable=True),让学生能 看见 循环。循环本身就是这堂课的重点。
Try These Prompts — 试试这些指令
🇨🇳 中文
启动 agent 后,从简单的开始,逐步增加难度。每个任务都让你看到 agent 在做什么决定:
🇬🇧 English
Once the agent is running, start simple and work up. Each prompt shows you a different decision the agent has to make:
run_shell with ls or dir / 测试:用 run_shell 调用 ls 或 dirread_file + reasoning / 测试:read_file + 推理write_file / 测试:write_filewrite_file + run_shell / 测试:写文件 + 运行You will see real decisions. — 你会看到真正的决策。
For each task, the terminal will print every tool call before it runs. Watch the agent think, ask permission, and act.
每个任务里,终端都会在工具运行之前打印它的调用。看着 agent 思考、申请权限、然后动手。
Safety — 安全注意
🛑 Why we ask permission — 为什么要询问
English: The write_file and run_shell tools both ask "allow? [y/N]" before doing anything. This is the most important safety rule when building agents: the model proposes, the human approves. If a model goes wrong, the worst it can do is suggest a bad command — you say no.
中文: write_file 和 run_shell 这两个工具在动手之前都会问 "allow? [y/N]"。这是搭建 agent 时最重要的安全规则:模型提出建议,人来批准。即使模型出错,最多也只是提出一个糟糕的命令 — 你说"不"就行了。
🇨🇳 中文 — 给学生的几条规则
- 看清楚每个命令再说 y
- 不要让 agent 操作 系统文件(
/etc、C:\Windows) - 不要让它运行需要 管理员权限 的命令(
sudo、rm -rf) - API key 永远不要
commit 到 GitHub - 每一次回话都要重新设环境变量(除非你设成了永久的)
🇬🇧 English — A few rules for students
- Read every command before typing y
- Don't let the agent touch system files (
/etc,C:\Windows) - Don't let it run anything needing admin rights (
sudo,rm -rf) - Never commit your API key to GitHub
- You'll set the env var fresh each session (unless you made it permanent)
This is how real AI agents work. — 真正的 AI agent 就是这样工作的。
The agents inside Cursor, Claude Code, GitHub Copilot, and ChatGPT all run the same loop you just built. Different tools, more polish — but the same core idea.
Cursor、Claude Code、GitHub Copilot、ChatGPT 这些产品里的 agent,跑的就是 你刚才搭的同一个循环。工具更多、外观更精美,但核心原理一模一样。