🐍 Python Basics (การแปลภาษาไทยกำลังจะมา) · Python 编程基础

Learned by reading a real AI agent — agent_pro.py and its image-generation helper, image_gen.py. · 通过一个真实的 AI 智能体 —— agent_pro.py 及其图像生成模块 image_gen.py —— 来学习 Python。

This page teaches Python by walking through two short files that together make a working AI assistant. The assistant runs locally with the Hermes 3 model, can search the web, generate pictures, and send messages to LINE. Every concept below — imports, functions, type hints, docstrings, async/await, HTTP requests — appears in those files. We'll pull out the relevant lines, explain them, and at the end you'll see both files in full.

If you're new to Python: don't try to memorize the syntax. Read it like prose. Python is unusually close to English, and the goal of this page is to show you that you can already understand most of what's happening.

การแปลภาษาไทยกำลังจะมา (Thai translation coming soon)

本页通过详细阅读两段简短的 Python 代码来教你 Python 基础知识,这两段代码组合在一起就是一个可以运行的 AI 助手。这个助手在本地运行 Hermes 3 模型,可以搜索网页、生成图片、向 LINE 发送消息。本页涉及的所有概念 —— 导入、函数、类型提示、文档字符串、异步等待、HTTP 请求 —— 都出现在这两个文件中。我们会把相关代码片段抽出来,加以解释,最后你会看到两份完整的源代码。

如果你刚接触 Python:不要试图死记语法。把它当作普通文章一样阅读。Python 与英语极为接近,本页的目的就是让你看到 —— 你其实已经能看懂大部分代码了。

1. The shape of a Python file · โครงสร้างของไฟล์ Python · Python 文件的结构

A Python file is just text. When you run python agent_pro.py, the interpreter reads the file top-to-bottom and executes each line. Most files have three sections, in this order:

  1. Imports — bring in code from other modules.
  2. Definitions — declare functions, classes, constants.
  3. The entry point — what to actually do when the file is run.

[Diagram: 1. Imports (import asyncio, from pathlib import Path) → 2. Definitions (def multiply(a, b): ..., def web_search(query): ..., async def main(): ...) → 3. Entry point (if __name__ == "__main__": asyncio.run(main()))]
The three sections of a Python file, in execution order.

You can see this skeleton clearly in agent_pro.py:

import asyncio
import os
from datetime import date
from pathlib import Path

from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

# ... more functions ...

if __name__ == "__main__":
    asyncio.run(main())
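
To make the role of the entry-point guard concrete, here is a minimal sketch of a complete file with the same three sections. The names greet and the body of main are made up for illustration and are not taken from agent_pro.py.

```python
# 1. Imports
import asyncio


# 2. Definitions: Python only records these names, nothing runs yet
def greet(name: str) -> str:
    """Return a short greeting."""
    return f"Hello, {name}!"


async def main() -> str:
    """Entry-point coroutine; here it just builds and prints a greeting."""
    message = greet("world")
    print(message)
    return message


# 3. Entry point: true only when the file is run directly,
#    not when another file imports it
if __name__ == "__main__":
    asyncio.run(main())
```

If another file does import on this one, greet and main become available but nothing is printed, because the guard is false in that case.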

Python 文件就是普通的文本文件。当你运行 python agent_pro.py 时,Python 解释器从上往下读这个文件,逐行执行。大多数文件都按照这样的顺序分为三个部分:

  1. 导入 —— 把其他模块里的代码引入进来。
  2. 定义 —— 声明函数、类、常量。
  3. 入口点 —— 当这个文件被运行时,真正要做的事。

[示意图:1. 导入(import asyncio、from pathlib import Path)→ 2. 定义(def multiply(a, b): ...、def web_search(query): ...、async def main(): ...)→ 3. 入口点(if __name__ == "__main__": asyncio.run(main()))]
Python 文件的三个部分,按执行顺序排列。

agent_pro.py 中能清楚看到这种结构:

import asyncio
import os
from datetime import date
from pathlib import Path

from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

# ... 还有更多函数 ...

if __name__ == "__main__":
    asyncio.run(main())

2. Imports — borrowing other people's code · การ import — ยืมโค้ดของคนอื่น · 导入 —— 借用别人写的代码

Python comes with a "standard library" of useful modules built in, and you can install thousands more with pip install. Imports give you access to them. There are two forms:

import asyncio                       # whole module — use as asyncio.run(...)
from datetime import date            # one thing from a module — use as date.today()
from pathlib import Path             # same pattern
from llama_index.llms.ollama import Ollama  # nested package, deep import

The first form keeps the module's name as a prefix when you use it. The second form pulls a specific name into your file so you can use it bare. Both are fine; the choice is mostly about readability.
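
The two forms can be compared side by side using only the standard library. This is a small sketch; the alias P is an arbitrary choice for illustration, not something the agent files use.

```python
import math                      # whole module: names need the math. prefix
from math import sqrt            # one name: usable bare
from pathlib import Path as P    # optional alias (P is an arbitrary name)

root_prefixed = math.sqrt(16)    # called through the module
root_bare = sqrt(16)             # same function under a shorter name
same_function = math.sqrt is sqrt

print(root_prefixed, root_bare, same_function)
print(P("images").name)
```

Both imports point at the same function object, so the choice changes only how the call site reads.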

Python 自带一个"标准库",里面有大量好用的模块;另外还可以用 pip install 安装数以千计的第三方包。导入语句让你能使用它们。导入有两种形式:

import asyncio                       # 导入整个模块 —— 使用时写 asyncio.run(...)
from datetime import date            # 从模块中导入一个名字 —— 直接写 date.today()
from pathlib import Path             # 同上
from llama_index.llms.ollama import Ollama  # 嵌套包的深层导入

第一种形式在使用时要带上模块名作为前缀。第二种形式把指定的名字直接引入到当前文件,使用时可以不带前缀。两种都没问题,选哪种主要看可读性。

3. Functions — the basic unit of Python code · ฟังก์ชัน · 函数 —— Python 代码的基本单位

A function is a named block of code that takes inputs and returns an output. Here's the simplest one in the file:

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

Read it left-to-right:

  • def introduces a function definition.
  • multiply is the name we'll call it by.
  • (a: float, b: float) says it takes two inputs, both floating-point numbers.
  • -> float says it returns a floating-point number.
  • The """...""" line below is the docstring — documentation that lives inside the code.
  • return a * b hands the result back to whoever called it.

Indentation matters in Python. The lines that make up the function body are all indented the same amount (four spaces, by convention). When the indentation stops, the function ends. There are no curly braces.
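
A short sketch of defining and then calling the function shows where the body ends. The variable names area and same are illustrative.

```python
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b               # indented: part of the function body


# Back at the left margin: the function definition has ended.
area = multiply(3, 4.5)        # positional arguments
same = multiply(a=3, b=4.5)    # keyword arguments, same result
print(area, same)
```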

函数是一段有名字的代码块,接受输入并返回输出。文件中最简单的一个函数是:

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

从左往右读这段代码:

  • def 表示要定义一个函数。
  • multiply 是函数的名字,调用时用这个名字。
  • (a: float, b: float) 表示它接受两个输入,都是浮点数。
  • -> float 表示返回值也是浮点数。
  • 下面的 """...""" 是文档字符串 —— 写在代码里的文档。
  • return a * b 把结果交还给调用者。

在 Python 中,缩进是有意义的。组成函数体的几行代码必须缩进同样的宽度(一般是四个空格)。缩进结束意味着函数也结束了。Python 不使用花括号。

4. Type hints — telling readers what a function expects · Type hints · 类型提示 —— 告诉读者函数期待什么

The : float and -> float annotations above are type hints. Python itself doesn't enforce them: you can pass a string to multiply and Python won't object up front. The call fails only if the multiplication itself is invalid (multiply("a", "b") raises TypeError, while multiply("ab", 3) quietly returns "ababab"). Hints still serve two important purposes:

  • They make the code self-documenting. A reader sees def days_until(iso_date: str) -> str and knows immediately what goes in and what comes out.
  • Tools like editors, linters, and (in our case) the LLM agent itself read type hints to know how to call the function correctly.

The agent in agent_pro.py learns the shape of each tool from its hints and docstring. That's why every tool here has both.
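
Both halves of that claim can be checked directly. The sketch below shows that hints are not enforced at runtime and that a tool can read them as plain data; it uses typing.get_type_hints from the standard library, which is one way to do this and not necessarily what LlamaIndex does internally.

```python
from typing import get_type_hints


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b


# Hints are not checked at runtime: str * int is legal Python (repetition).
repeated = multiply("ab", 3)

# Two strings cannot be multiplied, so this call fails inside the function.
try:
    multiply("a", "b")
    failed = False
except TypeError:
    failed = True

# Tools can read the annotations as plain data.
hints = get_type_hints(multiply)
print(repeated, failed, hints)
```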

上面代码中的 : float 和 -> float 标注就是类型提示。Python 本身并不强制执行这些类型 —— 你完全可以传一个字符串给 multiply,Python 不会事先拦住你;只有当乘法本身不合法时才会出错(multiply("a", "b") 会抛出 TypeError,而 multiply("ab", 3) 会返回 "ababab")。但类型提示有两个重要作用:

  • 它们让代码自带说明。读者看到 def days_until(iso_date: str) -> str 就立刻知道传什么、返回什么。
  • 编辑器、代码检查工具,以及(在我们的例子里)大语言模型本身都会读类型提示,从而知道该怎么正确调用函数。

agent_pro.py 里的智能体就是通过工具函数的类型提示和文档字符串来理解每个工具的形状。所以这里每个工具函数都同时有两者。

5. Docstrings — documentation that lives with the code · Docstrings · 文档字符串

Look at web_search:

def web_search(query: str) -> str:
    """Search the web with DuckDuckGo and return the top 5 result snippets.

    Use this whenever the user asks about current events, prices, weather,
    sports scores, or anything you wouldn't already know.
    """
    from duckduckgo_search import DDGS

    with DDGS() as ddg:
        results = list(ddg.text(query, max_results=5))
    if not results:
        return "No results found."
    return "\n\n".join(
        f"{r['title']}\n{r['href']}\n{r['body']}" for r in results
    )

The triple-quoted string at the top is the docstring. It's a normal Python string — it just happens to be the first thing inside the function. Python stashes it on the function so other code can read it. In our case, the agent literally reads this docstring to decide when to use the tool. The line "Use this whenever the user asks about current events, prices, weather…" is instructions to the LLM, not just notes for humans.
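
How "other code can read it" looks in practice: the sketch below reads a docstring back from a function object. The function body is a placeholder with no network access, and the docstring is shortened for the example.

```python
import inspect


def web_search(query: str) -> str:
    """Search the web and return the top result snippets.

    Use this whenever the user asks about current events.
    """
    return "No results found."   # placeholder body, no network access


raw = web_search.__doc__            # the string exactly as written
clean = inspect.getdoc(web_search)  # same text with indentation normalized
summary = clean.splitlines()[0]     # first line, often used as a short description
print(summary)
```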

看一下 web_search:

def web_search(query: str) -> str:
    """Search the web with DuckDuckGo and return the top 5 result snippets.

    Use this whenever the user asks about current events, prices, weather,
    sports scores, or anything you wouldn't already know.
    """
    from duckduckgo_search import DDGS

    with DDGS() as ddg:
        results = list(ddg.text(query, max_results=5))
    if not results:
        return "No results found."
    return "\n\n".join(
        f"{r['title']}\n{r['href']}\n{r['body']}" for r in results
    )

函数开头那段三引号字符串就是文档字符串。它本质上是一个普通的 Python 字符串,只不过恰好是函数里的第一个东西。Python 会把它存放在函数对象上,让其他代码可以读取。在我们的例子里,智能体会真的去读这段文档字符串,以此判断什么时候该使用这个工具。"用户问及时事、价格、天气、体育比分……"这句话就是写给大语言模型看的指令,并不只是给人类看的注释。

6. The standard library — batteries included · Standard library · 标准库 —— 自带的工具箱

Python comes with a remarkable amount of useful code already built in. agent_pro.py uses four standard-library modules:

import asyncio                  # event loops, async/await
import os                       # operating system stuff: env vars, paths
from datetime import date       # working with calendar dates
from pathlib import Path        # filesystem paths as objects

You don't need to install any of these. They ship with Python. Two examples of how they're used in the file:

# Read an environment variable, returning None if it's not set:
token = os.getenv("LINE_CHANNEL_ACCESS_TOKEN")

# Build a directory and a path to a file inside it:
Path("images").mkdir(exist_ok=True)
out_path = Path("images") / filename

That Path("images") / filename line is striking on its own. Path objects overload the / operator so that joining paths reads like… well, like a path. It works the same on Windows (images\out.png) and Linux (images/out.png) — Path handles the difference.
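
Here is a small runnable sketch of the same three modules at work. The environment variable name is made up and assumed not to be set on your machine; nothing is written to disk.

```python
import os
from datetime import date
from pathlib import Path

# os.getenv returns None for a missing variable, or a default if you pass one.
# DEMO_VAR_THAT_IS_NOT_SET is a made-up name, assumed to be absent.
token = os.getenv("DEMO_VAR_THAT_IS_NOT_SET")
fallback = os.getenv("DEMO_VAR_THAT_IS_NOT_SET", "not-set")

# The / operator joins path segments; nothing is created on disk here.
out_path = Path("images") / "out.png"

# Subtracting two dates gives a timedelta with a .days attribute.
gap = (date(2025, 1, 10) - date(2025, 1, 1)).days

print(token, fallback, out_path.name, out_path.suffix, gap)
```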

Python 内置了大量好用的代码。agent_pro.py 用到了四个标准库模块:

import asyncio                  # 事件循环、async/await
import os                       # 操作系统相关:环境变量、路径等
from datetime import date       # 处理日历日期
from pathlib import Path        # 把文件系统路径作为对象处理

这些都不用单独安装,随 Python 一起就有。看两个文件里的实际用法:

# 读取一个环境变量,没设置时返回 None:
token = os.getenv("LINE_CHANNEL_ACCESS_TOKEN")

# 创建目录并构造路径:
Path("images").mkdir(exist_ok=True)
out_path = Path("images") / filename

Path("images") / filename 这行本身就很有意思。Path 对象重载了 / 运算符,让拼接路径读起来就像……写路径一样。它在 Windows(images\out.png)和 Linux(images/out.png)上效果一致 —— Path 帮你处理了平台差异。

7. Third-party libraries — installed with pip · Third-party libraries · 第三方库 —— 用 pip 安装

Anything outside the standard library has to be installed first:

pip install llama-index llama-index-llms-ollama requests duckduckgo-search

Then you import them like any other module. agent_pro.py uses:

  • llama_index — builds the agent loop. We pull in ReActAgent, FunctionTool, and Ollama (the local-LLM client).
  • requests — the most widely-used HTTP client in Python. We'll see this one up close in the next section.
  • duckduckgo_search — a thin wrapper over DuckDuckGo's search results.

标准库之外的任何东西都需要先安装:

pip install llama-index llama-index-llms-ollama requests duckduckgo-search

然后像导入其他模块一样导入它们即可。agent_pro.py 用到了:

  • llama_index —— 构建智能体循环。我们从中导入 ReActAgent、FunctionTool 和 Ollama(本地大模型客户端)。
  • requests —— Python 中使用最广泛的 HTTP 客户端。下一节会详细看它。
  • duckduckgo_search —— DuckDuckGo 搜索结果的轻量级封装。

8. Installing packages — pip and uv · การติดตั้งแพ็กเกจ · 安装第三方包 —— pip 和 uv

Section 7 mentioned pip install in passing. It's worth a closer look, because installing packages is the single most common thing you do in a Python project — and there are two tools worth knowing.

8a. pip, the standard tool

pip ships with Python. Run these from your terminal (PowerShell, Terminal, or your IDE's built-in one), not inside Python:

pip install requests              # install one package
pip install -r requirements.txt   # install a list of packages from a file
pip uninstall requests            # remove one
pip list                          # see what's installed
pip show requests                 # details about a specific package

A requirements file is plain text listing one package per line, often with version pins:

llama-index>=0.10
requests==2.31.0
duckduckgo-search

Committing this file to your project means anyone else (and future you) can rebuild the same environment with one command.

8b. Virtual environments

Installing every package globally with pip works at first, but breaks fast: two projects need different versions of the same library and you can only have one. The fix is a virtual environment — a private folder containing its own Python and its own packages, isolated from the rest of your system.

python -m venv .venv                  # create a venv in ./.venv/
.venv\Scripts\activate                # activate it (Windows PowerShell)
source .venv/bin/activate             # activate it (macOS / Linux)
pip install requests                  # installs *only* into .venv now
deactivate                            # return to the system Python

You make one venv per project. The folder is conventionally called .venv and is added to .gitignore — it gets rebuilt from requirements.txt wherever the code is checked out.
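
If you are ever unsure whether a venv is active, Python can tell you. The helper name in_virtualenv below is hypothetical; the sys.prefix and sys.base_prefix attributes it compares are part of the standard library.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("packages install under:", sys.prefix)
```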

8c. uv, the new fast alternative

uv is a newer tool written in Rust that does what pip and venv do, but dramatically faster — often 10× to 100× — and bundles them into one workflow. Install it once:

pip install uv

Then use these instead of pip / venv:

uv venv                               # create a .venv
uv pip install requests               # install into it (no activation needed)
uv pip install -r requirements.txt    # same input file as pip, much faster
uv run python agent_pro.py            # run a script inside the venv

The uv run ... form is the killer feature: it picks up the project's venv automatically and runs your script with no manual activate step. In a project that declares its dependencies in a pyproject.toml, it also installs anything missing before running.

第 7 节顺带提了一下 pip install。值得详细看看,因为安装第三方包是 Python 项目中最常做的事情 —— 而值得了解的工具有两个。

8a. pip:标准工具

pip 随 Python 一起安装。下面这些命令在终端(PowerShell、Terminal 或 IDE 内置终端)里运行,不是在 Python 里:

pip install requests              # 安装一个包
pip install -r requirements.txt   # 根据列表批量安装
pip uninstall requests            # 卸载一个包
pip list                          # 查看已安装的包
pip show requests                 # 查看某个包的详细信息

requirements.txt 是一个纯文本文件,每行一个包,常带版本号:

llama-index>=0.10
requests==2.31.0
duckduckgo-search

把这个文件提交到项目里,别人(以及未来的你自己)只要一条命令就能搭起完全相同的运行环境。

8b. 虚拟环境

把所有包都装到全局 Python 里在一开始好用,但很快就会出问题:两个项目需要同一个库的不同版本,而你只能装一个。解决办法就是虚拟环境 —— 一个私有的目录,里面装着自己的 Python 和自己的包,跟系统的其他部分完全隔离。

python -m venv .venv                  # 在 ./.venv/ 里创建虚拟环境
.venv\Scripts\activate                # 激活(Windows PowerShell)
source .venv/bin/activate             # 激活(macOS / Linux)
pip install requests                  # 这次只会装到 .venv 里
deactivate                            # 退出虚拟环境

每个项目用一个虚拟环境。习惯上把它叫 .venv,并加入 .gitignore —— 代码被别人 clone 后会从 requirements.txt 重新构建。

8c. uv:更快的新工具

uv 是用 Rust 写的新工具,做 pip 和 venv 的事,但速度显著更快 —— 通常 10× 到 100× —— 并把它们整合到了一个工作流里。先装一次:

pip install uv

然后用下面这些命令代替 pip / venv:

uv venv                               # 创建 .venv
uv pip install requests               # 装到 .venv(无需激活)
uv pip install -r requirements.txt    # 与 pip 同样的输入文件,但快很多
uv run python agent_pro.py            # 在虚拟环境里运行脚本

uv run ... 是它最有用的特性:自动找到项目的虚拟环境并运行你的脚本,不需要手动激活。如果项目用 pyproject.toml 声明了依赖,它还会在运行前自动安装缺失的包。

9. Talking to an API — image_gen.py · การเรียก API · 调用 API —— image_gen.py 详解

This is the section the rest of the page was building toward. APIs ("Application Programming Interfaces") are how programs talk to other programs over the network. Most modern AI services — OpenAI, Anthropic, Google's Gemini, Stripe, LINE — expose APIs that take JSON in and send JSON back. If you can read image_gen.py, you can use any of them.

Here's the entire file. Forty lines, including blank lines and the docstring. We'll go through it piece by piece.

"""image_gen.py — Gemini Nano Banana 2 backend for agent_pro.py.

Drops in for the Mac-Stack SDXL helper. Exposes generate(prompt, out_path)
so agent_pro.py's generate_image tool keeps working unchanged.
"""
import base64
from pathlib import Path

import requests

_KEY_FILE = Path(r"C:\Users\Admin\claude\env.txt.txt")
_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3.1-flash-image-preview:generateContent"
)


def _load_key() -> str:
    for line in _KEY_FILE.read_text(encoding="utf-8").splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "google-api":
            return value.strip()
    raise RuntimeError(f"google-api entry not found in {_KEY_FILE}")


def generate(prompt: str, out_path: str) -> str:
    r = requests.post(
        f"{_ENDPOINT}?key={_load_key()}",
        json={
            "contents": [{"parts": [{"text": prompt}]}],
            "generationConfig": {"responseModalities": ["IMAGE"]},
        },
        timeout=120,
    )
    r.raise_for_status()
    parts = r.json()["candidates"][0]["content"]["parts"]
    img_b64 = next(p["inlineData"]["data"] for p in parts if "inlineData" in p)
    Path(out_path).write_bytes(base64.b64decode(img_b64))
    return out_path

9a. Constants at the top

Three conventions to notice:

  • The leading underscore (_KEY_FILE, _ENDPOINT) is a Python convention meaning "this is private to this file; don't import it from elsewhere." The interpreter doesn't enforce it — it's a polite signal to other humans.
  • The r"..." prefix is a "raw string." It tells Python not to treat backslashes specially. Useful for Windows paths because "C:\Users\..." without the r would have to be written "C:\\Users\\...".
  • Two adjacent string literals — "...models/" "gemini-3.1..." — get joined automatically. This is a clean way to wrap long URLs without having to use + for concatenation.
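
The raw-string and literal-joining rules are easy to verify in isolation. In this sketch the path is shortened and example.com is a placeholder URL, not a real endpoint.

```python
# A raw string keeps backslashes literal; the normal form must double them.
raw_path = r"C:\Users\Admin"
escaped_path = "C:\\Users\\Admin"

# Adjacent literals inside parentheses are joined into one string.
endpoint = (
    "https://example.com/v1/models/"
    "demo-model:generateContent"
)

print(raw_path == escaped_path)
print(endpoint)
```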

9b. Reading an API key from a file

_load_key walks each line of the secrets file, splits on the first colon with partition, and returns the value for the entry named google-api. The underscore in name, _, value = line.partition(":") means "I don't care about this value" — partition returns three pieces (left, separator, right) and we don't need the separator.
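
The same parsing logic can be tried without a secrets file. The sketch below is a variant that reads from a string; find_entry and the sample contents are made up for illustration, and the values are not real keys.

```python
def find_entry(text: str, wanted: str) -> str:
    """Return the value stored under `wanted` in 'name: value' lines."""
    for line in text.splitlines():
        # partition splits on the FIRST colon only, so colons in the
        # value (for example in a URL) are left untouched.
        name, _, value = line.partition(":")
        if name.strip() == wanted:
            return value.strip()
    raise KeyError(f"{wanted} entry not found")


# Made-up file contents; these are not real keys.
sample = "line-token: abc123\ngoogle-api: demo:key:with:colons\n"
print(find_entry(sample, "google-api"))

# When there is no separator, partition returns the whole string
# followed by two empty strings.
print("no-colon-here".partition(":"))
```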

9c. Making the HTTP request

requests.post(...) sends an HTTP POST request and returns a response object. There's almost a one-to-one mapping between the arguments and what's happening on the network:

  • The URL uses an "f-string" — Python substitutes the expressions inside {} at runtime.
  • The body: json={...} — pass a Python dictionary; requests serializes it to JSON and sets Content-Type: application/json. The shape comes from the Gemini API docs.
  • The timeout: timeout=120. Without this, a hanging server could freeze your program forever. Always set a timeout.
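
To see roughly what json={...} does before anything leaves your machine, the request can be assembled by hand. This sketch uses the standard library's urllib.request instead of requests, purely so it runs without installing anything; the endpoint and key are placeholders and nothing is sent.

```python
import json
import urllib.request

# Placeholder endpoint and key: nothing is sent in this sketch.
endpoint = "https://example.com/v1/models/demo:generateContent"
api_key = "DEMO_KEY"
url = f"{endpoint}?key={api_key}"          # the f-string fills in both values

payload = {
    "contents": [{"parts": [{"text": "a red bicycle"}]}],
    "generationConfig": {"responseModalities": ["IMAGE"]},
}

# Roughly what json=payload does: serialize the dict and label it as JSON.
request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method(), request.full_url)
print(request.data[:40])
```

With urllib the timeout is passed when the request is actually opened, which this sketch deliberately never does.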

9d. Handling the response

  1. r.raise_for_status() — if the server returned an error code (4xx or 5xx), raise an exception. Without this, the next line would try to read the error message as a valid response.
  2. r.json() parses the response body as JSON. The chain ["candidates"][0]["content"]["parts"] reaches into Gemini's nested structure.
  3. The next(...) line finds the first image part. Gemini can return text and image parts; we want the image. p["inlineData"]["data"] is the image as a base64 string.
  4. base64.b64decode(img_b64) converts base64 text back into raw bytes, the actual contents of the image file (the part's inlineData also carries a mimeType field naming the format). Path(out_path).write_bytes(...) writes them to disk.
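
The four steps can be rehearsed offline against a hand-built dictionary that mimics the nesting described above. The response below is a stand-in with fake bytes, not real Gemini output, and the None default on next() is an addition that image_gen.py itself does not have.

```python
import base64

# Stand-in for r.json(): same nesting as described above, fake image bytes.
fake_bytes = b"not-a-real-image"
response = {
    "candidates": [{
        "content": {
            "parts": [
                {"text": "Here is your picture."},
                {"inlineData": {"data": base64.b64encode(fake_bytes).decode("ascii")}},
            ]
        }
    }]
}

parts = response["candidates"][0]["content"]["parts"]

# next() with a default returns None instead of raising StopIteration
# when the reply contains no image part at all.
img_b64 = next((p["inlineData"]["data"] for p in parts if "inlineData" in p), None)
if img_b64 is None:
    raise RuntimeError("response contained no image part")

image_bytes = base64.b64decode(img_b64)
print(len(image_bytes), "bytes decoded")
```

Giving next() a default is a small design choice: a text-only reply then produces a readable error instead of a bare StopIteration.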

这一节是整篇文章的核心。API("应用程序编程接口")是程序之间通过网络通信的方式。当前大部分 AI 服务 —— OpenAI、Anthropic、Google 的 Gemini、Stripe、LINE —— 都提供 API,接收 JSON 输入并返回 JSON。如果你能看懂 image_gen.py,就能用它们所有的。

下面是整个文件。一共四十行,包括空行和文档字符串。我们一段一段来看。

"""image_gen.py — Gemini Nano Banana 2 backend for agent_pro.py."""
import base64
from pathlib import Path

import requests

_KEY_FILE = Path(r"C:\Users\Admin\claude\env.txt.txt")
_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3.1-flash-image-preview:generateContent"
)


def _load_key() -> str:
    for line in _KEY_FILE.read_text(encoding="utf-8").splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "google-api":
            return value.strip()
    raise RuntimeError(f"google-api entry not found in {_KEY_FILE}")


def generate(prompt: str, out_path: str) -> str:
    r = requests.post(
        f"{_ENDPOINT}?key={_load_key()}",
        json={
            "contents": [{"parts": [{"text": prompt}]}],
            "generationConfig": {"responseModalities": ["IMAGE"]},
        },
        timeout=120,
    )
    r.raise_for_status()
    parts = r.json()["candidates"][0]["content"]["parts"]
    img_b64 = next(p["inlineData"]["data"] for p in parts if "inlineData" in p)
    Path(out_path).write_bytes(base64.b64decode(img_b64))
    return out_path

9a. 文件顶部的常量

这里有几个值得注意的约定:

  • 名字前面的下划线(_KEY_FILE、_ENDPOINT)是 Python 的一种约定,意思是"这个东西只在本文件内使用,别从外部 import 它"。
  • r"..." 前缀表示"原始字符串",告诉 Python 不要把反斜杠当作特殊字符。
  • 相邻的两个字符串字面量 —— "...models/" "gemini-3.1..." —— 会被自动拼接。这是写长 URL 而不用 + 拼接的优雅写法。

9b. 从文件中读取 API 密钥

_load_key 遍历每一行,用 partition 在第一个冒号处分开,返回名为 google-api 的那个值。name, _, value = line.partition(":") 中的下划线意思是"这个值我不关心"。

9c. 发出 HTTP 请求

requests.post(...) 发出一个 HTTP POST 请求,返回一个响应对象。这里的参数与网络上发生的事几乎一一对应:

  • URL:用 "f-string" 在运行时把 {} 里的表达式替换成实际值。
  • 请求体:json={...} —— 传入一个 Python 字典,requests 会序列化为 JSON 并加 Content-Type: application/json 头。
  • 超时:timeout=120。不设超时的话,服务器卡住会让你的程序永远卡住。一定要设超时。

9d. 处理响应

  1. r.raise_for_status() —— 如果服务器返回了错误状态码(4xx 或 5xx),就抛出异常。
  2. r.json() 把响应体解析为 JSON 并返回一个 Python 字典。
  3. next(...) 找到响应中第一个包含图像数据的 part。
  4. base64.b64decode(img_b64) 把 base64 字符串变回原始字节 —— 也就是真正的图像文件内容(具体格式由该 part 的 inlineData 中的 mimeType 字段给出)。

10. Getting more API keys — NVIDIA and HuggingFace · API keys อื่นๆ · 获取更多 API 密钥 —— NVIDIA 和 HuggingFace

Section 9 showed how image_gen.py makes an HTTP request to Google's Gemini service. Two other services worth knowing offer generous free tiers: NVIDIA build.nvidia.com and HuggingFace. Sign up for both — they take about a minute each.

10a. NVIDIA build.nvidia.com

  1. Go to build.nvidia.com
  2. Click Login (top right)
  3. Click any model card (e.g. Llama 3.3 70B or FLUX.1)
  4. Click Get API Key. The key starts with nvapi-...

import base64, requests

API_KEY = "nvapi-..."   # load from your secrets file, not inline
r = requests.post(
    "https://ai.api.nvidia.com/v1/genai/black-forest-labs/flux.1-schnell",
    headers={"Authorization": f"Bearer {API_KEY}", "Accept": "application/json"},
    json={"prompt": "a Lanna temple at sunrise", "width": 1024, "height": 1024, "seed": 0},
    timeout=60,
)
r.raise_for_status()
img_b64 = r.json()["artifacts"][0]["base64"]
with open("temple.png", "wb") as f:      # closes the file even if the write fails
    f.write(base64.b64decode(img_b64))

Notice the shape: POST, JSON body, base64 image in the response, decode and write. Exactly what image_gen.py does, with different field names. Once you've read one API, you've largely read them all.

10b. HuggingFace

  1. Go to huggingface.co/join
  2. Settings → Access Tokens, click New token, choose Read
  3. Token starts with hf_...

pip install huggingface_hub
hf auth login    # paste your hf_... token when prompted

10c. What you can build with these keys

You want to…                | Try these models                        | Where
Generate an image from text | FLUX, Stable Diffusion 3.5, Nano Banana | NVIDIA · HF · Gemini
Edit or inpaint an image    | FLUX Fill, Qwen-Image-Edit              | NVIDIA · HF
Generate a short video clip | Seedance 2.0, CogVideoX                 | Replicate · HF
Text-to-speech (voice)      | XTTS-v2, Coqui, Riva                    | HF · NVIDIA · ElevenLabs
Speech-to-text              | Whisper Large v3                        | NVIDIA · HF · OpenAI
Translate                   | NLLB, Madlad-400                        | HF
Chat / summarize / extract  | Llama 3.3, DeepSeek, Hermes 3           | NVIDIA · HF · your own Ollama (free, local)

10d. Keeping keys safe — non-negotiable

  1. Never paste a key into source code that gets committed to git. Bots scan public repos for patterns like AIza..., sk-..., nvapi-... within seconds of a push, and use found keys to run up huge bills.
  2. Use scoped tokens where the service offers them.
  3. Rotate keys when something looks weird.
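
One common way to follow rule 1 is to keep the key in an environment variable and read it at startup. The sketch below assumes that approach; load_key, mask, and the DEMO_API_KEY variable are hypothetical names, and the value set here is fake so the example runs on its own.

```python
import os


def load_key(var_name: str) -> str:
    """Read an API key from an environment variable, failing loudly if absent."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running")
    return key


def mask(key: str) -> str:
    """Show only the first four characters, e.g. for log messages."""
    return key[:4] + "..." if len(key) > 4 else "..."


# Demo only: set a fake value so the sketch runs on its own.
os.environ["DEMO_API_KEY"] = "nvapi-not-a-real-key"
print(mask(load_key("DEMO_API_KEY")))
```

Masking before printing keeps a full key out of terminal scrollback and log files.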

第 9 节展示了 image_gen.py 怎么向 Google 的 Gemini 发出 HTTP 请求。两个特别值得了解、并且都提供慷慨免费额度的服务是:NVIDIA build.nvidia.com 和 HuggingFace。注册一下,每个大概一分钟。

10a. NVIDIA build.nvidia.com

  1. 打开 build.nvidia.com
  2. 点击右上角 Login
  3. 点任意一个模型卡片(比如 Llama 3.3 70B 或 FLUX.1)
  4. 点击 Get API Key,密钥以 nvapi-... 开头

import base64, requests

API_KEY = "nvapi-..."   # 从密钥文件读取,不要写死在代码里
r = requests.post(
    "https://ai.api.nvidia.com/v1/genai/black-forest-labs/flux.1-schnell",
    headers={"Authorization": f"Bearer {API_KEY}", "Accept": "application/json"},
    json={"prompt": "a Lanna temple at sunrise", "width": 1024, "height": 1024, "seed": 0},
    timeout=60,
)
r.raise_for_status()
img_b64 = r.json()["artifacts"][0]["base64"]
with open("temple.png", "wb") as f:      # 即使写入失败也会关闭文件
    f.write(base64.b64decode(img_b64))

注意结构:POST、JSON 请求体、base64 编码的图像响应、解码并写入文件。和 image_gen.py 做的事情完全一样,只是字段名不同。看懂一个 API,差不多就看懂了所有 API。

10b. HuggingFace

  1. 打开 huggingface.co/join
  2. 进入 Settings → Access Tokens,点 New token,选 Read
  3. token 以 hf_... 开头

pip install huggingface_hub
hf auth login    # 提示时粘贴你的 hf_... token

10c. 用这些密钥能做什么

你想要…… | 可以试试的模型 | 提供方
根据文本生成图像 | FLUX、Stable Diffusion 3.5、Nano Banana | NVIDIA · HF · Gemini
编辑、修补图像 | FLUX Fill、Qwen-Image-Edit | NVIDIA · HF
生成短视频片段 | Seedance 2.0、CogVideoX | Replicate · HF
文本转语音 | XTTS-v2、Coqui、Riva | HF · NVIDIA · ElevenLabs
语音转文本 | Whisper Large v3 | NVIDIA · HF · OpenAI
翻译 | NLLB、Madlad-400 | HF
聊天、总结、信息提取 | Llama 3.3、DeepSeek、Hermes 3 | NVIDIA · HF · 自己跑 Ollama(免费、本地)

10d. 妥善保管密钥 —— 没得商量

  1. 永远不要把密钥粘贴到提交进 git 的源代码里。机器人会扫描公共仓库里类似 AIza...、sk-...、nvapi-... 的模式,几秒钟之内就能找到泄露的密钥并用它跑出巨额账单。
  2. 使用最小权限的 token。
  3. 发现异常立刻轮换。

11. A small example — make a QR code · ตัวอย่างเล็กๆ — สร้าง QR code · 一个小例子 —— 用第三方包做点有用的事

To put the package-installation lesson into practice, here's a tiny standalone function that uses a third-party package to generate a QR code. Useful for handouts: print a QR code on a worksheet, students scan it to reach a website or video.

pip install "qrcode[pil]"

The [pil] tells pip to also pull in Pillow (the Python imaging library), which qrcode uses to draw the actual image.

def make_qr(text: str, out_path: str = "qr.png") -> str:
    """Encode some text or a URL as a QR code and save it as a PNG."""
    import qrcode

    img = qrcode.make(text)
    img.save(out_path)
    return out_path

Three lines of real work — the typical shape of a small Python function once you have the right package. You'd call it like:

make_qr("https://krueng.ai/lesson-5", "lesson5_qr.png")

把刚学的"安装第三方包"应用一下:下面这个独立函数用一个第三方包生成 QR 码。在把链接做成纸质材料时很有用 —— 把它印在练习纸上,学生用手机扫一下就能打开网页或视频。

pip install "qrcode[pil]"

方括号里的 [pil] 告诉 pip 同时把 Pillow(Python 的图像库)也装上 —— qrcode 用它来真正画出图像。

def make_qr(text: str, out_path: str = "qr.png") -> str:
    """Encode some text or a URL as a QR code and save it as a PNG."""
    import qrcode

    img = qrcode.make(text)
    img.save(out_path)
    return out_path

真正干活的就三行 —— 用对了包之后,小 Python 函数大致都长这样。调用起来是这样:

make_qr("https://krueng.ai/lesson-5", "lesson5_qr.png")

12. Putting it together — the agent loop · ทุกอย่างรวมกัน · 整合起来 —— 智能体循环

Once you understand the pieces, the main function in agent_pro.py reads in one breath:

async def main() -> None:
    agent = build_agent()
    print("Hermes Pro agent ready. Empty line to quit.\n")
    while True:
        try:
            q = input("you > ").strip()
        except (EOFError, KeyboardInterrupt):
            print()
            return
        if not q:
            return
        response = await agent.run(q)
        print(f"\nagent > {response}\n")

Build the agent, then in a loop: read a line from the user, send it to the agent, print the response. try/except catches Ctrl-C and Ctrl-D so they don't crash the program. if not q: return exits on an empty line.

The async and await keywords

The function is declared with async def instead of plain def, and it uses await on the line that calls the agent. The ReActAgent from LlamaIndex is built on Python's asynchronous machinery. When you call agent.run(q), it returns immediately — not with an answer, but with a handle to work-in-progress. await tells Python "pause this function until the work finishes, then give me the result." For that to be possible, the surrounding function has to be marked async. And to run an async function from a normal script, you need asyncio.run(...):

if __name__ == "__main__":
    asyncio.run(main())

You don't need to fully understand async to use it. The rule of thumb: if a library says "you must await this," wrap your caller in async def, use await, and start it all with asyncio.run(...).
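
That rule of thumb can be tried without installing an agent. In this sketch fake_agent_run is a made-up stand-in for agent.run(q); it is a plain coroutine, so it behaves slightly differently from the LlamaIndex handle described above, but the async def / await / asyncio.run pattern is the same.

```python
import asyncio


async def fake_agent_run(question: str) -> str:
    """Stand-in for agent.run(): waits briefly, then answers."""
    await asyncio.sleep(0.01)            # pretend to wait on a model or network
    return f"echo: {question}"


async def main() -> list[str]:
    pending = fake_agent_run("hello")    # a coroutine object, nothing has run yet
    first = await pending                # pause here until the answer is ready

    # Awaiting several calls together lets their waits overlap.
    rest = await asyncio.gather(fake_agent_run("a"), fake_agent_run("b"))
    return [first, *rest]


answers = asyncio.run(main())
print(answers)
```

asyncio.gather returns results in the order the calls were passed in, regardless of which one finishes first.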

各个部分搞清楚之后,agent_pro.py 里的 main 函数一口气就能读完:

async def main() -> None:
    agent = build_agent()
    print("Hermes Pro agent ready. Empty line to quit.\n")
    while True:
        try:
            q = input("you > ").strip()
        except (EOFError, KeyboardInterrupt):
            print()
            return
        if not q:
            return
        response = await agent.run(q)
        print(f"\nagent > {response}\n")

构建智能体,然后循环:从用户那里读一行,发给智能体,把回答打印出来。try/except 捕获 Ctrl-C 和 Ctrl-D,避免程序崩溃。if not q: return 在收到空行时退出。

async 和 await 关键字

这个函数用 async def 而不是普通的 def 定义,调用智能体的那一行使用了 await。LlamaIndex 的 ReActAgent 建立在 Python 的异步机制之上。调用 agent.run(q) 时它会立即返回 —— 但返回的不是答案,而是一个"正在进行中"的句柄。await 告诉 Python "暂停这个函数,直到工作完成,然后把结果给我"。要从普通脚本里运行一个 async 函数,需要 asyncio.run(...)

if __name__ == "__main__":
    asyncio.run(main())

你不需要完全理解异步也能用它。规则是:如果某个库说"必须 await",就把调用者包在 async def 里,使用 await,最后用 asyncio.run(...) 启动。

13. Where to write Python — editors and IDEs · โปรแกรมเขียนโค้ด · 在哪里写 Python —— 编辑器和 IDE

Notepad works. But an IDE (Integrated Development Environment) gives you syntax highlighting, autocomplete, inline error checking, a debugger, and a built-in terminal — all of which save real time. Four worth considering:

VS Code

Free, made by Microsoft, by far the most popular editor for Python today. Install the official "Python" extension. Cross-platform. The safe default — start here.

PyCharm Community Edition

Free, made by JetBrains. Heavier than VS Code but everything is configured out of the box. Particularly good for refactoring and navigating large codebases.

Cursor

Free for basic use, a fork of VS Code with built-in AI chat. If you're already learning with AI tools, having a chat window that can read the file you're editing is genuinely useful.

Thonny

Free, designed specifically for beginners. Ships with its own Python. The variable explorer (a panel showing what's in memory at each step) is a real teaching aid.

记事本能用,任何文本编辑器都能用。但集成开发环境(IDE)会给你语法高亮、自动补全、实时错误检查、调试器、内置终端 —— 都能省下大量时间。四个值得考虑的:

VS Code

免费、微软出品,目前最流行的 Python 编辑器。在扩展市场安装官方的 "Python" 扩展。跨平台。它是稳妥的默认选择 —— 没有特别偏好就从这个开始。

PyCharm 社区版

免费、JetBrains 出品。比 VS Code 重量级一些 —— 它是完整的 IDE 而不是带扩展的编辑器 —— 但一切开箱即用。特别适合重构和浏览大型代码库。

Cursor

基础功能免费,是 VS Code 的分支,内置 AI 聊天。如果你已经在用 AI 工具学编程,一个能读取当前文件的聊天窗口很有用。

Thonny

免费、专为初学者设计。自带 Python,不需要单独安装。变量浏览器对刚培养代码直觉的人是真正有用的教学工具。

14. What to read next · อ่านอะไรต่อ · 接下来读什么

If you want to go deeper, in this order:

  • python_basics.ipynb — the same material as this page, but as a runnable notebook. Change a cell, press Shift+Enter, watch it run.
  • python_next_steps.ipynb — modules, file I/O, try / except, and list comprehensions. Takes you from "I can read Python" to "I can write small useful scripts."
  • python_apis_async.ipynb — functions in depth (defaults, *args, lambdas, decorators), calling HTTP APIs with requests, and async / await.
  • The official Python tutorial — docs.python.org/3/tutorial. Short, dense, and authoritative.
  • The requests library quickstart — most APIs you'll ever use are reachable through the same four or five lines you saw above.
  • Real Python — practical articles aimed at intermediate learners.
  • The asyncio docs, once you've written enough sync code to feel the need for async.

But before any of that, the most useful thing you can do is read your own code out loud, the way we did with these files. The patterns repeat.

想深入下去,按顺序读:

  • python_basics.ipynb —— 本页内容的可运行 notebook 版本。改一个单元格,按 Shift+Enter,看它运行。
  • python_next_steps.ipynb —— 模块、文件读写、try / except、列表推导式。带你从"能看懂 Python"走到"能写点有用的小脚本"。
  • python_apis_async.ipynb —— 函数进阶(默认参数、*args、lambda、装饰器)、用 requests 调 HTTP API、async / await。
  • Python 官方教程 —— docs.python.org/3/tutorial。简短、扎实、权威。
  • requests 库的 quickstart —— 几乎所有你将来会用到的 API 都用得着上面那四五行代码。
  • Real Python —— 面向中级学习者的实用教程网站。
  • asyncio 的官方文档 —— 等你写够了同步代码、真正感到需要异步的时候再读。

但在那之前,最有用的练习是 —— 像我们这样把自己的代码逐句读一遍。模式都是重复的。

Appendix A — Full source: agent_pro.py · ภาคผนวก A · 附录 A —— agent_pro.py 完整源码

"""agent_pro.py — Hermes agent with web search, image gen, and LINE tools.

Builds on agent.py from the Mac Stack. Same Ollama + Hermes 3 + LlamaIndex
foundation. Three new tools let the agent reach the world:

    web_search       — DuckDuckGo, no API key
    generate_image   — Gemini image API via image_gen.py (needs an API key)
    send_line_push   — send a LINE message via the Messaging API

Before running:
    ollama pull hermes3:8b
    pip install -r requirements.txt

Then:
    python agent_pro.py
"""
import asyncio
import os
from datetime import date
from pathlib import Path

from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b


def days_until(iso_date: str) -> str:
    """How many days between today and a YYYY-MM-DD date."""
    target = date.fromisoformat(iso_date)
    delta = (target - date.today()).days
    if delta == 0:
        return "today"
    # abs() so a past date reads "3 day(s) ago" rather than "-3 day(s) ago".
    return f"{abs(delta)} day(s) {'from now' if delta > 0 else 'ago'}"


def web_search(query: str) -> str:
    """Search the web with DuckDuckGo and return the top 5 result snippets."""
    from duckduckgo_search import DDGS

    with DDGS() as ddg:
        results = list(ddg.text(query, max_results=5))
    if not results:
        return "No results found."
    return "\n\n".join(
        f"{r['title']}\n{r['href']}\n{r['body']}" for r in results
    )


def generate_image(prompt: str, filename: str = "out.png") -> str:
    """Generate an image from a text prompt and save it under ./images/."""
    from image_gen import generate as _gen_image

    Path("images").mkdir(exist_ok=True)
    out_path = Path("images") / filename
    _gen_image(prompt, str(out_path))
    return f"Saved image to {out_path}"


def send_line_push(user_id: str, message: str) -> str:
    """Send a LINE push message to a user, group, or room."""
    import requests

    token = os.getenv("LINE_CHANNEL_ACCESS_TOKEN")
    if not token:
        return "LINE_CHANNEL_ACCESS_TOKEN not set in environment."
    r = requests.post(
        "https://api.line.me/v2/bot/message/push",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        json={"to": user_id, "messages": [{"type": "text", "text": message}]},
        timeout=10,
    )
    return f"LINE push status: {r.status_code} {r.text[:200]}"


TOOLS = [
    FunctionTool.from_defaults(multiply),
    FunctionTool.from_defaults(days_until),
    FunctionTool.from_defaults(web_search),
    FunctionTool.from_defaults(generate_image),
    FunctionTool.from_defaults(send_line_push),
]


def build_agent() -> ReActAgent:
    llm = Ollama(model="hermes3:8b", request_timeout=300)
    return ReActAgent(tools=TOOLS, llm=llm, verbose=True)


async def main() -> None:
    agent = build_agent()
    print("Hermes Pro agent ready. Empty line to quit.\n")
    while True:
        try:
            q = input("you > ").strip()
        except (EOFError, KeyboardInterrupt):
            print()
            return
        if not q:
            return
        response = await agent.run(q)
        print(f"\nagent > {response}\n")


if __name__ == "__main__":
    asyncio.run(main())
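
A minimal sketch, separate from the listing above: a pure helper such as days_until can be tried on its own, without Ollama or any network access. The today parameter is a hypothetical addition (it is not in agent_pro.py) that lets the result be checked against a fixed date. This variant also passes the day count through abs() so that past dates read naturally.

```python
from datetime import date
from typing import Optional


def days_until(iso_date: str, today: Optional[date] = None) -> str:
    """Describe how far a YYYY-MM-DD date is from `today`.

    `today` is a hypothetical extra parameter, added here only so the
    result can be checked against a fixed date.
    """
    today = today or date.today()
    delta = (date.fromisoformat(iso_date) - today).days
    if delta == 0:
        return "today"
    direction = "from now" if delta > 0 else "ago"
    # abs() keeps past dates readable: "3 day(s) ago", not "-3 day(s) ago".
    return f"{abs(delta)} day(s) {direction}"


fixed = date(2025, 1, 10)  # arbitrary date chosen for the demonstration
print(days_until("2025-01-10", today=fixed))  # prints: today
print(days_until("2025-01-15", today=fixed))  # prints: 5 day(s) from now
print(days_until("2025-01-07", today=fixed))  # prints: 3 day(s) ago
```

Passing the reference date in, rather than calling date.today() inside the function every time, is one common way to make date logic repeatable.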

Appendix B — Full source: image_gen.py · ภาคผนวก B · 附录 B —— image_gen.py 完整源码

"""image_gen.py — Gemini Nano Banana 2 backend for agent_pro.py."""
import base64
from pathlib import Path

import requests

_KEY_FILE = Path(r"C:\Users\Admin\claude\env.txt.txt")
_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3.1-flash-image-preview:generateContent"
)


def _load_key() -> str:
    """Return the value of the `google-api: <key>` line in _KEY_FILE."""
    for line in _KEY_FILE.read_text(encoding="utf-8").splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "google-api":
            return value.strip()
    raise RuntimeError(f"google-api entry not found in {_KEY_FILE}")


def generate(prompt: str, out_path: str) -> str:
    """Request an image for `prompt`, write it to `out_path`, and return the path."""
    r = requests.post(
        f"{_ENDPOINT}?key={_load_key()}",
        json={
            "contents": [{"parts": [{"text": prompt}]}],
            "generationConfig": {"responseModalities": ["IMAGE"]},
        },
        timeout=120,
    )
    r.raise_for_status()
    parts = r.json()["candidates"][0]["content"]["parts"]
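    # The response can also contain text parts; take the first part with image data.
    img_b64 = next((p["inlineData"]["data"] for p in parts if "inlineData" in p), None)
    if img_b64 is None:
        raise RuntimeError("Response contained no image data")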
    Path(out_path).write_bytes(base64.b64decode(img_b64))
    return out_path
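
The two parsing steps in image_gen.py can be explored without an API key or a network call. The sketch below is an illustration under assumptions, not the module itself: find_key and extract_image_bytes are hypothetical names, and the sample key file and response are made-up data shaped like what the listing above reads.

```python
import base64


def find_key(text: str, wanted: str = "google-api") -> str:
    """Return the value on the first `name: value` line whose name matches."""
    for line in text.splitlines():
        # partition(":") splits at the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        if name.strip() == wanted:
            return value.strip()
    raise RuntimeError(f"{wanted} entry not found")


def extract_image_bytes(payload: dict) -> bytes:
    """Decode the first inlineData part of a response shaped like the one above."""
    parts = payload["candidates"][0]["content"]["parts"]
    img_b64 = next((p["inlineData"]["data"] for p in parts if "inlineData" in p), None)
    if img_b64 is None:
        raise RuntimeError("response contained no image data")
    return base64.b64decode(img_b64)


# Made-up sample data; no real key and no real image.
sample_key_file = "other-service: abc\ngoogle-api: not-a-real-key\n"
sample_response = {
    "candidates": [
        {"content": {"parts": [
            {"text": "here is your image"},
            {"inlineData": {"data": base64.b64encode(b"fake-png-bytes").decode()}},
        ]}}
    ]
}

print(find_key(sample_key_file))             # prints: not-a-real-key
print(extract_image_bytes(sample_response))  # prints: b'fake-png-bytes'
```

Feeding the functions in-memory strings and dicts like this is a quick way to see what partition, next, and base64.b64decode each contribute before involving a real request.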