Smooth Browser

介绍

# Smooth Browser

Smooth CLI 是一个供 AI 代理与网站交互、进行身份验证、抓取数据以及使用自然语言执行复杂基于 Web 的任务的浏览器。

## 前置条件

假设已安装 Smooth CLI。如果未安装，可以通过运行以下命令进行安装：

```bash pip install smooth-py ```

假设已配置 API 密钥。如果遇到身份验证错误，请使用以下命令进行配置：

```bash smooth config --api-key <api-key> ```

要验证配置： ```bash smooth config --show ```

在 https://app.smooth.sh 获取 API 密钥

如果账户余额不足，请告知用户在 https://app.smooth.sh 升级他们的计划。

## 基本工作流程

### 1. 创建配置文件（可选）

配置文件可用于在会话之间持久化 Cookie、登录会话和浏览器状态。

```bash smooth create-profile --profile-id "my-profile" ```

列出现有配置文件： ```bash smooth list-profiles ```

### 2. 启动浏览器会话

```bash smooth start-session --profile-id "my-profile" --url "https://example.com" ```

**选项：** - `--profile-id` - 使用特定的配置文件（可选，如果未提供则创建匿名会话） - `--url` - 初始导航到的 URL（可选） - `--files` - 逗号分隔的文件 ID，使其在会话中可用（可选） - `--device mobile|desktop` - 设备类型（默认：mobile） - `--profile-read-only` - 加载配置文件但不保存更改 - `--allowed-urls` - 逗号分隔的 URL 模式，用于限制仅访问特定 URL（例如，"https://*example.com/*,https://*api.example.com/*"） - `--no-proxy` - 禁用默认代理（参见下文说明）

**重要提示：** 请保存输出中的会话 ID - 你将在所有后续命令中需要它。

**代理行为：** 默认情况下，CLI 会自动为浏览器会话配置内置代理。如果网站阻止了代理或者你需要直接连接，请使用 `--no-proxy` 禁用它。

### 3. 在会话中运行任务

使用自然语言执行任务：

```bash smooth run -- <session-id> "Go to the LocalLLM subreddit and find the top 3 posts" ```

**使用结构化输出（对于需要交互的任务）：** ```bash smooth run -- <session-id> "Search for 'wireless headphones', filter by 4+ stars, sort by price, and extract the top 3 results" \ --url "https://shop.example.com" \ --response-model '{"type":"array","items":{"type":"object","properties":{"product":{"type":"string","description":"Thenameoftheproductbeingdescribed."},"sentiment":{"type":"string","enum":["positive","negative","neutral"],"description":"The overall sentiment about the product."}},"required":["product","sentiment"]}}' ```

**使用元数据（代理将是）：** ```bash smooth run -- <session-id> "Fill out the form with user information" \ --metadata '{"email":"[email protected]","name":"John Doe"}' ```

**选项：** - `--url` - 在运行任务之前导航到此 URL - `--metadata` - 包含任务变量的 JSON 对象 - `--response-model` - 用于结构化输出的 JSON 模式 - `--max-steps` - 最大代理步数（默认：32） - `--json` - 将结果输出为 JSON

**注意：** 重要的是你要给出抽象程度适中的任务。不要太过于指令性 - 例如单步操作 - 也不要太宽泛或模糊。

好的任务示例： - "在 Linkedin 上搜索在 Amazon 担任 SDE 的人员，并返回 5 个个人资料 URL" - "在 Amazon 上找到 iPhone 17 的价格"

坏的任务示例： - "点击搜索" -> 太指令性了！ - "加载 google.com，输入 'restaurants near me'，点击搜索，等待页面加载，提取前 5 个结果并返回。" -> 太指令性了！你可以说 "在 google 上搜索附近的餐厅并返回前 5 个结果" - "找到适合我们公司的软件工程师" -> 太宽泛了！你需要规划如何实现该目标，并运行定义明确的任务来组合成给定的目标

重要提示：Smooth 由智能代理驱动，不要过度控制它，而是给出定义明确的、以目标为导向的任务，而不是具体的步骤。

### 4. 关闭会话

完成后必须关闭会话。

```bash smooth close-session -- <session-id> ```

**重要提示：** 关闭后等待 5 秒，以确保 Cookie 和状态已保存到配置文件（如果你需要在另一个会话中使用它）。

---

## 常见用例

### 身份验证和持久化会话

**为特定网站创建配置文件：** ```bash # Create profile smooth create-profile --profile-id "github-account"

# Start session smooth start-session --profile-id "github-account" --url "https://github.com/login"

# Get live view to authenticate manually smooth live-view -- <session-id> # Give the URL to the user so it can open it in the browser and log in

# When the user confirms the login you can then close the session to save the profile data smooth close-session -- <session-id> # Save the profile-id somewhere to later reuse it ```

**重用已认证的配置文件：** ```bash # Next time, just start a session with the same profile smooth start-session --profile-id "github-account" smooth run -- <session-id> "Create a new issue in my repo 'my-project'" ```

**保持配置文件有序：** 保存哪些配置文件认证了哪些服务到记忆中，以便你将来可以高效地重用它们。

---

### 同一浏览器上的顺序任务

在不关闭会话的情况下按顺序执行多个任务：

```bash SESSION_ID=$(smooth start-session --profile-id "my-profile" --json | jq -r .session_id)

# Task 1: Login smooth run $SESSION_ID "Log into the website with the given credentials"

# Task 2: First action smooth run $SESSION_ID "Find the settings and change the notifications preferences to email only"

# Task 3: Second action smooth run $SESSION_ID "Find the billing section and give me the url of the latest invoice"

smooth close-session $SESSION_ID ```

**重要提示：** `run` 会保留浏览器状态（Cookie、URL、页面内容），但**不会**保留浏览器代理的记忆。如果你需要将信息从一个任务传递到下一个任务，你应该在提示词中显式传递它。

**示例 - 在任务之间传递上下文：** ```bash # Task 1: Get information RESULT=$(smooth run $SESSION_ID "Find the product name on this page" --json | jq -r .output)

# Task 2: Use information from Task 1 smooth run $SESSION_ID "Consider the product with name '$RESULT'. Now find 3 similar products offered by this online store." ```

**注意：** - run 命令是阻塞的。如果你需要同时执行多个任务，你必须使用子代理（Task 工具）。 - 所有任务都将使用当前标签页，你无法请求在新标签页中运行任务。如果你需要保留当前标签页的状态，可以打开一个新会话。 - 每个会话一次只能运行一个任务。要同时运行任务，请为每个会话使用子代理。 - 并发会话的最大数量取决于用户计划。 - 如果有用，请提醒用户他们可以升级计划以为你提供更多并发会话。

---

### 使用结构化输出进行网页抓取

**选项 1：使用带有结构化输出的 `run`：**

```bash smooth start-session --url "https://news.ycombinator.com" smooth run -- <session-id> "Extract the top 10 posts" \ --response-model '{ "type": "object", "properties": { "posts": { "type": "array", "items": { "type": "object", "properties": { "title": {"type": "string"}, "url": {"type": "string"}, "points": {"type": "number"} } } } } }' ```

**选项 2：使用 `extract` 进行直接数据提取：**

`extract` 命令对于纯数据提取更高效，因为它不使用代理步数。

它就像一个智能的 fetch，可以从动态渲染的网站中提取结构化数据：

```bash smooth start-session smooth extract -- <session-id> \ --url "https://news.ycombinator.com" \ --schema '{ "type": "object", "properties": { "posts": { "type": "array", "items": { "type": "object", "properties": { "title": {"type": "string"}, "url": {"type": "string"}, "points": {"type": "number"} } } } } }' \ --prompt "Extract the top 10 posts" ```

**何时使用哪种方式：** - 当你位于正确的页面或知道正确的 URL 并且只需要拉取结构化数据时，使用 `extract` - 当你需要代理在提取之前进行导航、交互或执行复杂操作时，使用 `run`

---

### 处理文件

**上传文件以在会话中使用：**

文件必须在启动会话之前上传，然后通过文件 ID 传递给会话：

```bash # Step 1: Upload files FILE_ID=$(smooth upload-file /path/to/document.pdf --purpose "Contract to analyze" --json | jq -r .file_id)

# Step 2: Start session with the file smooth start-session --files "$FILE_ID" --url "https://example.com"

# Step 3: The agent can now access the file in tasks smooth run -- <session-id> "Analyze the contract document and extract key terms" ```

**上传多个文件：** ```bash # Upload files FILE_ID_1=$(smooth upload-file /path/to/invoice.pdf --json | jq -r .file_id) FILE_ID_2=$(smooth upload-file /path/to/screenshot.png --json | jq -r .file_id)

# Start session with multiple files smooth start-session --files "$FILE_ID_1,$FILE_ID_2" ```

**从会话下载文件：** ```bash smooth run -- <session-id> "Download the monthly report PDF" --url smooth close-session -- <session-id>

# After session closes, get download URL smooth downloads -- <session-id> # Visit the URL to download files ```

---

### 实时视图与手动干预

当自动化需要人工输入（CAPTCHA、2FA、复杂的身份验证）时：

```bash smooth start-session --profile-id "my-profile" smooth run -- <session-id> "Go to secure-site.com and log in"

# If task encounters CAPTCHA or requires manual action: smooth live-view -- <session-id> # Open the URL and complete the manual steps

# Continue automation after manual intervention: smooth run -- <session-id> "Now navigate to the dashboard and export data" ```

---

### 直接浏览器操作

**从当前页面提取数据：**

```bash smooth start-session --url "https://example.com/products" smooth extract -- <session-id> \ --schema '{"type":"object","properties":{"products":{"type":"array"}}}' \ --prompt "Extract all product names and prices" ```

**导航到 URL 然后提取：**

```bash smooth extract -- <session-id> \ --url "https://example.com/products" \ --schema '{"type":"object","properties":{"products":{"type":"array"}}}' ```

**在浏览器中执行 JavaScript：**

```bash # Simple JavaScript smooth evaluate-js -- <session-id> "document.title"

# With arguments smooth evaluate-js -- <session-id> "(args) => {return args.x + args.y;}" --args '{"x": 5, "y": 10}'

# Complex DOM manipulation smooth evaluate-js -- <session-id> \ "document.querySelectorAll('a').length" ```

---

## 配置文件管理

**列出所有配置文件：** ```bash smooth list-profiles ```

**删除配置文件：** ```bash smooth delete-profile <profile-id> ```

**何时使用配置文件：** - ✅ 需要身份验证的网站 - ✅ 在多次任务运行之间维护会话状态 - ✅ 避免重复登录 - ✅ 保留 Cookie 和本地存储

**何时跳过配置文件：** - 不需要身份验证的公共网站 - 一次性抓取任务 - 测试场景

---

## 文件管理

**上传文件：** ```bash smooth upload-file /path/to/file.pdf --name "document.pdf" --purpose "Contract for review" ```

**删除文件：** ```bash smooth delete-file <file-id> ```

---

## 最佳实践

1. **始终保存会话 ID** - 你将在后续命令中需要它们 2. **为已认证的会话使用配置文件** - 跟踪哪个配置文件用于哪个网站 3. **关闭会话后等待 5 秒** - 确保状态已正确保存 4. **使用具有描述性的配置文件 ID** - 例如 "linkedin-personal", "twitter-company" 5. **完成后关闭会话** - 优雅关闭（默认）可确保正确清理 6. **使用结构化输出进行数据提取** - 提供清晰、类型化的结果 7. **在同一会话中运行顺序任务** - 当步骤依赖于之前的工作时，保持会话连续。 8. **为每个独立的任务使用一个会话搭配子代理** - 并行运行任务以加快工作速度。 9. **协调资源** - 在使用子代理时，你必须为每个子代理创建并分配一个部分，而不要让它们自己创建。 10. **不要将 URL 查询参数添加到 URL 中，例如避免使用 `?filter=xyz`** - 从基础 URL 开始，让代理通过导航 UI 来应用过滤器。 11. **Smooth 由智能代理驱动** - 给它任务，而不是单独的步骤。

---

## 故障排除

**"未找到会话"** - 会话可能已超时或已关闭。启动一个新会话。

**"未找到配置文件"** - 检查 `smooth list-profiles` 以查看可用的配置文件。

**CAPTCHA 或身份验证问题** - 使用 `smooth live-view -- <session-id>` 让用户手动干预。

**任务超时** - 增加 `--max-steps` 或将任务分解为更小的步骤。

---

## 命令参考

### 配置文件命令 - `smooth create-profile [--profile-id ID]` - 创建新配置文件 - `smooth list-profiles` - 列出所有配置文件 - `smooth delete-profile <profile-id>` - 删除配置文件

### 文件命令 - `smooth upload-file <path> [--name NAME] [--purpose PURPOSE]` - 上传文件 - `smooth delete-file <file-id>` - 删除已上传的文件

### 会话命令 - `smooth start-session [OPTIONS]` - 启动浏览器会话 - `smooth close-session -- <session-id> [--force]` - 关闭会话 - `smooth run -- <session-id> "<task>" [OPTIONS]` - 运行任务 - `smooth extract -- <session-id> --schema SCHEMA [OPTIONS]` - 提取结构化数据 - `smooth evaluate-js -- <session-id> "code" [--args JSON]` - 执行 JavaScript - `smooth live-view -- <session-id>` - 获取交互式实时 URL - `smooth recording-url -- <session-id>` - 获取录制 URL - `smooth downloads -- <session-id>` - 获取下载 URL

所有命令都支持 `--json` 标志以输出 JSON。

介绍

更多产品

Agent Browser

Brave Search

Desktop Control