# Stealth Browser Automation

Silent, undetectable web automation that combines multiple layers of anti-detection.

## Quick Login Workflow (Important)

When the user asks to log in to any website:

1. **Open in headed mode** (a visible browser, for manual login):
   ```bash
   python scripts/stealth_session.py -u "https://target.com/login" -s sitename --headed
   ```

2. **The user logs in manually** in the visible browser.

3. **Save the session** once the login is confirmed:
   ```bash
   python scripts/stealth_session.py -u "https://target.com" -s sitename --headed --save
   ```

4. **Future use**: load the saved session (headless mode):
   ```bash
   python scripts/stealth_session.py -u "https://target.com" -s sitename --load
   ```

Sessions are stored at: `~/.clawdbot/browser-sessions/<sitename>.json`
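As a minimal sketch of how a script might resolve that location, the helper below maps a session name to its JSON file and checks whether a saved session exists. The function names `session_path` and `has_saved_session` are hypothetical and are not part of `stealth_session.py`; only the directory layout comes from the text above.

```python
from pathlib import Path

# Directory taken from the text above; the helper names are hypothetical.
SESSIONS_DIR = Path.home() / ".clawdbot" / "browser-sessions"


def session_path(sitename: str) -> Path:
    """Return the JSON file where the session for `sitename` would be stored."""
    # Reject empty names and path separators so the file stays inside SESSIONS_DIR.
    if not sitename or any(ch in sitename for ch in "/\\"):
        raise ValueError(f"invalid session name: {sitename!r}")
    return SESSIONS_DIR / f"{sitename}.json"


def has_saved_session(sitename: str) -> bool:
    """True if a session file already exists for `sitename`."""
    return session_path(sitename).is_file()
```

A caller could use `has_saved_session("sitename")` to decide between the `--load` step and the headed login step.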

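## Execution Strategy (Important)

### 1. Silent first, visible second

- Try headless mode silently first
- If that fails or a CAPTCHA is required, switch to headed (visible) mode
- Avoid interrupting the user's work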

### 2. Checkpoint and resume

Manage the state of long-running tasks with `task_runner.py`:

```python
from task_runner import TaskRunner

task = TaskRunner('my_task')
task.set_total(100)
for i in items:
    if task.is_completed(i):
        continue  # skip items that are already done
    # process the item...
    task.mark_completed(i)
task.finish()
```
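The source of `task_runner.py` is not shown here, so the following is only a sketch of what a class with the same five methods could look like. The JSON state file, the `state_dir` parameter, and the decision to delete the checkpoint in `finish()` are all assumptions made for illustration.

```python
import json
from pathlib import Path


class TaskRunner:
    """Hypothetical stand-in for task_runner.TaskRunner (the real one is not shown)."""

    def __init__(self, name, state_dir="."):
        # Assumed layout: one JSON checkpoint file per task name.
        self.path = Path(state_dir) / f"{name}.state.json"
        self.total = 0
        self.completed = set()
        if self.path.exists():  # resume from a previous, interrupted run
            state = json.loads(self.path.read_text())
            self.total = state.get("total", 0)
            self.completed = set(state.get("completed", []))

    def set_total(self, total):
        self.total = total

    def is_completed(self, item):
        # Items are stored as strings so they survive the JSON round trip.
        return str(item) in self.completed

    def mark_completed(self, item):
        self.completed.add(str(item))
        self._save()

    def finish(self):
        # Assumption: a finished task no longer needs its checkpoint.
        self.path.unlink(missing_ok=True)

    def _save(self):
        state = {"total": self.total, "completed": sorted(self.completed)}
        self.path.write_text(json.dumps(state))
```

Writing the file on every `mark_completed` call keeps the sketch simple; a real implementation might batch writes instead.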

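### 3. Timeout handling

- Default per-page timeout: 30 seconds
- Long tasks save progress every 50 items
- Failed operations are retried automatically 3 times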
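These three rules can be sketched as a small driver loop. The names `with_retries`, `process_all`, `handle`, and `save_progress` are hypothetical, and the sketch assumes that "retried 3 times" means up to three further attempts after the first failure.

```python
import time

PAGE_TIMEOUT_S = 30    # default per-page timeout
CHECKPOINT_EVERY = 50  # save progress every 50 items
MAX_RETRIES = 3        # assumed: up to 3 retries after the first failure


def with_retries(func, *args, retries=MAX_RETRIES, delay_s=0.0):
    """Call func(*args); on an exception, retry up to `retries` more times."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return func(*args)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < retries:
                time.sleep(delay_s)
    raise last_error


def process_all(items, handle, save_progress):
    """Run handle(item, PAGE_TIMEOUT_S) for each item, checkpointing periodically."""
    done = 0
    for item in items:
        with_retries(handle, item, PAGE_TIMEOUT_S)
        done += 1
        if done % CHECKPOINT_EVERY == 0:
            save_progress(done)
    save_progress(done)  # final checkpoint after the last item
    return done
```

Here `handle` is expected to enforce the timeout itself, for example by passing it to the browser's page-load call.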

### 4. Logging attempts

All login attempts are recorded in: `~/.clawdbot/browser-sessions/attempts.json`
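The format of `attempts.json` is not documented here. The sketch below assumes it is a JSON array of records with `site`, `success`, and `time` fields; both the record shape and the `record_attempt` helper are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# File location taken from the text above.
ATTEMPTS_FILE = Path.home() / ".clawdbot" / "browser-sessions" / "attempts.json"


def record_attempt(site, success, path=ATTEMPTS_FILE):
    """Append one login attempt to the log and return the new record."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumed format: a JSON array of attempt records.
    attempts = json.loads(path.read_text()) if path.exists() else []
    attempts.append({
        "site": site,
        "success": bool(success),
        "time": datetime.now(timezone.utc).isoformat(),
    })
    path.write_text(json.dumps(attempts, indent=2))
    return attempts[-1]
```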

## Architecture

```
┌─────────────────────────────────────────────────────┐
│                   Stealth Browser                   │
├─────────────────────────────────────────────────────┤
│  Layer 1: Anti-Detection Engine                     │
│    - puppeteer-extra-plugin-stealth                 │
│    - Browser fingerprint spoofing                   │
│    - WebGL/Canvas/Audio fingerprint masking         │
├─────────────────────────────────────────────────────┤
│  Layer 2: Challenge Bypass                          │
│    - Cloudflare Turnstile/JS Challenge              │
│    - hCaptcha / reCAPTCHA integration               │
│    - 2Captcha / Anti-Captcha API                    │
├─────────────────────────────────────────────────────┤
│  Layer 3: Session Persistence                       │
│    - Cookie storage (JSON/SQLite)                   │
│    - localStorage sync                              │
│    - Multi-profile management                       │
├─────────────────────────────────────────────────────┤
│  Layer 4: Proxy & Identity                          │
│    - Rotating residential proxies                   │
│    - User-Agent rotation                            │
│    - Timezone/Locale spoofing                       │
└─────────────────────────────────────────────────────┘
```

## Installation

### Install core dependencies

```bash
npm install -g puppeteer-extra puppeteer-extra-plugin-stealth
npm install -g playwright
pip install undetected-chromedriver DrissionPage
```

### Optional: CAPTCHA solvers

Store API keys in `~/.clawdbot/secrets/captcha.json`:

```json
{
  "2captcha": "YOUR_2CAPTCHA_KEY",
  "anticaptcha": "YOUR_ANTICAPTCHA_KEY",
  "capsolver": "YOUR_CAPSOLVER_KEY"
}
```
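A script that consumes this file might read a key as sketched below. The `load_captcha_key` helper is hypothetical; treating values that still start with `YOUR_` as "not configured" is an assumption based on the placeholder values shown above.

```python
import json
from pathlib import Path

# File location taken from the text above.
CAPTCHA_SECRETS = Path.home() / ".clawdbot" / "secrets" / "captcha.json"


def load_captcha_key(provider, path=CAPTCHA_SECRETS):
    """Return the API key for `provider`, or None if it is missing or a placeholder."""
    path = Path(path)
    if not path.exists():
        return None
    key = json.loads(path.read_text()).get(provider)
    # Assumption: unedited placeholders such as "YOUR_2CAPTCHA_KEY" mean "not set".
    if not key or key.startswith("YOUR_"):
        return None
    return key
```

Returning `None` rather than raising lets callers fall back to headed mode when no solver is configured.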

### Optional: proxy configuration

Store it in `~/.clawdbot/secrets/proxies.json`:

```json
{
  "rotating": "http://user:[email protected]:port",
  "residential": ["socks5://ip1:port", "socks5://ip2:port"],
  "datacenter": "http://dc-proxy:port"
}
```

## Quick Start

### 1. Stealth session (Python, recommended)

```python
# scripts/stealth_session.py - use for maximum compatibility
import undetected_chromedriver as uc
from DrissionPage import ChromiumPage

# Option A: undetected-chromedriver (Selenium-based)
driver = uc.Chrome(headless=True, use_subprocess=True)
driver.get("https://nowsecure.nl")  # Test anti-detection

# Option B: DrissionPage (faster, native Python)
page = ChromiumPage()
page.get("https://cloudflare-protected-site.com")
```

### 2. Stealth session (Node.js)

```javascript
// scripts/stealth.mjs
import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';

puppeteer.use(StealthPlugin());

const browser = await puppeteer.launch({
  headless: 'new',
  args: [
    '--disable-blink-features=AutomationControlled',
    '--disable-dev-shm-usage',
    '--no-sandbox'
  ]
});

const page = await browser.newPage();
await page.goto('https://bot.sannysoft.com'); // Verify stealth
```

## Core Operations

### Open a stealth page

```bash
# Using agent-browser with stealth profile
agent-browser --profile ~/.stealth-profile open https://target.com

# Or via script
python scripts/stealth_open.py --url "https://target.com" --headless
```

### Bypass Cloudflare

```python
# Automatic CF bypass with DrissionPage
from DrissionPage import ChromiumPage

page = ChromiumPage()
page.get("https://cloudflare-site.com")
# DrissionPage waits for CF challenge automatically

# Manual wait if needed
page.wait.ele_displayed("main-content", timeout=30)
```

For stubborn Cloudflare sites, use FlareSolverr:

```bash
# Start FlareSolverr container
docker run -d --name flaresolverr -p 8191:8191 ghcr.io/flaresolverr/flaresolverr

# Request clearance
curl -X POST http://localhost:8191/v1 \
  -H "Content-Type: application/json" \
  -d '{"cmd":"request.get","url":"https://cf-protected.com","maxTimeout":60000}'
```

### Solve CAPTCHAs

```python
# scripts/solve_captcha.py
import time

import requests


def solve_recaptcha(site_key, page_url, api_key):
    """Solve reCAPTCHA v2/v3 via 2Captcha"""
    # Submit task (HTTPS, so the API key is not sent in clear text)
    resp = requests.post("https://2captcha.com/in.php", data={
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
        "json": 1
    }).json()
    task_id = resp["request"]

    # Poll for result: up to 60 tries, 3 seconds apart
    for _ in range(60):
        time.sleep(3)
        result = requests.get("https://2captcha.com/res.php", params={
            "key": api_key, "action": "get", "id": task_id, "json": 1
        }).json()
        if result["status"] == 1:
            return result["request"]  # Token
    return None


def solve_hcaptcha(site_key, page_url, api_key):
    """Solve hCaptcha via Anti-Captcha"""
    resp = requests.post("https://api.anti-captcha.com/createTask", json={
        "clientKey": api_key,
        "task": {
            "type": "HCaptchaTaskProxyless",
            "websiteURL": page_url,
            "websiteKey": site_key
        }
    }).json()
    task_id = resp["taskId"]

    # Poll for result: up to 60 tries, 3 seconds apart
    for _ in range(60):
        time.sleep(3)
        result = requests.post("https://api.anti-captcha.com/getTaskResult", json={
            "clientKey": api_key,
            "taskId": task_id
        }).json()
        if result["status"] == "ready":
            return result["solution"]["gRecaptchaResponse"]
    return None
```

### Persist sessions

```python
# scripts/session_manager.py
import json
from pathlib import Path

SESSIONS_DIR = Path.home() / ".clawdbot" / "browser-sessions"
SESSIONS_DIR.mkdir(parents=True, exist_ok=True)


def save_cookies(driver, session_name):
    """Save cookies to JSON (driver is a Selenium-style WebDriver)"""
    cookies = driver.get_cookies()
    path = SESSIONS_DIR / f"{session_name}_cookies.json"
    path.write_text(json.dumps(cookies, indent=2))
    return path


def load_cookies(driver, session_name):
    """Load cookies from saved session"""
    path = SESSIONS_DIR / f"{session_name}_cookies.json"
    if path.exists():
        cookies = json.loads(path.read_text())
        for cookie in cookies:
            driver.add_cookie(cookie)
        return True
    return False


def save_local_storage(page, session_name):
    """Save localStorage (page must expose a Playwright-style evaluate())"""
    ls = page.evaluate("() => JSON.stringify(localStorage)")
    path = SESSIONS_DIR / f"{session_name}_localStorage.json"
    path.write_text(ls)
    return path


def load_local_storage(page, session_name):
    """Restore localStorage"""
    path = SESSIONS_DIR / f"{session_name}_localStorage.json"
    if path.exists():
        data = path.read_text()
        # Pass the JSON string as an argument rather than interpolating it into the script
        page.evaluate(
            "(data) => { Object.entries(JSON.parse(data)).forEach(([k,v]) => localStorage.setItem(k,v)) }",
            data,
        )
        return True
    return False
```

### Silent automation workflow

```python
# Complete silent automation example
from DrissionPage import ChromiumPage, ChromiumOptions

# Configure for stealth
options = ChromiumOptions()
options.headless()
options.set_argument('--disable-blink-features=AutomationControlled')
options.set_argument('--disable-dev-shm-usage')
options.set_user_agent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36')

page = ChromiumPage(options)

# Navigate with CF bypass
page.get("https://target-site.com")

# Wait for any challenges
page.wait.doc_loaded()

# Interact silently
page.ele("@id=username").input("[email protected]")
page.ele("@id=password").input("password123")
page.ele("@type=submit").click()

# Save session for reuse
page.cookies.save("~/.clawdbot/browser-sessions/target-site.json")
```

## Proxy Rotation

```python
# scripts/proxy_rotate.py
import json
import random
from pathlib import Path

from DrissionPage import ChromiumPage, ChromiumOptions


def get_proxy():
    """Get random proxy from pool"""
    config = json.loads((Path.home() / ".clawdbot/secrets/proxies.json").read_text())
    proxies = config.get("residential", [])
    # Fall back to the single rotating endpoint when the pool is empty
    return random.choice(proxies) if proxies else config.get("rotating")


# Use with DrissionPage
options = ChromiumOptions()
options.set_proxy(get_proxy())
page = ChromiumPage(options)
```

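## User Input Required

To complete this skill, please provide:

1. **CAPTCHA API keys** (optional but recommended):
   - 2Captcha key: https://2captcha.com
   - Anti-Captcha key: https://anti-captcha.com
   - CapSolver key: https://capsolver.com

2. **Proxy configuration** (optional):
   - Residential proxy provider credentials
   - Or a list of SOCKS5/HTTP proxies

3. **Target sites** (for pre-configured sessions):
   - Which sites need persistent logins?
   - Which credentials should be stored?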

## File Structure

```
stealth-browser/
├── SKILL.md
├── scripts/
│   ├── stealth_session.py    # Main stealth browser wrapper
│   ├── solve_captcha.py      # CAPTCHA solving utilities
│   ├── session_manager.py    # Cookie/localStorage persistence
│   ├── proxy_rotate.py       # Proxy rotation
│   └── cf_bypass.py          # Cloudflare-specific bypass
└── references/
    ├── fingerprints.md       # Browser fingerprint details
    └── detection-tests.md    # Sites to test anti-detection
```

## Testing Anti-Detection

```bash
# Run these to verify stealth is working:
python scripts/stealth_open.py --url "https://bot.sannysoft.com"
python scripts/stealth_open.py --url "https://nowsecure.nl"
python scripts/stealth_open.py --url "https://arh.antoinevastel.com/bots/areyouheadless"
python scripts/stealth_open.py --url "https://pixelscan.net"
```

## Integration with agent-browser

For simple tasks, use agent-browser with a persistent profile:

```bash
# Create stealth profile once
agent-browser --profile ~/.stealth-profile --headed open https://login-site.com
# Login manually, then close

# Reuse authenticated session (headless)
agent-browser --profile ~/.stealth-profile snapshot
agent-browser --profile ~/.stealth-profile click @e5
```

For sites with Cloudflare or heavy CAPTCHA use, switch to the Python scripts instead.

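## Best Practices

1. **Always use `headless: 'new'`** rather than `headless: true` (harder to detect)
2. **Rotate User-Agents** so that they match the browser version
3. **Add random delays between actions** (100-500ms)
4. **Use residential proxies for sensitive targets**
5. **Save the session after a successful login**
6. **Test on bot.sannysoft.com before production use**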
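The random delay in item 3 can be sketched with the standard library alone. The `human_pause` name and the injectable `sleep` parameter are hypothetical; only the 100-500 ms range comes from the list above.

```python
import random
import time


def human_pause(min_ms=100, max_ms=500, sleep=time.sleep):
    """Sleep for a random duration between min_ms and max_ms; return it in seconds."""
    if min_ms < 0 or min_ms > max_ms:
        raise ValueError("expected 0 <= min_ms <= max_ms")
    seconds = random.uniform(min_ms, max_ms) / 1000.0
    sleep(seconds)  # injectable so tests can skip the real wait
    return seconds
```

A script would call `human_pause()` between consecutive clicks or inputs.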
