
DeepRead OCR


# DeepRead - Production OCR API

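DeepRead is an AI-native OCR platform that turns documents into high-accuracy data in minutes. Using multi-model consensus, DeepRead achieves 97%+ accuracy and flags only uncertain fields for human-in-the-loop (HIL) review, cutting manual review from 100% of fields to 5-10%. Zero prompt engineering required.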

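## What This Skill Does

DeepRead is a production-grade document processing API that returns high-accuracy structured data in minutes. It flags uncertain output for human review, so reviewers only need to look at the flagged exceptions.

**Core features:**

- **Text extraction**: Convert PDFs and images to clean Markdown
- **Structured data**: Extract JSON fields with confidence scores
- **HIL interface**: Built-in human-in-the-loop review; uncertain fields are flagged (`hil_flag`) so only exceptions need human review
- **Multi-pass processing**: Multiple validation passes for maximum accuracy
- **Multi-model consensus**: Cross-validation between models for reliability
- **Free tier**: 2,000 pages per month (no credit card required)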

## Setup

### 1. Get Your API Key

Sign up and create an API key:

```bash
# Visit the dashboard
https://www.deepread.tech/dashboard

# Or use this direct link
https://www.deepread.tech/dashboard/?utm_source=clawdhub
```

Save your API key:

```bash
export DEEPREAD_API_KEY="sk_live_your_key_here"
```

### 2. Clawdbot Configuration (Optional)

Add to your `clawdbot.config.json5`:

```json5
{
  skills: {
    entries: {
      "deepread": {
        enabled: true
        // API key is read from DEEPREAD_API_KEY environment variable
        // Do NOT hardcode your API key here
      }
    }
  }
}
```

### 3. Process Your First Document

**Option A: With a webhook (recommended)**

```bash
# Upload PDF with webhook notification
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "webhook_url=https://your-app.com/webhooks/deepread"

# Returns immediately:
# {
#   "id": "550e8400-e29b-41d4-a716-446655440000",
#   "status": "queued"
# }

# Your webhook receives results when processing completes (2-5 minutes)
```

**Option B: Poll for results**

```bash
# Upload PDF without webhook
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]"

# Returns immediately:
# {
#   "id": "550e8400-e29b-41d4-a716-446655440000",
#   "status": "queued"
# }

# Poll until completed
curl https://api.deepread.tech/v1/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "X-API-Key: $DEEPREAD_API_KEY"
```

## Usage Examples

### Basic OCR (Text Only)

Extract text as clean Markdown:

```bash
# With webhook (recommended)
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "webhook_url=https://your-app.com/webhook"

# OR poll for completion
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]"

# Then poll
curl https://api.deepread.tech/v1/jobs/JOB_ID \
  -H "X-API-Key: $DEEPREAD_API_KEY"
```

**Response when complete:**

```json
{
  "id": "550e8400-...",
  "status": "completed",
  "result": {
    "text": "# INVOICE\n\n**Vendor:** Acme Corp\n**Total:** $1,250.00..."
  }
}
```

### Structured Data Extraction

Extract specific fields with confidence scoring:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F 'schema={
    "type": "object",
    "properties": {
      "vendor": {
        "type": "string",
        "description": "Vendor company name"
      },
      "total": {
        "type": "number",
        "description": "Total invoice amount"
      },
      "invoice_date": {
        "type": "string",
        "description": "Invoice date in MM/DD/YYYY format"
      }
    }
  }'
```

**The response includes confidence flags:**

```json
{
  "status": "completed",
  "result": {
    "text": "# INVOICE\n\n**Vendor:** Acme Corp...",
    "data": {
      "vendor": {
        "value": "Acme Corp",
        "hil_flag": false,
        "found_on_page": 1
      },
      "total": {
        "value": 1250.00,
        "hil_flag": false,
        "found_on_page": 1
      },
      "invoice_date": {
        "value": "2024-10-??",
        "hil_flag": true,
        "reason": "Date partially obscured",
        "found_on_page": 1
      }
    },
    "metadata": {
      "fields_requiring_review": 1,
      "total_fields": 3,
      "review_percentage": 33.3
    }
  }
}
```
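The `metadata` block in the response above can be cross-checked on the client side. The sketch below recomputes it from `result.data`; it assumes `review_percentage` is simply flagged fields divided by total fields, rounded to one decimal place, which matches the sample but is not confirmed by the API documentation. The function name `summarize_review` is hypothetical.

```python
from typing import Any


def summarize_review(data: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Recompute review counts from a `result.data` mapping.

    Assumption: the percentage is flagged / total * 100, rounded to 1 decimal.
    """
    total = len(data)
    flagged = sum(1 for field in data.values() if field.get("hil_flag"))
    percentage = round(100 * flagged / total, 1) if total else 0.0
    return {
        "fields_requiring_review": flagged,
        "total_fields": total,
        "review_percentage": percentage,
    }


# Sample copied from the response shown above.
sample_data = {
    "vendor": {"value": "Acme Corp", "hil_flag": False, "found_on_page": 1},
    "total": {"value": 1250.00, "hil_flag": False, "found_on_page": 1},
    "invoice_date": {
        "value": "2024-10-??",
        "hil_flag": True,
        "reason": "Date partially obscured",
        "found_on_page": 1,
    },
}

summary = summarize_review(sample_data)
print(summary)
```

A mismatch between this summary and the returned `metadata` would be a cheap signal that a response was truncated or altered before it reached your code.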

### Complex Schemas (Nested Data)

Extract arrays and nested objects:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F 'schema={
    "type": "object",
    "properties": {
      "vendor": {"type": "string"},
      "total": {"type": "number"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "number"},
            "price": {"type": "number"}
          }
        }
      }
    }
  }'
```
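Hand-writing nested JSON inside a shell string is error-prone, so it can help to build the schema in code and serialize it. The following sketch reproduces the nested schema above with two small helpers; `field` and `object_schema` are hypothetical names, and the only thing taken from the source is that the `schema` form field carries a JSON string.

```python
import json
from typing import Optional


def field(type_name: str, description: Optional[str] = None) -> dict:
    """Build one JSON Schema property, with an optional description."""
    spec = {"type": type_name}
    if description:
        spec["description"] = description
    return spec


def object_schema(properties: dict) -> dict:
    """Wrap a property mapping in an object schema."""
    return {"type": "object", "properties": properties}


line_item = object_schema({
    "description": field("string"),
    "quantity": field("number"),
    "price": field("number"),
})

invoice_schema = object_schema({
    "vendor": field("string"),
    "total": field("number"),
    "line_items": {"type": "array", "items": line_item},
})

# This string is what would be sent as the `schema` form field.
schema_form_value = json.dumps(invoice_schema)
print(schema_form_value)
```

Serializing with `json.dumps` guarantees the value is valid JSON, which removes one common cause of schema validation errors.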

### Page-by-Page Breakdown

Get per-page OCR results with quality flags:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "include_pages=true"
```

**Response:**

```json
{
  "result": {
    "text": "Combined text from all pages...",
    "pages": [
      {
        "page_number": 1,
        "text": "# Contract Agreement\n\n...",
        "hil_flag": false
      },
      {
        "page_number": 2,
        "text": "Terms and C??diti??s...",
        "hil_flag": true,
        "reason": "Multiple unrecognized characters"
      }
    ],
    "metadata": {
      "pages_requiring_review": 1,
      "total_pages": 2
    }
  }
}
```
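To act on the per-page flags, a client only needs to filter the `pages` array. This is a minimal sketch that assumes the response shape shown above; `pages_needing_review` is a hypothetical helper name.

```python
def pages_needing_review(result: dict) -> list[dict]:
    """Return page number and reason for every page with hil_flag set."""
    return [
        {"page_number": page["page_number"], "reason": page.get("reason")}
        for page in result.get("pages", [])
        if page.get("hil_flag")
    ]


# Sample copied from the response shown above.
sample_result = {
    "text": "Combined text from all pages...",
    "pages": [
        {"page_number": 1, "text": "# Contract Agreement\n\n...", "hil_flag": False},
        {
            "page_number": 2,
            "text": "Terms and C??diti??s...",
            "hil_flag": True,
            "reason": "Multiple unrecognized characters",
        },
    ],
    "metadata": {"pages_requiring_review": 1, "total_pages": 2},
}

flagged_pages = pages_needing_review(sample_result)
print(flagged_pages)
```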

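## When to Use This Skill

### ✅ Use it for:

- **Invoice processing**: Extract vendors, totals, and line items
- **Receipt OCR**: Parse merchants, items, and totals
- **Contract analysis**: Extract parties, dates, and terms
- **Form digitization**: Convert paper forms to structured data
- **Document workflows**: Any process that needs OCR plus data extraction
- **Quality-critical applications**: When you need to know which extractions are uncertain

### ❌ Don't use it for:

- **Real-time processing**: Processing takes 2-5 minutes (asynchronous workflow)
- **Volumes above 2,000 pages/month on the free tier**: Upgrade to the PRO or SCALE tier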

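## How It Works

### Multi-Pass Pipeline

```
PDF → Convert → Rotate Correction → OCR → Multi-Model Validation → Extract → Done
```

The pipeline automatically handles:

- Document rotation and orientation correction
- Multiple validation passes for accuracy
- Cross-model consensus for reliability
- Field-level confidence scoring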

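### Human-in-the-Loop (HIL) Interface

DeepRead includes a built-in human-in-the-loop (HIL) review system. The AI compares the extracted text against the original image and sets an `hil_flag` on every field:

- **`hil_flag: false`** = clear, confident extraction → processed automatically
- **`hil_flag: true`** = uncertain extraction → routed to human review

**How HIL works:**

1. Fields extracted with high confidence are approved automatically
2. Uncertain fields are flagged with `hil_flag: true` and a `reason`
3. Only flagged fields need human review (typically 5-10% of all fields)
4. Review flagged fields in **DeepRead Preview** (`preview.deepread.tech`), a dedicated HIL review interface where reviewers can view the original document and the extracted data side by side, correct flagged fields, and approve the result
5. Alternatively, use the `hil_flag` data in the API response to feed your own review queue

**The AI flags an extraction when:**

- The text is handwritten, blurry, or low quality
- Multiple interpretations are possible
- Characters are partially visible or unclear
- The field was not found in the document

**This is a multimodal AI judgment, not a rule-based check.**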

## Advanced Features

### 1. Blueprints (Optimized Schemas)

Create reusable, optimized schemas for specific document types:

```bash
# List your blueprints
curl https://api.deepread.tech/v1/blueprints \
  -H "X-API-Key: $DEEPREAD_API_KEY"

# Use blueprint instead of inline schema
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "blueprint_id=660e8400-e29b-41d4-a716-446655440001"
```

**Benefits:**

- 20-30% higher accuracy than a basic schema
- Reusable across similar documents
- Versioned, with rollback support

**How to create a blueprint:**

```bash
# Create a blueprint from training data
curl -X POST https://api.deepread.tech/v1/optimize \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "utility_invoice",
    "description": "Optimized for utility invoices",
    "document_type": "invoice",
    "initial_schema": {
      "type": "object",
      "properties": {
        "vendor": {"type": "string", "description": "Vendor name"},
        "total": {"type": "number", "description": "Total amount"}
      }
    },
    "training_documents": ["doc1.pdf", "doc2.pdf", "doc3.pdf"],
    "ground_truth_data": [
      {"vendor": "Acme Power", "total": 125.50},
      {"vendor": "City Electric", "total": 89.25}
    ],
    "target_accuracy": 95.0,
    "max_iterations": 5
  }'

# Returns: {"job_id": "...", "blueprint_id": "...", "status": "pending"}

# Check optimization status
curl https://api.deepread.tech/v1/blueprints/jobs/JOB_ID \
  -H "X-API-Key: $DEEPREAD_API_KEY"

# Use blueprint (once completed)
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "blueprint_id=BLUEPRINT_ID"
```

### 2. Webhooks (Recommended for Production)

Get notified when processing completes instead of polling:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "webhook_url=https://your-app.com/webhooks/deepread"
```

**When processing completes, your webhook receives this payload:**

```json
{
  "job_id": "550e8400-...",
  "status": "completed",
  "created_at": "2025-01-27T10:00:00Z",
  "completed_at": "2025-01-27T10:02:30Z",
  "result": {
    "text": "...",
    "data": {...}
  },
  "preview_url": "https://preview.deepread.tech/abc1234"
}
```

**Benefits:**

- No polling required
- Instant notification on completion
- Lower latency
- Better suited to production workflows
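On the receiving side, a webhook endpoint has to parse the JSON body and decide what to do with the job. The sketch below uses only the Python standard library and keeps the decision logic in a pure function so it can be tested without a server. Several points are assumptions rather than documented behavior: that failed jobs are also delivered with `"status": "failed"`, that a plain `200` response is an acceptable acknowledgement, and the names `handle_payload` and `DeepReadWebhookHandler`, which are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler


def handle_payload(raw_body: bytes) -> dict:
    """Parse a webhook body and pick an action for the job.

    Raises ValueError if the body is not a JSON object.
    """
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        raise ValueError("webhook body must be a JSON object")

    status = payload.get("status")
    if status == "completed":
        action = "store_result"
    elif status == "failed":  # assumption: failures are delivered too
        action = "log_failure"
    else:
        action = "ignore"
    return {"job_id": payload.get("job_id"), "action": action}


class DeepReadWebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver: acknowledge valid payloads, reject malformed ones."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            decision = handle_payload(self.rfile.read(length))
        except ValueError:  # includes json.JSONDecodeError
            self.send_response(400)
            self.end_headers()
            return
        print(decision)  # replace with your own storage or queue call
        self.send_response(200)
        self.end_headers()


# To try it locally:
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), DeepReadWebhookHandler).serve_forever()
```

Keeping the handler thin and returning quickly is a common webhook practice; heavier work such as database writes is usually pushed to a background queue.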

### 3. Preview (HIL Review Interface)

DeepRead Preview (`preview.deepread.tech`) is the built-in human-in-the-loop review interface. Reviewers can view the original document and the extracted data side by side, correct flagged fields, and approve the result. Preview URLs can also be shared without authentication:

```bash
# Request preview URL
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "include_images=true"

# The response includes the preview URL:
# {
#   "result": {
#     "text": "...",
#     "data": {...}
#   },
#   "preview_url": "https://preview.deepread.tech/Xy9aB12"
# }
```

**Public preview endpoint:**

```bash
# No authentication required
curl https://api.deepread.tech/v1/preview/Xy9aB12
```

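## Rate Limits and Pricing

### Free Tier (No Credit Card Required)

- **2,000 pages/month**
- **10 requests/minute**
- Full feature access (OCR + structured extraction + blueprints)

### Paid Plans

- **PRO**: 50,000 pages/month, 100 requests/minute, $99/month
- **SCALE**: Custom volume pricing (contact sales)

**Upgrade:** https://www.deepread.tech/dashboard/billing?utm_source=clawdhub

### Rate Limit Headers

Every response includes quota information:

```
X-RateLimit-Limit: 2000
X-RateLimit-Remaining: 1847
X-RateLimit-Used: 153
X-RateLimit-Reset: 1730419200
```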
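These headers can be turned into a small quota record for logging or alerting. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds (UTC); the source does not state the unit, although the sample value does fall on a month boundary. `parse_quota` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_quota(headers: dict[str, str]) -> dict:
    """Convert the X-RateLimit-* headers into typed values.

    Assumption: X-RateLimit-Reset is Unix epoch seconds in UTC.
    """
    reset_at = datetime.fromtimestamp(
        int(headers["X-RateLimit-Reset"]), tz=timezone.utc
    )
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "used": int(headers["X-RateLimit-Used"]),
        "resets_at": reset_at.isoformat(),
    }


# Sample values copied from the headers shown above.
quota = parse_quota({
    "X-RateLimit-Limit": "2000",
    "X-RateLimit-Remaining": "1847",
    "X-RateLimit-Used": "153",
    "X-RateLimit-Reset": "1730419200",
})
print(quota)
```

Under that assumption the sample resets at midnight UTC on 1 November 2024, which is consistent with a monthly page quota rather than the per-minute request limit.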

## Best Practices

### 1. Use Webhooks in Production

**✅ Recommended: webhook notifications**

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "webhook_url=https://your-app.com/webhook"
```

**Use polling only when:**

- You are testing or developing
- You cannot expose a webhook endpoint
- You need a synchronous response

### 2. Schema Design

**✅ Good: descriptive field descriptions**

```json
{
  "vendor": {
    "type": "string",
    "description": "Vendor company name. Usually in header or top-left of invoice."
  }
}
```

**❌ Bad: no description**

```json
{
  "vendor": {"type": "string"}
}
```
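The guideline above is easy to enforce before a schema is ever submitted. This sketch walks a schema recursively and lists every property that lacks a `description`, including properties nested in objects and array items. It is a local lint with a hypothetical name (`missing_descriptions`), not something the API performs.

```python
def missing_descriptions(schema: dict, path: str = "") -> list[str]:
    """List dotted paths of properties that have no description."""
    missing = []
    for name, spec in schema.get("properties", {}).items():
        field_path = f"{path}.{name}" if path else name
        if not spec.get("description"):
            missing.append(field_path)
        # Recurse into nested objects and into the item schema of arrays.
        if spec.get("type") == "object":
            missing.extend(missing_descriptions(spec, field_path))
        elif spec.get("type") == "array" and isinstance(spec.get("items"), dict):
            missing.extend(missing_descriptions(spec["items"], f"{field_path}[]"))
    return missing


example_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string", "description": "Vendor company name"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"price": {"type": "number"}},
            },
        },
    },
}

undocumented = missing_descriptions(example_schema)
print(undocumented)
```

Running a check like this in CI keeps schema templates from drifting back to bare `{"type": "string"}` entries.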

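### 3. Polling Strategy (If Needed)

Only if you cannot use webhooks, poll every 5-10 seconds:

```python
import time
import requests

def wait_for_result(job_id, api_key):
    """Poll the job endpoint until the job completes or fails."""
    while True:
        response = requests.get(
            f"https://api.deepread.tech/v1/jobs/{job_id}",
            headers={"X-API-Key": api_key},
            timeout=30,  # avoid hanging forever on a stalled connection
        )
        response.raise_for_status()  # surface HTTP errors such as 401 or 429
        result = response.json()

        if result["status"] == "completed":
            return result["result"]
        elif result["status"] == "failed":
            raise Exception(f"Job failed: {result.get('error')}")

        # Still queued or processing: wait before the next poll
        time.sleep(5)
```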
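The loop above never gives up, so a stuck job would block the caller indefinitely. A bounded variant is sketched below: it takes the status lookup as a callable, which keeps the logic independent of any HTTP library and makes it testable with canned responses. The 10-minute default timeout is an assumption chosen to sit comfortably above the stated 2-5 minute processing time, and `poll_job` is a hypothetical name.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until the job completes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job["status"] == "completed":
            return job["result"]
        if job["status"] == "failed":
            raise RuntimeError(f"Job failed: {job.get('error')}")
        # Stop if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError("Job did not finish before the timeout")
        sleep(interval)


# Demonstration with canned responses instead of real HTTP calls.
canned = iter([
    {"status": "queued"},
    {"status": "queued"},
    {"status": "completed", "result": {"text": "ok"}},
])
result = poll_job(lambda: next(canned), sleep=lambda seconds: None)
print(result)
```

In real use, `fetch_status` would wrap the `GET /v1/jobs/{job_id}` request from the example above and return its parsed JSON.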

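### 4. Handle Quality Flags

Separate confident fields from uncertain ones:

```python
def process_extraction(data):
    """Split extracted fields into auto-approved values and a review list."""
    confident = {}
    needs_review = []

    for field, field_data in data.items():
        if field_data["hil_flag"]:
            needs_review.append({
                "field": field,
                "value": field_data["value"],
                "reason": field_data.get("reason")
            })
        else:
            confident[field] = field_data["value"]

    # Auto-process confident fields
    # (save_to_database is a placeholder for your own persistence code)
    save_to_database(confident)

    # Send uncertain fields to review queue
    # (send_to_review_queue is a placeholder for your own queue integration)
    if needs_review:
        send_to_review_queue(needs_review)
```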

## Troubleshooting

### Error: `quota_exceeded`

```json
{"detail": "Monthly page quota exceeded"}
```

**Solution:** Upgrade to PRO or wait for the next billing cycle.

### Error: `invalid_schema`

```json
{"detail": "Schema must be valid JSON Schema"}
```

**Solution:** Make sure the schema is valid JSON and includes `type` and `properties`.

### Error: `file_too_large`

```json
{"detail": "File size exceeds 50MB limit"}
```

**Solution:** Compress the PDF or split it into smaller files.

### Job status: `failed`

```json
{"status": "failed", "error": "PDF could not be processed"}
```

**Common causes:**

- The PDF file is corrupted
- The PDF is password-protected
- The PDF version is not supported
- The image quality is too low for OCR

## Example Schema Templates

### Invoice Schema

```json
{
  "type": "object",
  "properties": {
    "invoice_number": {
      "type": "string",
      "description": "Unique invoice ID"
    },
    "invoice_date": {
      "type": "string",
      "description": "Invoice date in MM/DD/YYYY format"
    },
    "vendor": {
      "type": "string",
      "description": "Vendor company name"
    },
    "total": {
      "type": "number",
      "description": "Total amount due including tax"
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "price": {"type": "number"}
        }
      }
    }
  }
}
```

### Receipt Schema

```json
{
  "type": "object",
  "properties": {
    "merchant": {
      "type": "string",
      "description": "Store or merchant name"
    },
    "date": {
      "type": "string",
      "description": "Transaction date"
    },
    "total": {
      "type": "number",
      "description": "Total amount paid"
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"}
        }
      }
    }
  }
}
```

### Contract Schema

```json
{
  "type": "object",
  "properties": {
    "parties": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Names of all parties in the contract"
    },
    "effective_date": {
      "type": "string",
      "description": "Contract start date"
    },
    "term_length": {
      "type": "string",
      "description": "Duration of contract"
    },
    "termination_clause": {
      "type": "string",
      "description": "Conditions for termination"
    }
  }
}
```

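## Support and Resources

- **GitHub**: https://github.com/deepread-tech
- **Issues**: https://github.com/deepread-tech/deep-read-service/issues
- **Email**: [email protected]

### Important Notes

- **Processing time**: 2-5 minutes (asynchronous, not real-time)
- **Asynchronous workflow**: Use webhooks (recommended) or polling
- **Rate limit**: 10 requests/minute on the free tier
- **File size limit**: 50MB per file
- **Supported formats**: PDF, JPG, JPEG, PNG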
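The size and format limits listed above can be checked locally before an upload, which avoids spending a request on a file that would be rejected. The sketch below is a hypothetical pre-flight helper; it assumes "50MB" means 50 × 1024 × 1024 bytes and that the format is judged by file extension, neither of which the source specifies.

```python
from pathlib import Path

# Assumption: the 50MB limit is counted in binary megabytes.
MAX_FILE_BYTES = 50 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}


def preflight_problems(filename: str, size_bytes: int) -> list[str]:
    """Return a list of reasons the file would likely be rejected."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 50MB limit")
    return problems


print(preflight_problems("invoice.PDF", 1024))
print(preflight_problems("scan.tiff", 60 * 1024 * 1024))
```

For a file on disk, `Path(path).stat().st_size` supplies the `size_bytes` argument.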

---

**Ready to get started?** Get your free API key at https://www.deepread.tech/dashboard/?utm_source=clawdhub
