Processing Pipeline
S3 to database. Automatically.
Watch your documents flow through our intelligent pipeline—from raw PDFs to structured, validated data in your database.
S3 Bucket (invoice_2024.pdf) → OCR Processing → Typed JSON (Zod validated) → Insert → Database (PostgreSQL)
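The flow above can be pictured as four stages run in order. The following is only a minimal sketch: `runPipeline`, `PipelineStages`, and the stage names `download`, `extract`, `validate`, and `insert` are hypothetical illustrations and not part of the `ocr-use` package. The one idea it is meant to show is that validation happens before anything is written to the database.

```typescript
// Hypothetical stage signatures; none of these names come from ocr-use.
interface PipelineStages<T> {
  download: (uri: string) => Promise<Uint8Array>; // e.g. read the PDF from S3
  extract: (pdf: Uint8Array) => Promise<unknown>; // OCR output, not yet trusted
  validate: (raw: unknown) => T;                  // throws if the data is invalid
  insert: (row: T) => Promise<void>;              // write the row to the database
}

// Runs the stages in order. Because validate() throws on bad data,
// insert() is never reached for an extraction that fails the schema.
async function runPipeline<T>(
  uri: string,
  stages: PipelineStages<T>
): Promise<T> {
  const pdf = await stages.download(uri);
  const raw = await stages.extract(pdf);
  const row = stages.validate(raw);
  await stages.insert(row);
  return row;
}
```

In this sketch each stage is passed in as a function, so the storage, OCR, and database pieces can be swapped for in-memory stubs when testing.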
Simple Integration
From S3 to database in one call.
CLI or API. Your choice.
import { processDocument } from 'ocr-use'

const result = await processDocument({
  source: 's3://bucket/invoice.pdf',
  schema: './invoice.schema.ts',
  output: 'postgres://db'
})
Intelligent Routing
Best model for every document.
We automatically route each PDF through the optimal OCR engine based on document type and complexity.
invoice_2024.pdf: Gemini 2.5 Flash (Structured tables detected → Fast extraction)
handwritten_form.pdf: PaddleOCR (Handwriting detected → Specialized model)
research_paper.pdf: Docling (Complex layout → Document understanding)
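One way to picture the routing step is as a small set of rules evaluated in priority order. This is a sketch under stated assumptions, not the actual routing logic: the `DocumentFeatures` flags, the `routeDocument` function, and the rule order (handwriting first, then complex layout, then the fast default) are all hypothetical. Only the engine names and reasons come from the examples above.

```typescript
// Hypothetical feature flags that a pre-scan of the PDF might produce.
interface DocumentFeatures {
  hasHandwriting: boolean;
  hasStructuredTables: boolean;
  hasComplexLayout: boolean;
}

type Engine = "Gemini 2.5 Flash" | "PaddleOCR" | "Docling";

interface RoutingDecision {
  engine: Engine;
  reason: string;
}

// Rules are checked top to bottom and the first match wins.
// The priority order here is an assumption made for illustration.
function routeDocument(features: DocumentFeatures): RoutingDecision {
  if (features.hasHandwriting) {
    return { engine: "PaddleOCR", reason: "Handwriting detected" };
  }
  if (features.hasComplexLayout) {
    return { engine: "Docling", reason: "Complex layout" };
  }
  // Everything else falls through to the fast general-purpose engine.
  return {
    engine: "Gemini 2.5 Flash",
    reason: features.hasStructuredTables
      ? "Structured tables detected"
      : "No special handling needed",
  };
}
```

Returning the reason alongside the engine keeps each routing decision easy to log and audit.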
Built for AI startups
Processing 2k-10k PDFs daily? We handle the complexity so you can focus on your product.
Zero Configuration
Scales instantly from 1 to 10,000 PDFs per day. No infrastructure to manage, no models to fine-tune.
Type-Safe Schemas
Define your data structure with Zod or OpenAPI. We validate every extraction and catch errors before they hit your database.
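To make "catch errors before they hit your database" concrete, here is a dependency-free sketch of the kind of check a schema performs on an extraction. In practice a Zod schema would express this declaratively. The `Invoice` fields and the `validateInvoice` helper below are hypothetical and hand-rolled purely for illustration.

```typescript
// Hypothetical target shape for an extracted invoice.
interface Invoice {
  invoiceNumber: string;
  total: number;
  currency: string;
}

// The result is a discriminated union, so callers must check `success`
// before they can touch the data.
type ValidationResult =
  | { success: true; data: Invoice }
  | { success: false; errors: string[] };

function validateInvoice(input: unknown): ValidationResult {
  if (typeof input !== "object" || input === null) {
    return { success: false, errors: ["expected an object"] };
  }
  const { invoiceNumber, total, currency } = input as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof invoiceNumber !== "string" || invoiceNumber.length === 0) {
    errors.push("invoiceNumber must be a non-empty string");
  }
  if (typeof total !== "number" || !Number.isFinite(total)) {
    errors.push("total must be a finite number");
  }
  if (typeof currency !== "string" || currency.length !== 3) {
    errors.push("currency must be a 3-letter code");
  }

  if (errors.length > 0) {
    return { success: false, errors };
  }
  // All three fields passed their checks above, so the casts are safe.
  return {
    success: true,
    data: {
      invoiceNumber: invoiceNumber as string,
      total: total as number,
      currency: currency as string,
    },
  };
}
```

An extraction where OCR returned the total as the string "12.50" would come back with `success: false` and a readable error list, so it never reaches the insert step.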
Token Pricing
Pay per token, not per page. Process a 100-page document or a 1-page receipt—you only pay for what you extract.
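The arithmetic behind per-token billing can be sketched as follows. The rate constant is a made-up placeholder, since no actual price is stated here, and `extractionCostUsd` is a hypothetical helper. Which tokens count toward the bill is also an assumption of this sketch.

```typescript
// Placeholder rate for illustration only; not a published price.
const HYPOTHETICAL_USD_PER_1K_TOKENS = 0.01;

// Cost depends only on the number of billed tokens, not on page count.
function extractionCostUsd(
  billedTokens: number,
  usdPer1kTokens: number = HYPOTHETICAL_USD_PER_1K_TOKENS
): number {
  if (!Number.isFinite(billedTokens) || billedTokens < 0) {
    throw new RangeError("billedTokens must be a non-negative finite number");
  }
  return (billedTokens / 1000) * usdPer1kTokens;
}

// Under this model, a 100-page contract and a 1-page receipt cost the same
// if the same number of tokens is billed for each.
const contractCost = extractionCostUsd(2000);
const receiptCost = extractionCostUsd(2000);
```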