ocr-use
Announcement
ocr-use v1.0 launches on 24th Nov 2025 at AI DOES Krakow. RSVP now

Build and deploy OCR pipelines.

ocr-use is a developer platform that turns PDFs into structured data at lightning speed.

Processing Pipeline
S3 to database. Automatically.
Watch your documents flow through our intelligent pipeline—from raw PDFs to structured, validated data in your database.
S3 Bucket (invoice_2024.pdf) → OCR Processing → Typed JSON (Zod validated) → Insert → Database (PostgreSQL)
Simple Integration
From S3 to database in one call.
CLI or API. Your choice.
import { processDocument } from 'ocr-use'

const result = await processDocument({
  source: 's3://bucket/invoice.pdf',
  schema: './invoice.schema.ts',
  output: 'postgres://db'
})
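To make the validation step concrete, here is a minimal sketch of the kind of check a schema performs before a row is inserted. It is a hand-rolled validator in plain TypeScript standing in for a Zod schema, not the contents of invoice.schema.ts; the Invoice fields and the validateInvoice function are hypothetical names chosen for illustration.

```typescript
// Hypothetical invoice shape: these field names are assumptions,
// not a schema shipped with ocr-use.
interface Invoice {
  invoiceNumber: string;
  total: number;
  currency: string;
}

type ValidationResult =
  | { ok: true; value: Invoice }
  | { ok: false; errors: string[] };

// Checks an untyped extraction result and collects every problem found,
// so a bad record can be rejected before it reaches the database.
function validateInvoice(input: unknown): ValidationResult {
  if (typeof input !== 'object' || input === null) {
    return { ok: false, errors: ['expected an object'] };
  }
  const { invoiceNumber, total, currency } = input as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof invoiceNumber !== 'string' || invoiceNumber.length === 0) {
    errors.push('invoiceNumber must be a non-empty string');
  }
  if (typeof total !== 'number' || !Number.isFinite(total) || total < 0) {
    errors.push('total must be a non-negative number');
  }
  if (typeof currency !== 'string' || currency.length !== 3) {
    errors.push('currency must be a 3-letter code');
  }

  if (errors.length > 0) {
    return { ok: false, errors };
  }
  // Every field was verified above, so the casts are safe here.
  return {
    ok: true,
    value: {
      invoiceNumber: invoiceNumber as string,
      total: total as number,
      currency: currency as string,
    },
  };
}
```

A record that fails any check comes back with ok set to false and a list of messages, which is the point at which a pipeline would skip the insert.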
Intelligent Routing
Best model for every document.
We automatically route each PDF through the optimal OCR engine based on document type and complexity.
  • invoice_2024.pdf: Gemini 2.5 Flash (Structured tables detected → Fast extraction)
  • handwritten_form.pdf: PaddleOCR (Handwriting detected → Specialized model)
  • research_paper.pdf: Docling (Complex layout → Document understanding)
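The three examples above suggest a simple rule-based picture of routing. The sketch below is one way such a decision could look, assuming the document has already been analyzed into a few boolean traits. The DocumentTraits fields, the routeDocument function, and the order of the rules are all assumptions for illustration, not how ocr-use actually decides.

```typescript
// Engine names are taken from the examples above.
type Engine = 'Gemini 2.5 Flash' | 'PaddleOCR' | 'Docling';

// Hypothetical traits: a real detector would produce something richer.
interface DocumentTraits {
  hasHandwriting: boolean;
  hasComplexLayout: boolean;
  hasStructuredTables: boolean;
}

interface RoutingDecision {
  engine: Engine;
  reason: string;
}

// Assumed priority: handwriting first, then complex layout,
// otherwise fall back to the fast general-purpose engine.
function routeDocument(traits: DocumentTraits): RoutingDecision {
  if (traits.hasHandwriting) {
    return { engine: 'PaddleOCR', reason: 'Handwriting detected' };
  }
  if (traits.hasComplexLayout) {
    return { engine: 'Docling', reason: 'Complex layout' };
  }
  return {
    engine: 'Gemini 2.5 Flash',
    reason: traits.hasStructuredTables
      ? 'Structured tables detected'
      : 'No special traits detected',
  };
}

// Usage: an invoice with clean tables goes to the fast engine.
const decision = routeDocument({
  hasHandwriting: false,
  hasComplexLayout: false,
  hasStructuredTables: true,
});
console.log(`${decision.engine}: ${decision.reason}`);
```

Ordering the rules by specificity keeps the fallback cheap: only documents that need a specialized model pay for one.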

Built for AI startups

Processing 2,000 to 10,000 PDFs daily? We handle the complexity so you can focus on your product.

Zero Configuration
Scales instantly from 1 to 10,000 PDFs per day. No infrastructure to manage, no models to fine-tune.
Type-Safe Schemas
Define your data structure with Zod or OpenAPI. We validate every extraction and catch errors before they hit your database.
Token Pricing
Pay per token, not per page. Process a 100-page document or a 1-page receipt—you only pay for what you extract.
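As a rough illustration of why per-token billing can differ from per-page billing, the sketch below prices two documents by extracted tokens only. The rate and the token counts are made-up numbers for the example, and extractionCostUsd is a hypothetical helper, not part of any published ocr-use pricing or API.

```typescript
// Hypothetical rate for illustration only; not ocr-use's actual pricing.
const PRICE_PER_1K_TOKENS_USD = 0.01;

// Cost depends only on how many tokens were extracted, not on page count.
function extractionCostUsd(
  tokens: number,
  pricePer1kTokens: number = PRICE_PER_1K_TOKENS_USD,
): number {
  if (!Number.isFinite(tokens) || tokens < 0) {
    throw new RangeError('tokens must be a non-negative number');
  }
  return (tokens / 1000) * pricePer1kTokens;
}

// Assumed token counts: a sparse 100-page document and a dense 1-page receipt.
const sparseReport = { pages: 100, tokens: 20_000 };
const denseReceipt = { pages: 1, tokens: 500 };

for (const doc of [sparseReport, denseReceipt]) {
  const cost = extractionCostUsd(doc.tokens);
  console.log(`${doc.pages} page(s), ${doc.tokens} tokens: $${cost.toFixed(4)}`);
}
```

Under these assumptions the page count never enters the calculation, so a long document with little extractable content costs less than its length would suggest.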
Document extraction infrastructure for developers.


© 2025 ocr-use. All rights reserved.
