100,000+ pages already parsed
New: 100+ free pages every month

From document
to data. With anchors.

The parsing API for developers who don't have time to clean up messy output. One endpoint, every document, anchored JSON.

Get API Key (free)
See how it works
By the numbers
111K+
Parsed pages
100+ /mo
Free pages, forever
6+
Document formats
JSON + MD
Per-page output
bbox
Anchor on every field
Why Docule

Built for the
messy reality of real documents.

Source-anchored output. Always.

Every extracted field carries a page number and bounding box back to its origin. No black box, no hallucinations, no guessing where the number came from.

  • Page-level and bbox-level anchors on every value
  • Missing fields return null — never invented
  • Audit-ready output for compliance and review
Page 14 / 92
Nordsea Holdings AB
Consolidated Income Statement
For the year ended 31 December 2025 · MSEK

(MSEK)              Note      2025      2024
Net sales              2    25,758    23,104
Cost of goods sold     6  (18,392)  (16,847)
Gross profit                 7,366     6,257
Operating expenses     9   (3,914)   (3,512)
Operating profit             3,452     2,745

page 14 · bbox [340, 175, 410, 188]
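
For a sense of how an anchored value might be consumed on the client side, here is a minimal sketch. The field name, the `{"value", "page", "bbox"}` shape, and the `x0, y0, x1, y1` coordinate order are assumptions made for illustration, not the documented response format. The page and bbox numbers are the ones from the sample, and pairing them with the operating profit figure is illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Anchor:
    page: int
    bbox: tuple  # assumed order: (x0, y0, x1, y1)

    @property
    def width(self) -> float:
        return self.bbox[2] - self.bbox[0]

    @property
    def height(self) -> float:
        return self.bbox[3] - self.bbox[1]


@dataclass(frozen=True)
class AnchoredField:
    name: str
    value: Optional[float]     # None mirrors a null (missing) field
    anchor: Optional[Anchor]   # a missing value has nothing to point at


def from_json(name: str, raw: dict) -> AnchoredField:
    """Build an AnchoredField from a hypothetical {"value", "page", "bbox"} dict."""
    if raw.get("value") is None:
        return AnchoredField(name, None, None)
    return AnchoredField(name, raw["value"], Anchor(raw["page"], tuple(raw["bbox"])))


# Hypothetical field name; page and bbox copied from the sample.
field = from_json(
    "operating_profit",
    {"value": 3452, "page": 14, "bbox": [340, 175, 410, 188]},
)
print(field.anchor.page, field.anchor.width, field.anchor.height)
```

Keeping the anchor optional means a null field never carries a made-up location, which matches the "missing fields return null" promise.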
Locale-aware

Numbers come out as numbers.

Nordic comma decimals. Continental dates. Mixed scripts. We normalise everything to canonical form, so your pipeline doesn't have to.
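
To make "canonical form" concrete, here is a rough sketch of the kind of normalisation involved, applied to the sample strings shown in this section. It is an illustration written for this page under simple assumptions (space-grouped thousands, comma decimals, a trailing three-letter currency code, parentheses for negatives, day.month.year dates). It is not Docule's implementation, and the function names are hypothetical.

```python
import re
from datetime import datetime
from decimal import Decimal


def normalise_amount(text: str) -> Decimal:
    """Parse a space-grouped, comma-decimal amount; parentheses mean negative."""
    s = text.strip()
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1]
    # Drop a trailing currency code such as "SEK".
    s = re.sub(r"\s*[A-Z]{3}$", "", s)
    # Remove whitespace used as a thousands separator (including
    # non-breaking spaces), then switch the decimal comma to a point.
    s = re.sub(r"\s", "", s).replace(",", ".")
    value = Decimal(s)
    return -value if negative else value


def normalise_date(text: str) -> str:
    """Convert a day.month.year date to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date().isoformat()


print(normalise_amount("25 758,00 SEK"))
print(normalise_amount("(3 452)"))
print(normalise_date("31.12.2025"))
```

Using `Decimal` rather than `float` keeps monetary values exact, which matters once the numbers feed an accounting pipeline.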

"25 758,00 SEK" → 25758 SEK
"31.12.2025" → 2025-12-31
"(3 452)" → -3452
Schema validated

Get exactly the fields you ask for.

Bring your own JSON Schema. Docule fills it, validates it, and never invents data to fit. Missing values come back clean.

invoice_number "INV-2026-001"
total_amount 156.92
vat_rate 0.19
due_date null
valid ✓ schema OK
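
As a sketch of what "bring your own JSON Schema" could look like for the invoice fields above: the schema below is hypothetical, and the `check` helper is a deliberately tiny stand-in that covers only required keys, flat types, and unexpected keys. A real project would use a full JSON Schema validator instead.

```python
import json

# Hypothetical schema for the invoice fields shown above.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
        "vat_rate": {"type": "number"},
        "due_date": {"type": ["string", "null"]},  # null allowed when absent
    },
    "required": ["invoice_number", "total_amount", "vat_rate", "due_date"],
    "additionalProperties": False,
}

_PY_TYPES = {"string": str, "number": (int, float), "null": type(None)}


def check(instance: dict, schema: dict) -> list:
    """Tiny subset of JSON Schema: required keys, flat types, no extras."""
    errors = []
    props = schema["properties"]
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing: {key}")
    for key, value in instance.items():
        if key not in props:
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected: {key}")
            continue
        allowed = props[key]["type"]
        allowed = [allowed] if isinstance(allowed, str) else allowed
        # bool is a subclass of int in Python, so reject it explicitly.
        ok = not isinstance(value, bool) and any(
            isinstance(value, _PY_TYPES[t]) for t in allowed
        )
        if not ok:
            errors.append(f"wrong type: {key}")
    return errors


result = json.loads(
    '{"invoice_number": "INV-2026-001", "total_amount": 156.92, '
    '"vat_rate": 0.19, "due_date": null}'
)
print(check(result, INVOICE_SCHEMA))
```

Declaring `due_date` as `["string", "null"]` is what lets a missing value come back as null and still pass validation.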
Layout intelligence

Tables that span pages. Footnotes 30 pages away.

Built on Nordic financial filings — the hardest parsing task there is. If we handle those, your invoices are easy.

Page 14 · Annual Report
Income Statement
Net sales¹        25,758
COGS            (18,392)
Gross profit       7,366
OpEx             (3,914)
EBIT               3,452
↓ continues p. 15

Page 15 · ↑ Income Statement (cont.)
(continued from p. 14)
Financial inc.       +47
Tax²               (817)
Net profit         2,682
Earnings per share equivalent to SEK 4.32 (3.78 in 2024).

Page 42 · Notes to Financial Statements
Note 1: Net sales
Subscription revenue (74%) and one-time license fees (26%). Geographic split: Nordics 58%, EU 31%, RoW 11%.
Note 2: Tax. Effective rate 23.4%, see deferred tax breakdown…

↔ Table reassembled across pp. 14–15
¹ linked to Note 1 on p. 42
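
One way to picture cross-page reassembly is as stitching per-page row fragments into a single logical table, then sanity-checking the subtotals. The sketch below uses the figures from the sample pages. The `Row` structure and the `reassemble` helper are hypothetical and say nothing about how Docule does this internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    label: str
    value: int                  # MSEK, negative for costs
    page: int                   # page the row was found on
    note: Optional[int] = None  # footnote marker, if any


def reassemble(*fragments):
    """Concatenate per-page fragments, in page order, into one table."""
    ordered = sorted(fragments, key=lambda frag: frag[0].page)
    return [row for frag in ordered for row in frag]


page_14 = [
    Row("Net sales", 25758, 14, note=1),
    Row("COGS", -18392, 14),
    Row("Gross profit", 7366, 14),
    Row("OpEx", -3914, 14),
    Row("EBIT", 3452, 14),
]
page_15 = [
    Row("Financial inc.", 47, 15),
    Row("Tax", -817, 15, note=2),
    Row("Net profit", 2682, 15),
]

table = reassemble(page_15, page_14)  # input order does not matter
v = {row.label: row.value for row in table}

# Subtotals only add up once both fragments are in the same table.
checks = {
    "gross_profit": v["Net sales"] + v["COGS"] == v["Gross profit"],
    "ebit": v["Gross profit"] + v["OpEx"] == v["EBIT"],
    "net_profit": v["EBIT"] + v["Financial inc."] + v["Tax"] == v["Net profit"],
}
print(checks)
```

The last check is the interesting one: net profit on page 15 can only be verified against EBIT from page 14.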
Multi-document splitting

One PDF. Fifty invoices. Detected and split.

A 200-page PDF holding 50 invoices comes back as 50 sub-documents — each with its own page range and title. Same JSON shape, no extra calls.

batch.pdf · 200 pages → 50 sub-docs
invoice #1 · pp 1–4 → { "invoice_number": ... }
invoice #50 · pp 197–200 → { "invoice_number": ... }
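
When consuming a split result, a useful client-side check is that the sub-document page ranges tile the original file with no gaps or overlaps. The sketch below assumes a hypothetical `sub_documents` list with `title` and `page_range` keys. The evenly sized four-page invoices are synthetic stand-ins consistent with the two ranges shown above.

```python
def covers_all_pages(sub_documents: list, total_pages: int) -> bool:
    """True when the page ranges tile 1..total_pages with no gaps or overlaps."""
    expected_start = 1
    for doc in sorted(sub_documents, key=lambda d: d["page_range"][0]):
        start, end = doc["page_range"]
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total_pages + 1


# Synthetic stand-in for the 200-page example: 50 invoices of 4 pages each.
sub_documents = [
    {"title": f"invoice #{i}", "page_range": [4 * (i - 1) + 1, 4 * i]}
    for i in range(1, 51)
]

print(sub_documents[0]["page_range"], sub_documents[-1]["page_range"])
print(covers_all_pages(sub_documents, total_pages=200))
```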
For developers

A clean API. Five minutes to integrate.

One endpoint. Bring your own SDK or just curl it. Zero hidden behavior.

parse.sh
# Submit a document for parsing
# formats goes in the query string (-G cannot be combined with -F)
curl -X POST "https://docule.dev/api/v1/parse?formats=json,md" \
  -H "X-API-Key: $DOCULE_API_KEY" \
  -F "file=@report.pdf" \
  -F "schema_file=@income_statement.schema.json"

# → 200 OK · structured JSON with bbox anchors
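
If you would rather not shell out to curl, the same request can be assembled with nothing but the Python standard library. This is a sketch, not an official SDK: `build_parse_request` is a hypothetical helper, the endpoint, header, and field names are taken from the curl call, and the response format is deliberately left unparsed because it is not specified here.

```python
import os
import uuid
import urllib.parse
import urllib.request


def build_parse_request(files: dict, api_key: str,
                        formats: str = "json,md") -> urllib.request.Request:
    """Build (but do not send) a multipart POST mirroring the curl call.

    `files` maps a form field name to a (filename, bytes) pair.
    """
    boundary = uuid.uuid4().hex
    body = b""
    for field, (filename, content) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        body += header.encode() + content + b"\r\n"
    body += f"--{boundary}--\r\n".encode()

    query = urllib.parse.urlencode({"formats": formats})
    return urllib.request.Request(
        f"https://docule.dev/api/v1/parse?{query}",
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_parse_request(
    {"file": ("report.pdf", b"%PDF-1.7 placeholder bytes")},
    api_key=os.environ.get("DOCULE_API_KEY", ""),
)
print(req.get_method(), req.full_url)
# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     payload = resp.read()
```

Building the request separately from sending it makes the call easy to inspect or unit-test without touching the network.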
Use cases

Works on the documents
you actually have.

01

Financial filings

Annual reports, 10-Ks, interim statements with multi-language footnotes and cross-page tables.

IFRS · XBRL · Nordic
02

Invoices & receipts

VAT-aware extraction across EU jurisdictions. Multi-line, multi-currency, with built-in schema validation.

EU VAT · AP · Multi-currency
03

Contracts & legal docs

Clause-level extraction with paragraph anchors. Bring your taxonomy, get structured fields back.

Clauses · Anchors · Compliance
04

Scanned forms

OCR with handwriting, checkboxes, and signatures. Auto-routed to vision when text extraction fails.

OCR · Vision · Forms
05

Scientific papers

Equations preserved as LaTeX, citations linked, figures captioned with bbox.

LaTeX · Citations · Figures
06

Technical docs

Documentation, manuals, spec sheets. Structure preserved for RAG and LLM agent pipelines.

RAG · Markdown · Agents
Pricing

Simple, credit-based pricing.

Start free with 100+ pages every month. No card required. Pay only for what you use — easier pages cost fewer credits.

Free
€0
forever
  • 6,000 credits / month
  • 100+ pages typical
  • Markdown + JSON output
  • 1 request / second
  • Community docs
Get started
Starter
€39
/ month
  • 40,000 credits included
  • 500+ pages
  • Then €0.001 / credit
  • Batch API access
  • 3 requests / second
  • Email support
Start trial
Business
€249
/ month
  • 500,000 credits included
  • 6,500+ pages
  • Then €0.0005 / credit
  • Batch API access
  • 20 requests / second
  • Priority support
  • Webhook notifications
  • SLA & dedicated support
Start trial
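
To compare plans for a given monthly volume, the arithmetic can be sketched as base fee plus per-credit overage. The figures are copied from the plan cards above. Treating the Free plan as capped, because it lists no overage rate, is an assumption, and `monthly_cost` is a hypothetical helper rather than an official calculator.

```python
from decimal import Decimal

# Figures from the plan cards; overage=None means no overage rate is listed.
PLANS = {
    "Free": {"base": Decimal("0"), "included": 6_000, "overage": None},
    "Starter": {"base": Decimal("39"), "included": 40_000, "overage": Decimal("0.001")},
    "Business": {"base": Decimal("249"), "included": 500_000, "overage": Decimal("0.0005")},
}


def monthly_cost(plan: str, credits_used: int) -> Decimal:
    """Base fee plus per-credit overage, in euros."""
    p = PLANS[plan]
    extra = max(0, credits_used - p["included"])
    if extra == 0:
        return p["base"]
    if p["overage"] is None:
        # Assumption: a plan without an overage rate cannot exceed its quota.
        raise ValueError(f"{plan} plan has no overage rate")
    return p["base"] + extra * p["overage"]


for used in (40_000, 100_000, 250_000, 600_000):
    print(used, monthly_cost("Starter", used), monthly_cost("Business", used))
```

With these figures, Starter usage of 250,000 credits costs the same €249 as the Business base fee, so that is roughly where switching plans starts to pay off.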
How credits work

One credit type. Three processing modes.

Docule routes each page to the cheapest path that still produces a clean, anchored output. You only pay extra credits when the page actually needs heavier processing.

text
8 credits / page

Pure text-extractable PDFs. PyMuPDF only — no LLM call, no vision API.

smart · default
25 credits / page

Tables + LLM validation + locale normalize + schema extraction. Most documents land here.

vision
75 credits / page

Scanned PDFs, complex layouts, OCR. Vision API invoked automatically when needed.

A typical mix (60% text · 30% smart · 10% vision) averages ~20 credits/page.
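
The blended rate is a weighted average of the three per-page rates, so it is easy to recompute for your own mix. The sketch below uses the rates listed above. The helper names are hypothetical, and real usage depends on how your pages are actually routed.

```python
# Per-page rates from the three processing modes above.
CREDITS_PER_PAGE = {"text": 8, "smart": 25, "vision": 75}


def _weighted_total(mix_percent: dict) -> int:
    """Credits per 100 pages for a mix given in whole percentages."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("mix must sum to 100%")
    return sum(CREDITS_PER_PAGE[mode] * pct for mode, pct in mix_percent.items())


def blended_credits(mix_percent: dict) -> float:
    """Weighted average credits per page."""
    return _weighted_total(mix_percent) / 100


def pages_for(credits: int, mix_percent: dict) -> int:
    """Whole pages a credit budget covers at the blended rate."""
    return credits * 100 // _weighted_total(mix_percent)


typical = {"text": 60, "smart": 30, "vision": 10}
print(blended_credits(typical))
print(pages_for(6_000, typical))            # Free plan budget, typical mix
print(pages_for(6_000, {"vision": 100}))    # Free plan budget, all scanned pages
```

Running the all-vision case alongside the typical mix shows how much a scan-heavy workload shrinks the number of pages a budget covers.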

Start parsing.
In thirty seconds.

100+ free pages every month, on the house. No card. No expiry.

Get your API key
Read the docs