Why Docule

Built for the messy reality of real documents.

Source-anchored output. Always.

Every extracted field carries a page number and bounding box back to its origin. No black box, no hallucinations, no guessing where the number came from.

Page-level and bbox-level anchors on every value
Missing fields return null — never invented
Audit-ready output for compliance and review

14 / 92

Nordsea Holdings AB

Consolidated Income Statement

For the year ended 31 December 2025 · MSEK

(MSEK)	Note	2025	2024
Net sales	2	25,758	23,104
Cost of goods sold	6	(18,392)	(16,847)
Gross profit		7,366	6,257
Operating expenses	9	(3,914)	(3,512)
Operating profit		3,452	2,745

page 14 · bbox [340, 175, 410, 188]
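As a rough sketch of how a consumer might work with such an anchor, the Python snippet below models one extracted value together with its origin. The class name, the field names, the `[x0, y0, x1, y1]` reading of the bbox, and the pairing of the sample bbox with the net sales cell are all assumptions for illustration; this page does not document the actual response shape.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AnchoredValue:
    """One extracted field plus where it was found (hypothetical shape)."""
    name: str
    value: Optional[float]  # None when the field is missing in the source
    page: Optional[int]
    bbox: Optional[Tuple[float, float, float, float]]  # assumed x0, y0, x1, y1

    def is_anchored(self) -> bool:
        """True when the value can be traced back to a region on a page."""
        if self.value is None or self.page is None or self.bbox is None:
            return False
        x0, y0, x1, y1 = self.bbox
        return self.page >= 1 and x1 > x0 and y1 > y0


# Figures from the sample statement; which cell the bbox marks is an assumption.
net_sales = AnchoredValue("net_sales_2025", 25758.0, 14, (340, 175, 410, 188))
# A hypothetical field that is absent from the document: null, not invented.
missing = AnchoredValue("dividend", None, None, None)
```

A reviewer could use `is_anchored()` as a gate before accepting a value into an audit trail, since a missing field carries neither a value nor an anchor.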

Locale-aware

Numbers come out as numbers.

Nordic comma decimals. Continental dates. Mixed scripts. We normalise everything to canonical form, so your pipeline doesn't have to.
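To make the idea concrete, here is a minimal Python sketch of this kind of normalisation, covering the three conversions listed in this section. The function names are hypothetical and the rules are deliberately narrow (space thousands separators, comma decimals, parenthesised negatives, day-first dates); it is not Docule's implementation.

```python
import re
from datetime import date, datetime
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse a Nordic-style amount: space thousands, comma decimals, (x) = negative."""
    s = text.strip()
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1]
    # Drop a trailing three-letter currency code such as "SEK", if present.
    s = re.sub(r"\s*[A-Z]{3}$", "", s)
    # Remove whitespace used as a thousands separator (\s also matches NBSP).
    s = re.sub(r"\s", "", s)
    # Convert the decimal comma to a decimal point.
    s = s.replace(",", ".")
    value = Decimal(s)
    return -value if negative else value


def parse_date(text: str) -> date:
    """Parse a day.month.year date into a date object."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date()
```

Using `Decimal` rather than `float` keeps monetary amounts exact, which matters once the values feed into totals or reconciliations.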

"25 758,00 SEK" → 25758 SEK

"31.12.2025" → 2025-12-31

"(3 452)" → -3452

Schema validated

Get exactly the fields you ask for.

Bring your own JSON Schema. Docule fills it, validates it, and never invents data to fit. Missing values come back clean.

invoice_number "INV-2026-001"

total_amount 156.92

vat_rate 0.19

due_date null

valid ✓ schema OK
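The check above can be approximated with a few lines of Python. This is a hand-rolled sketch that covers only a small subset of JSON Schema (`required` plus a list-valued `type`); the schema itself is hypothetical, reconstructed from the sample fields, and a real pipeline would more likely use a full JSON Schema validator.

```python
from typing import Any, Dict, List

# Hypothetical schema for the sample result; "null" marks fields that may be missing.
INVOICE_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["invoice_number", "total_amount", "vat_rate", "due_date"],
    "properties": {
        "invoice_number": {"type": ["string"]},
        "total_amount": {"type": ["number"]},
        "vat_rate": {"type": ["number"]},
        "due_date": {"type": ["string", "null"]},
    },
}

_PY_TYPES = {"string": str, "number": (int, float), "null": type(None)}


def check(instance: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the instance fits this subset."""
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing required field: {key}")
    for key, value in instance.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unexpected field: {key}")
            continue
        allowed = tuple(_PY_TYPES[t] for t in spec["type"])
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed):
            errors.append(f"wrong type for {key}: {type(value).__name__}")
    return errors


result = {"invoice_number": "INV-2026-001", "total_amount": 156.92,
          "vat_rate": 0.19, "due_date": None}
```

Here `check(result, INVOICE_SCHEMA)` returns an empty list: `due_date` is present but null, which the schema allows, so nothing has to be invented to pass validation.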

Layout intelligence

Tables that span pages. Footnotes 30 pages away.

Built on Nordic financial filings — the hardest parsing task there is. If we handle those, your invoices are easy.

Annual Report

Income Statement

Net sales ¹	25,758
COGS	(18,392)
Gross profit	7,366
OpEx	(3,914)
EBIT	3,452

↓ continues p. 15

↑ Income Statement (cont.)

(continued from p. 14)

Financial inc.	+47
Tax ²	(817)
Net profit	2,682

Earnings per share equivalent to SEK 4.32 (3.78 in 2024).

Notes to Financial Statements

Note 1 — Net sales

Subscription revenue (74%) and one-time license fees (26%). Geographic split: Nordics 58%, EU 31%, RoW 11%.

Note 2 — Tax — effective rate 23.4%, see deferred tax breakdown…

↔ Table reassembled across pp. 14–15 · ¹ linked to Note 1 on p. 42
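One way to picture the reassembly step is as a merge over per-page table fragments. The Python sketch below uses the figures from the sample report; the fragment structure, the `continued` flag, and the footnote map are assumptions made for illustration, not Docule's internal representation.

```python
from typing import Dict, List, Tuple

# Fragments as they might arrive per page (hypothetical structure, amounts in MSEK).
fragments = [
    {"page": 14, "title": "Income Statement", "continued": False,
     "rows": [("Net sales", 25758), ("COGS", -18392), ("Gross profit", 7366),
              ("OpEx", -3914), ("EBIT", 3452)]},
    {"page": 15, "title": "Income Statement", "continued": True,
     "rows": [("Financial inc.", 47), ("Tax", -817), ("Net profit", 2682)]},
]


def reassemble(frags: List[dict]) -> List[dict]:
    """Merge fragments flagged as continuations into the preceding table."""
    tables: List[dict] = []
    for frag in sorted(frags, key=lambda f: f["page"]):
        if frag["continued"] and tables and tables[-1]["title"] == frag["title"]:
            tables[-1]["rows"].extend(frag["rows"])
            tables[-1]["pages"].append(frag["page"])
        else:
            tables.append({"title": frag["title"], "pages": [frag["page"]],
                           "rows": list(frag["rows"])})
    return tables


# Footnote markers mapped to the note and page they resolve to.
footnotes: Dict[str, Tuple[str, int]] = {"Net sales": ("Note 1", 42)}

table = reassemble(fragments)[0]
rows = dict(table["rows"])
```

Once the table is whole, cross-page checks become possible: in this sample, EBIT plus financial income less tax equals the reported net profit.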

Multi-document splitting

One PDF. Fifty invoices. Detected and split.

A 200-page PDF holding 50 invoices comes back as 50 sub-documents — each with its own page range and title. Same JSON shape, no extra calls.

batch.pdf · 200 pages → 50 sub-docs

invoice #1 · pp 1–4 → { "invoice_number": ... }

invoice #50 · pp 197–200 → { "invoice_number": ... }
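On the consuming side, a simple sanity check is to confirm that the returned page ranges tile the original PDF. The Python sketch below assumes each sub-document exposes a `(start, end)` page range, and it stands in for the split result with 50 invoices of exactly 4 pages each, which the example implies but does not state for every invoice.

```python
from typing import List, Tuple


def check_split(page_ranges: List[Tuple[int, int]], total_pages: int) -> bool:
    """True when the ranges cover pages 1..total_pages with no gaps or overlaps."""
    expected_start = 1
    for start, end in sorted(page_ranges):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total_pages + 1


# Stand-in for a split result: 50 invoices of 4 pages each (an assumption).
ranges = [(4 * i + 1, 4 * i + 4) for i in range(50)]
```

With these ranges, `check_split(ranges, 200)` is true, and dropping any one sub-document makes it false.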

For developers

A clean API. Five minutes to integrate.

One endpoint. Bring your own SDK or just curl it. Zero hidden behavior.

parse.sh

# Submit a document for parsing
curl -X POST "https://docule.dev/api/v1/parse?formats=json,md" \
  -H "X-API-Key: $DOCULE_API_KEY" \
  -F "file=@report.pdf" \
  -F "schema_file=@income_statement.schema.json"

# → 200 OK · structured JSON with bbox anchors
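For teams that prefer to skip curl, the same request can be assembled with the Python standard library alone. The sketch below roughly mirrors the curl call, with `formats` sent as a query parameter; the function name is hypothetical, the content types are assumptions, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request
import uuid


def build_parse_request(pdf_bytes: bytes, schema_bytes: bytes, api_key: str,
                        formats: str = "json,md") -> urllib.request.Request:
    """Build (but do not send) a multipart POST similar to the curl call."""
    boundary = uuid.uuid4().hex
    parts = []
    for field, filename, ctype, payload in (
        ("file", "report.pdf", "application/pdf", pdf_bytes),
        ("schema_file", "income_statement.schema.json", "application/json",
         schema_bytes),
    ):
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {ctype}\r\n\r\n"
        ).encode()
        parts.append(header + payload + b"\r\n")
    body = b"".join(parts) + f"--{boundary}--\r\n".encode()
    query = urllib.parse.urlencode({"formats": formats})
    return urllib.request.Request(
        f"https://docule.dev/api/v1/parse?{query}",
        data=body,
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


# Placeholder payloads and key; sending is left to the caller,
# for example with urllib.request.urlopen(req).
req = build_parse_request(b"%PDF-1.7 ...", b"{}", "test-key")
```

Keeping request construction separate from sending makes the call easy to inspect and unit-test before any credits are spent.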

Use cases

Works on the documents you actually have.

Financial filings

Annual reports, 10-Ks, interim statements with multi-language footnotes and cross-page tables.

IFRS · XBRL · Nordic

Invoices & receipts

VAT-aware extraction across EU jurisdictions. Multi-line, multi-currency, with built-in schema validation.

EU VAT · AP · Multi-currency

Contracts & legal docs

Clause-level extraction with paragraph anchors. Bring your taxonomy, get structured fields back.

Clauses · Anchors · Compliance

Scanned forms

OCR with handwriting, checkboxes, and signatures. Auto-routed to vision when text extraction fails.

OCR · Vision · Forms

Scientific papers

Equations preserved as LaTeX, citations linked, figures captioned with bbox.

LaTeX · Citations · Figures

Technical docs

Documentation, manuals, spec sheets. Structure preserved for RAG and LLM agent pipelines.

RAG · Markdown · Agents

Pricing

Simple, credit-based pricing.

Start free with 100+ pages every month. No card required. Pay only for what you use — easier pages cost fewer credits.

Free

€0

forever

6,000 credits / month
100+ pages typical
Markdown + JSON output
1 request / second
Community docs

Get started

Starter

€39

/ month

40,000 credits included
500+ pages
Then €0.001 / credit
Batch API access
3 requests / second
Email support

Start trial

One credit type. Three processing modes.

Docule routes each page to the cheapest path that still produces a clean, anchored output. You only pay extra credits when the page actually needs heavier processing.

text

8 credits / page

Pure text-extractable PDFs. PyMuPDF only — no LLM call, no vision API.

smart · default

25 credits / page

Tables + LLM validation + locale normalization + schema extraction. Most documents land here.

vision

75 credits / page

Scanned PDFs, complex layouts, OCR. Vision API invoked automatically when needed.

A typical mix (60% text · 30% smart · 10% vision) averages ~20 credits/page.
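The blended rate for any mix follows directly from the three per-page prices listed above. A small Python sketch of the arithmetic, with hypothetical function names and the mix given in whole percentages:

```python
# Credits per page for each processing mode, as listed above.
CREDITS_PER_PAGE = {"text": 8, "smart": 25, "vision": 75}


def blended_credits(mix_percent: dict) -> float:
    """Weighted average credits per page for a mix given in whole percentages."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("mix must sum to 100%")
    return sum(CREDITS_PER_PAGE[m] * p for m, p in mix_percent.items()) / 100


def pages_for(credits: int, mix_percent: dict) -> int:
    """Whole pages a credit budget covers at the blended rate."""
    weighted = sum(CREDITS_PER_PAGE[m] * p for m, p in mix_percent.items())
    return credits * 100 // weighted


mix = {"text": 60, "smart": 30, "vision": 10}
print(blended_credits(mix))   # 19.8  (0.6*8 + 0.3*25 + 0.1*75)
print(pages_for(6000, mix))   # 303
```

The same two functions can be used to estimate how far a plan's included credits stretch for a heavier or lighter mix.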

From document to data. With anchors.

Start parsing. In thirty seconds.