Parse any business document into structured JSON — contracts, forms, reports, policies, and more. Send a document, define the fields you need, and get clean data back. No templates, no training, no layout configuration.
Send a document to the extraction endpoint along with your schema and receive structured JSON. The example below extracts the parties, dates, governing law, and clauses from a contract PDF.
curl -X POST \
  https://api-parse.conversiontools.io/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@contract.pdf" \
  -F 'schema={
    "parties": [{
      "name": "string",
      "role": "string"
    }],
    "effective_date": "string",
    "termination_date": "string",
    "governing_law": "string",
    "clauses": [{
      "title": "string",
      "summary": "string"
    }]
  }'

{
  "status": "completed",
  "data": {
    "parties": [
      {
        "name": "Acme Corp",
        "role": "Service Provider"
      },
      {
        "name": "Globex Inc",
        "role": "Client"
      }
    ],
    "effective_date": "2026-01-15",
    "termination_date": "2027-01-14",
    "governing_law": "State of Delaware",
    "clauses": [
      {
        "title": "Confidentiality",
        "summary": "Both parties agree to keep proprietary information confidential for 3 years after termination."
      },
      {
        "title": "Limitation of Liability",
        "summary": "Total liability is capped at the fees paid in the 12 months preceding the claim."
      }
    ]
  }
}

Three steps to go from any document to structured data. No training, no templates: the API understands your documents automatically.
1. Upload any business document: contracts, reports, forms, policies, or certificates. Supported formats are PDF, JPEG, PNG, WebP, and TIFF.
2. The API reads every page, understands the document structure, and extracts the fields defined in your schema, including tables, lists, dates, and nested data.
3. Receive clean, typed JSON matching your schema. Ready to store in a database, feed into a pipeline, or display in your application.
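The three steps above can be sketched in Python as a rough translation of the curl example, using only the standard library. The endpoint URL, the `file` and `schema` form fields, and the Bearer header come from that example; the function names `build_extract_request` and `extract` and the `content_type` parameter are hypothetical helpers introduced here for illustration, not part of any official client.

```python
import json
import os
import uuid
import urllib.request

# Endpoint taken from the curl example; everything else is illustrative.
EXTRACT_URL = "https://api-parse.conversiontools.io/v1/extract"


def build_extract_request(file_bytes, filename, schema, api_key,
                          content_type="application/pdf"):
    """Build (but do not send) a multipart POST mirroring the curl example."""
    boundary = uuid.uuid4().hex
    parts = [
        # Part 1: the document itself, under the form field "file".
        (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f'Content-Type: {content_type}\r\n\r\n'
        ).encode() + file_bytes + b"\r\n",
        # Part 2: the schema, serialized as JSON under the form field "schema".
        (
            f'--{boundary}\r\n'
            'Content-Disposition: form-data; name="schema"\r\n\r\n'
            f'{json.dumps(schema)}\r\n'
        ).encode(),
        # Closing boundary marks the end of the multipart body.
        f'--{boundary}--\r\n'.encode(),
    ]
    return urllib.request.Request(
        EXTRACT_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract(path, schema, api_key):
    """Upload the file at `path` and return the decoded JSON response."""
    with open(path, "rb") as fh:
        request = build_extract_request(
            fh.read(), os.path.basename(path), schema, api_key
        )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `extract("contract.pdf", schema, "YOUR_API_KEY")` with the schema from the curl example would, under these assumptions, return a dictionary whose `data` key holds the extracted fields. Separating request construction from sending keeps the multipart encoding easy to inspect without touching the network.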
Built for developers who need to extract structured data from documents without building custom parsers for each layout.
No need to build separate parsers for each document format. The AI understands headers, paragraphs, columns, and mixed layouts automatically.
Process entire documents in a single API call. The parser maintains context across pages, extracting data that spans headers, sections, and appendices.
Automatically recognizes and extracts tabular data, bulleted lists, numbered clauses, and nested structures into typed JSON arrays.
Extracts printed, typed, and handwritten text from scanned documents. Handles annotations, filled-in forms, and signatures on business documents.
Parse documents in any language. The AI understands document structure regardless of language, supporting multilingual documents and mixed-language content.
Process large volumes of documents using the asynchronous endpoint. Submit batches and poll for results, or use webhooks to get notified on completion.
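Polling for an asynchronous result might look roughly like the sketch below. The asynchronous endpoint's URL and response fields are not shown on this page, so the status request is passed in as a caller-supplied function (`fetch_status`, a hypothetical name). Only the `"completed"` status and the `data` key come from the sample response; the `"failed"` value is an assumption.

```python
import time


def wait_for_result(fetch_status, interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call `fetch_status()` until the job reports "completed" or time runs out.

    `fetch_status` is a caller-supplied function that returns the decoded
    JSON body of a job-status request; the real status URL is not assumed.
    """
    deadline = clock() + timeout
    while True:
        body = fetch_status()
        status = body.get("status")
        if status == "completed":   # value taken from the sample response
            return body["data"]
        if status == "failed":      # assumed failure value, not confirmed
            raise RuntimeError(f"extraction failed: {body}")
        if clock() >= deadline:
            raise TimeoutError("job did not complete before the timeout")
        # Any other status is treated as "still running".
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays. With webhooks, this loop would be unnecessary, since the service notifies you on completion.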
The same API works across all your document processing needs. Define a schema once, parse thousands of documents.
Extract vendor, amounts, line items, dates, and tax details from invoices automatically.
Parse store names, totals, items, and payment methods from receipts and POS printouts.
Extract structured JSON data from any document with custom schemas and AI-powered understanding.
Convert any PDF into structured data. Works with digital and scanned PDFs across all languages.
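As an illustration of reusing the same schema notation for other document types, here is a sketch of invoice and receipt schemas. The field names are hypothetical, chosen to match the fields listed above. Only the `"string"` placeholder and the array-of-objects form come from the contract example; whether amounts can be declared as numbers is not stated here, so every field uses `"string"`.

```python
import json

# Field names are illustrative; only the "string" placeholder and the
# array-of-objects notation are taken from the contract example.
INVOICE_SCHEMA = {
    "vendor": "string",
    "invoice_date": "string",
    "due_date": "string",
    "subtotal": "string",
    "tax": "string",
    "total": "string",
    "line_items": [{
        "description": "string",
        "quantity": "string",
        "unit_price": "string",
        "amount": "string",
    }],
}

RECEIPT_SCHEMA = {
    "store_name": "string",
    "total": "string",
    "payment_method": "string",
    "items": [{"name": "string", "price": "string"}],
}


def schema_form_value(schema):
    """Serialize a schema dict into the text sent in the `schema` form field."""
    return json.dumps(schema, indent=2)
```

A schema defined once this way can be serialized with `schema_form_value` and sent with each document of that type.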
Common questions about the document parsing API.
Parse handles virtually any document type including contracts, agreements, reports, forms, letters, memos, policies, and certificates. It accepts PDF, JPEG, PNG, WebP, and TIFF files. The AI understands document structure regardless of layout, so you do not need separate configurations for each document type.
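To avoid uploading a file in an unsupported format, a client-side check on the file's leading bytes is one option. The sketch below recognizes the five formats listed above by their standard file signatures. The function names are hypothetical, and how the API itself validates uploads is not described here.

```python
def sniff_format(header):
    """Return a format name based on the first bytes of a file, or None."""
    if header.startswith(b"%PDF-"):
        return "PDF"
    if header.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WebP"
    # TIFF starts with a little-endian ("II") or big-endian ("MM") marker.
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    return None


def is_supported(path):
    """Read the first 12 bytes of `path` and check them against the list."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(12)) is not None
```

Checking content rather than the file extension catches renamed files, such as a Word document saved with a `.pdf` suffix.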
Multi-page documents are supported. Parse processes them as a single unit, maintaining context across pages. A 20-page contract is parsed in one API call, and the extracted data spans all pages: for example, parties mentioned on page 1 and clauses from pages 5 through 18 are returned together in a single JSON response.
The AI recognizes tabular data, bulleted lists, numbered lists, and nested structures within documents. Define array fields in your schema, and Parse extracts each row or item as a structured object. This works for financial tables, clause lists, line items, and any repeating data pattern.
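To check that a response has the shape your schema asked for, a small recursive comparison is often enough. The sketch below assumes the notation of the contract example: objects as dictionaries, arrays as one-element lists describing each item, and `"string"` leaves. Accepting `None` for a field missing from the document is an assumption, and `matches_schema` is a hypothetical helper.

```python
def matches_schema(schema, data):
    """Return True if `data` has the shape described by `schema`."""
    if isinstance(schema, dict):
        # Objects must carry exactly the keys named in the schema.
        return (
            isinstance(data, dict)
            and set(data) == set(schema)
            and all(matches_schema(schema[key], data[key]) for key in schema)
        )
    if isinstance(schema, list):
        # A one-element list describes the shape of every item in the array.
        return isinstance(data, list) and all(
            matches_schema(schema[0], item) for item in data
        )
    if schema == "string":
        # Allowing None for fields absent from the document is an assumption.
        return data is None or isinstance(data, str)
    return False
```

Running this against `response["data"]` before storing it surfaces shape mismatches early, for example an array field that came back as a single string.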
Parse can extract printed and typed text with high accuracy. Handwritten text recognition depends on legibility — clear handwriting in forms and annotations is generally recognized well, while cursive or heavily stylized handwriting may have lower accuracy. For best results, ensure scanned images are at least 200 DPI.
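The 200 DPI guideline translates directly into pixel dimensions: pixels = inches × DPI. As a worked example, a US Letter page (8.5 × 11 inches) needs at least 1700 × 2200 pixels. The helper names below are hypothetical and only illustrate the arithmetic.

```python
import math


def min_pixels(width_in, height_in, dpi=200):
    """Smallest pixel dimensions that reach `dpi` for a page of this size."""
    return math.ceil(width_in * dpi), math.ceil(height_in * dpi)


def effective_dpi(width_px, height_px, width_in, height_in):
    """Resolution of a scan, limited by its weaker dimension."""
    return min(width_px / width_in, height_px / height_in)
```

For instance, a 1275 × 1650 pixel scan of a US Letter page works out to 150 DPI, which falls below the recommended minimum.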
Accuracy depends on document quality and complexity. For well-formatted business documents like contracts, reports, and forms, Parse achieves high extraction rates. The AI understands context and layout, so it handles varied formatting better than rule-based systems. You can improve accuracy by providing detailed schemas with field descriptions.
Get your API key and parse your first document in minutes. 100 pages per month free — no credit card required.