REST API · PDF to JSON · 100 free pages/mo

PDF Parsing API

Convert any PDF into structured JSON with a single API call. Send a digital PDF, a scanned document, or a multi-page contract and get clean, typed JSON back. No templates, no training, no OCR pipelines to manage.

Start Parsing PDFs Free · Read the Docs

One API Call to Parse Any PDF

Upload a PDF and define the fields you need. The API reads every page, understands the document structure, and returns structured JSON matching your schema.

request.sh

curl -X POST \
  https://api-parse.conversiontools.io/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@contract.pdf" \
  -F 'schema={
    "parties": [{
      "name": "string",
      "role": "string"
    }],
    "effective_date": "string",
    "termination_date": "string",
    "payment_terms": "string",
    "total_value": "number",
    "clauses": [{
      "title": "string",
      "summary": "string"
    }]
  }'

response.json

{
  "status": "completed",
  "pages": 12,
  "data": {
    "parties": [
      {
        "name": "Acme Corp",
        "role": "Service Provider"
      },
      {
        "name": "Globex Inc",
        "role": "Client"
      }
    ],
    "effective_date": "2026-01-15",
    "termination_date": "2027-01-14",
    "payment_terms": "Net 30",
    "total_value": 84000.00,
    "clauses": [
      {
        "title": "Confidentiality",
        "summary": "Both parties agree..."
      }
    ]
  }
}
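For Python users, the same request can be sketched with the standard library alone. This is a minimal sketch that mirrors the curl call above: the endpoint, the Bearer header, and the `file` and `schema` form fields come from that example, while the helper names `build_extract_request` and `extract` are illustrative and not part of any official SDK.

```python
import json
import os
import urllib.request
import uuid

# Endpoint taken from the curl example above.
API_URL = "https://api-parse.conversiontools.io/v1/extract"


def build_extract_request(pdf_bytes, filename, schema, api_key):
    """Build a multipart/form-data POST with 'schema' and 'file' parts (hypothetical helper)."""
    boundary = uuid.uuid4().hex
    # The schema travels as a plain form field holding JSON text.
    schema_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="schema"\r\n'
        "\r\n"
        f"{json.dumps(schema)}\r\n"
    ).encode("utf-8")
    # The PDF travels as a file part, as with curl's -F "file=@contract.pdf".
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=schema_part + file_header + pdf_bytes + closing,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract(pdf_path, schema, api_key):
    """Send the PDF and return the decoded JSON response (performs a network call)."""
    with open(pdf_path, "rb") as f:
        pdf_bytes = f.read()
    request = build_extract_request(pdf_bytes, os.path.basename(pdf_path), schema, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `extract("contract.pdf", schema, "YOUR_API_KEY")` with the schema from the request above should return a dictionary shaped like the sample response, assuming the API accepts the same multipart form that curl sends.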

How the PDF Parsing API Works

Three steps to go from a raw PDF to structured JSON. Works with any PDF — digital, scanned, single-page, or multi-page.

Upload Your PDF

Send any PDF to the API endpoint — invoices, contracts, reports, forms, or scanned documents. Multi-page PDFs are processed automatically.

AI Reads Every Page

The API reads all pages, applies OCR for scanned content, understands the layout and context, and extracts the fields defined in your schema.

Get Structured JSON

Receive clean, typed JSON matching your schema. Ready to store in a database, feed into a pipeline, or display in your application.
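Once the JSON arrives, it can be turned into typed objects before it goes into a database or pipeline. The sketch below assumes the response shape shown in the sample above and assumes dates come back as ISO 8601 strings, as they do in that sample. The `Party` and `Contract` classes and the `contract_from_response` function are illustrative names, not part of the API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Party:
    name: str
    role: str


@dataclass
class Contract:
    parties: list
    effective_date: date
    termination_date: date
    payment_terms: str
    total_value: float


def contract_from_response(response):
    """Convert a response dictionary into a Contract (hypothetical helper)."""
    # "completed" is the only status value shown in the sample response.
    if response.get("status") != "completed":
        raise ValueError(f"extraction not completed: {response.get('status')!r}")
    data = response["data"]
    return Contract(
        parties=[Party(name=p["name"], role=p["role"]) for p in data["parties"]],
        # Assumes ISO 8601 dates such as "2026-01-15", as in the sample.
        effective_date=date.fromisoformat(data["effective_date"]),
        termination_date=date.fromisoformat(data["termination_date"]),
        payment_terms=data["payment_terms"],
        total_value=float(data["total_value"]),
    )
```

Parsing the dates up front means later code can compare or subtract them directly instead of handling strings.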

Why Developers Choose Parse for PDF Parsing

Built for developers who need reliable PDF-to-JSON conversion without the complexity of traditional OCR and PDF parsing libraries.

OCR for Scanned PDFs

Automatically detects and applies OCR to scanned PDFs and image-based pages. Works with both digital and paper-origin PDFs without any configuration.

Simple REST API

One endpoint, one API call. Send a PDF, get JSON back. No SDKs required — works with curl, Python, Node.js, or any HTTP client.

Multi-Page Support

Processes every page of a PDF document and extracts data that spans pages, such as line items, tables, and sections that continue on subsequent pages.

Custom Schemas

Define exactly which fields to extract with a JSON schema. Supports strings, numbers, dates, arrays, and nested objects. One schema works across different PDF layouts.
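A client can check that returned data matches the schema it sent before trusting it downstream. The sketch below is a local check and not an API feature. It assumes the schema notation from the request example: the strings "string" and "number" for leaf values, a one-element list for arrays, and a dictionary for nested objects. The function name `conforms` is illustrative, and missing fields are treated as a mismatch, which may be stricter than the API itself.

```python
def conforms(value, shape):
    """Return True if value matches a shape written in the request example's notation."""
    if shape == "string":
        return isinstance(value, str)
    if shape == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if isinstance(shape, list) and len(shape) == 1:
        # A one-element list describes the shape of every array item.
        return isinstance(value, list) and all(conforms(v, shape[0]) for v in value)
    if isinstance(shape, dict):
        # Every key named in the schema must be present and match its sub-shape.
        return isinstance(value, dict) and all(
            key in value and conforms(value[key], sub) for key, sub in shape.items()
        )
    # Any other notation is unknown to this sketch.
    return False
```

Running `conforms(response["data"], schema)` after each call gives an early signal when a document produced an unexpected shape.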

Privacy & Security

PDF files are processed and deleted automatically. No document data is stored after extraction. EU-hosted infrastructure with encrypted connections.

Fast Response Times

Most single-page PDFs are processed in seconds. Synchronous and asynchronous modes are available, so you can choose based on document size and page count.
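This page does not describe how asynchronous results are retrieved, so the sketch below leaves that part to the caller. It shows only a generic wait loop with exponential backoff around a caller-supplied `fetch_result` function. The function name `wait_until_completed`, its parameters, and the idea of polling at all are assumptions. The only detail taken from this page is the "completed" status in the sample response.

```python
import time


def wait_until_completed(fetch_result, timeout=60.0, initial_delay=0.5,
                         max_delay=8.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch_result() until it reports status "completed" or the timeout passes.

    fetch_result is a hypothetical caller-supplied function that returns the
    latest response dictionary for a job; how it talks to the API is up to you.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = fetch_result()
        if result.get("status") == "completed":
            return result
        # Stop before sleeping past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError("document was not completed before the timeout")
        sleep(delay)
        # Double the wait each round, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

Doubling the delay keeps short jobs responsive while avoiding a flood of requests for long, many-page documents.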

PDF Parsing for Every Document Type

The same API works across all your PDF processing needs. Define a schema once, then parse data from thousands of PDF documents.

Data Extraction API

Extract structured data from any document format — PDFs, images, and scanned files.

Invoice Extraction

Extract vendor, amounts, line items, dates, and tax details from invoices automatically.

Receipt Parsing

Parse store names, totals, items, and payment methods from receipts and POS printouts.

Document Parsing

Extract fields from applications, surveys, tax forms, and government documents.

Frequently Asked Questions

Common questions about the PDF parsing API.

Can the PDF parsing API handle scanned PDFs?

Yes. Parse uses OCR to extract text from scanned PDFs and image-based documents. Whether your PDF was created digitally or scanned from paper, the API reads the content and returns structured JSON matching your schema.

How does the API handle multi-page PDF documents?

The API processes all pages of a PDF document automatically. You can parse multi-page contracts, reports, or statements and extract data that spans pages, such as line items that continue on the next page or summary sections at the end.

What languages does the PDF parsing API support?

Parse supports PDFs in any language. The AI model understands multilingual documents and can extract fields regardless of the language used. This includes Latin, Cyrillic, CJK characters, Arabic, and other scripts.

How accurate is the PDF to JSON extraction?

Parse uses large language models that understand document context and layout rather than relying on simple text pattern matching. Accuracy depends on document quality, but fields in most structured PDFs, such as invoices, receipts, and forms, are extracted with high accuracy. Scanned documents with clear text also perform well.

Is there a free tier for the PDF parsing API?

Yes. The free plan includes 100 pages per month with full API access, custom schemas, and OCR support. No credit card required to start. For higher volumes, the Pro plan supports 5,000 pages per month with priority processing.

Start Parsing PDFs Today

Get your API key and parse your first PDF in minutes. 100 pages per month free — no credit card required.

Get Started Free · API Reference