Invoice Extraction

Extract vendor information, line items, totals, and dates from invoices. Parse handles a wide range of invoice formats, including PDFs, scanned images, and multi-page documents.

Invoice Schema

Define a schema that captures the key fields from an invoice:

{
  "name": "invoice",
  "description": "Extract data from invoices",
  "fields": [
    { "name": "vendor_name", "type": "string", "description": "Company name of the vendor" },
    { "name": "vendor_address", "type": "string", "description": "Address of the vendor" },
    { "name": "invoice_number", "type": "string", "description": "Invoice reference number" },
    { "name": "date", "type": "date", "description": "Invoice issue date" },
    { "name": "due_date", "type": "date", "description": "Payment due date" },
    { "name": "line_items", "type": "array", "description": "List of billed items", "items": {
      "type": "object",
      "fields": [
        { "name": "description", "type": "string" },
        { "name": "quantity", "type": "number" },
        { "name": "unit_price", "type": "number" },
        { "name": "amount", "type": "number" }
      ]
    }},
    { "name": "subtotal", "type": "number", "description": "Subtotal before tax" },
    { "name": "tax", "type": "number", "description": "Tax amount" },
    { "name": "total", "type": "number", "description": "Total amount due" },
    { "name": "currency", "type": "string", "description": "Currency code (e.g. USD, EUR)" }
  ]
}
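Once a result comes back, it can be useful to confirm locally that every field declared in the schema is present and has the expected type. The following is a minimal sketch, not part of the Parse API: the helper names `matches_type` and `check_fields` are hypothetical, the type names are the ones used in the schema above, and dates are assumed to be ISO 8601 strings (YYYY-MM-DD), as in the sample output.

```python
from datetime import date


def matches_type(value, field):
    """Check one value against a schema field definition (hypothetical helper)."""
    kind = field["type"]
    if kind == "string":
        return isinstance(value, str)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "date":
        # Assumption: dates are ISO 8601 strings such as "2024-01-15"
        try:
            date.fromisoformat(value)
            return True
        except (TypeError, ValueError):
            return False
    if kind == "array":
        return isinstance(value, list) and all(
            matches_type(item, field["items"]) for item in value
        )
    if kind == "object":
        return isinstance(value, dict) and not check_fields(value, field["fields"])
    return False


def check_fields(data, fields):
    """Return the names of fields that are missing or have the wrong type."""
    return [
        f["name"]
        for f in fields
        if f["name"] not in data or not matches_type(data[f["name"]], f)
    ]
```

Calling `check_fields(result["data"], schema["fields"])` returns an empty list when every declared field is present with a matching type; any names it returns are candidates for manual review.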

Code Examples

cURL

curl -X POST https://api.parse.conversiontools.io/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "schema=invoice"

Python

import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY"
}

# Upload the invoice as multipart form data, along with the schema name
with open("invoice.pdf", "rb") as f:
    response = requests.post(
        "https://api.parse.conversiontools.io/v1/extract",
        headers=headers,
        files={"file": f},
        data={"schema": "invoice"}
    )

# Stop here on an HTTP error instead of failing later on a missing key
response.raise_for_status()

data = response.json()
print(f"Vendor: {data['data']['vendor_name']}")
print(f"Total: {data['data']['currency']} {data['data']['total']}")

for item in data["data"]["line_items"]:
    print(f"  - {item['description']}: {item['amount']}")

Node.js

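const fs = require("fs");

// Requires Node.js 18+ for the built-in fetch, FormData, and Blob
async function main() {
  // The built-in FormData expects a Blob, not a read stream
  const form = new FormData();
  form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
  form.append("schema", "invoice");

  const response = await fetch("https://api.parse.conversiontools.io/v1/extract", {
    method: "POST",
    headers: {
      "Authorization": "Bearer YOUR_API_KEY",
    },
    body: form,
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const { data } = await response.json();
  console.log(`Vendor: ${data.vendor_name}`);
  console.log(`Total: ${data.currency} ${data.total}`);

  data.line_items.forEach((item) => {
    console.log(`  - ${item.description}: ${item.amount}`);
  });
}

// Top-level await is not available in CommonJS, so wrap the call
main().catch(console.error);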

Sample Output

{
  "success": true,
  "id": "ext_inv_001",
  "data": {
    "vendor_name": "TechSupply Inc.",
    "vendor_address": "123 Business Ave, Suite 400, San Francisco, CA 94105",
    "invoice_number": "INV-2024-0847",
    "date": "2024-01-15",
    "due_date": "2024-02-14",
    "line_items": [
      {
        "description": "Cloud Hosting - Standard Plan",
        "quantity": 1,
        "unit_price": 299.00,
        "amount": 299.00
      },
      {
        "description": "SSL Certificate - Wildcard",
        "quantity": 2,
        "unit_price": 49.99,
        "amount": 99.98
      },
      {
        "description": "Technical Support - Premium",
        "quantity": 1,
        "unit_price": 150.00,
        "amount": 150.00
      }
    ],
    "subtotal": 548.98,
    "tax": 49.41,
    "total": 598.39,
    "currency": "USD"
  },
  "pages_used": 1,
  "confidence": 0.97
}
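The amounts in the sample output are internally consistent: each line item's amount equals quantity times unit price, the line items sum to the subtotal, and subtotal plus tax equals the total. A quick arithmetic check like the sketch below can flag extractions that deserve a second look. The function names are hypothetical, and the check assumes two-decimal currencies and no discounts or shipping charges, so a mismatch is a prompt for review rather than proof of an extraction error.

```python
from decimal import Decimal


def to_money(value):
    # Convert via str so 99.98 becomes Decimal("99.98") rather than a binary-float expansion
    return Decimal(str(value)).quantize(Decimal("0.01"))


def check_totals(data):
    """Return a list of inconsistencies found in the extracted amounts (empty if none)."""
    problems = []
    for item in data["line_items"]:
        expected = to_money(
            Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"]))
        )
        if expected != to_money(item["amount"]):
            problems.append(f"line item '{item['description']}': expected {expected}")

    items_sum = sum((to_money(i["amount"]) for i in data["line_items"]), Decimal("0"))
    if items_sum != to_money(data["subtotal"]):
        problems.append(f"subtotal: line items sum to {items_sum}")

    if to_money(data["subtotal"]) + to_money(data["tax"]) != to_money(data["total"]):
        problems.append("total: subtotal + tax does not match total")
    return problems
```

Run against the `data` object from the sample output, `check_totals` returns an empty list. `Decimal` is used instead of floats so that sums such as 299.00 + 99.98 + 150.00 compare exactly.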

Tips for Invoice Extraction

  • Multi-page invoices are supported: all pages are processed and the data is combined automatically.
  • Currency is detected automatically; include a currency field in your schema, as in the example above, to capture it explicitly.
  • Use the array type for line items to handle invoices with varying numbers of items.
  • For scanned invoices, ensure the image is at least 150 DPI for best results.
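
To apply the 150 DPI guideline before uploading, the resolution of a scan can be estimated from its pixel dimensions and the physical page size. This is a rough sketch with hypothetical helper names; it assumes the scan covers a full page and defaults to US Letter (8.5 × 11 in), so pass other dimensions for A4 or cropped scans.

```python
MIN_DPI = 150  # threshold taken from the tip above


def effective_dpi(width_px, height_px, page_width_in=8.5, page_height_in=11.0):
    """Estimate scan resolution from pixel size and physical page size in inches."""
    # The lower of the two axes is the limiting resolution
    return min(width_px / page_width_in, height_px / page_height_in)


def is_scan_sharp_enough(width_px, height_px, **page_size):
    """Return True if the estimated resolution meets the 150 DPI guideline."""
    return effective_dpi(width_px, height_px, **page_size) >= MIN_DPI
```

For example, a full-page US Letter scan of 1275 × 1650 pixels works out to 150 DPI and passes, while 850 × 1100 pixels is about 100 DPI and would be worth rescanning.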