Invoice Extraction
Extract vendor information, line items, totals, and dates from invoices. Parse handles a wide range of invoice formats including PDFs, scanned images, and multi-page documents.
Invoice Schema
Define a schema that captures the key fields from an invoice:
{
"name": "invoice",
"description": "Extract data from invoices",
"fields": [
{ "name": "vendor_name", "type": "string", "description": "Company name of the vendor" },
{ "name": "vendor_address", "type": "string", "description": "Address of the vendor" },
{ "name": "invoice_number", "type": "string", "description": "Invoice reference number" },
{ "name": "date", "type": "date", "description": "Invoice issue date" },
{ "name": "due_date", "type": "date", "description": "Payment due date" },
{ "name": "line_items", "type": "array", "description": "List of billed items", "items": {
"type": "object",
"fields": [
{ "name": "description", "type": "string" },
{ "name": "quantity", "type": "number" },
{ "name": "unit_price", "type": "number" },
{ "name": "amount", "type": "number" }
]
}},
{ "name": "subtotal", "type": "number", "description": "Subtotal before tax" },
{ "name": "tax", "type": "number", "description": "Tax amount" },
{ "name": "total", "type": "number", "description": "Total amount due" },
{ "name": "currency", "type": "string", "description": "Currency code (e.g. USD, EUR)" }
]
}Code Examples
cURL
curl -X POST https://api.parse.conversiontools.io/v1/extract \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@invoice.pdf" \
-F "schema=invoice"Python
import requests
headers = {
"Authorization": "Bearer YOUR_API_KEY"
}
with open("invoice.pdf", "rb") as f:
response = requests.post(
"https://api.parse.conversiontools.io/v1/extract",
headers=headers,
files={"file": f},
data={"schema": "invoice"}
)
data = response.json()
print(f"Vendor: {data['data']['vendor_name']}")
print(f"Total: {data['data']['currency']} {data['data']['total']}")
for item in data["data"]["line_items"]:
print(f" - {item['description']}: {item['amount']}")Node.js
const fs = require("fs");
const form = new FormData();
form.append("file", fs.createReadStream("invoice.pdf"));
form.append("schema", "invoice");
const response = await fetch("https://api.parse.conversiontools.io/v1/extract", {
method: "POST",
headers: {
"Authorization": "Bearer YOUR_API_KEY",
},
body: form,
});
const { data } = await response.json();
console.log(`Vendor: ${data.vendor_name}`);
console.log(`Total: ${data.currency} ${data.total}`);
data.line_items.forEach((item) => {
console.log(` - ${item.description}: ${item.amount}`);
});Sample Output
{
"success": true,
"id": "ext_inv_001",
"data": {
"vendor_name": "TechSupply Inc.",
"vendor_address": "123 Business Ave, Suite 400, San Francisco, CA 94105",
"invoice_number": "INV-2024-0847",
"date": "2024-01-15",
"due_date": "2024-02-14",
"line_items": [
{
"description": "Cloud Hosting - Standard Plan",
"quantity": 1,
"unit_price": 299.00,
"amount": 299.00
},
{
"description": "SSL Certificate - Wildcard",
"quantity": 2,
"unit_price": 49.99,
"amount": 99.98
},
{
"description": "Technical Support - Premium",
"quantity": 1,
"unit_price": 150.00,
"amount": 150.00
}
],
"subtotal": 548.98,
"tax": 49.41,
"total": 598.39,
"currency": "USD"
},
"pages_used": 1,
"confidence": 0.97
}Tips for Invoice Extraction
- Multi-page invoices are supported — all pages are processed and data is combined automatically
- Currency is detected automatically; add a currency field to your schema to capture it explicitly
- Use the array type for line items to handle invoices with varying numbers of items
- For scanned invoices, ensure the image is at least 150 DPI for best results