Invoice Extraction
Extract vendor information, line items, totals, and dates from invoices. Parse handles a wide range of invoice formats including PDFs, scanned images, and multi-page documents.
Invoice Schema
Define a schema that captures the key fields from an invoice:
{"name": "invoice","description": "Extract data from invoices","fields": [{ "name": "vendor_name", "type": "string", "description": "Company name of the vendor" },{ "name": "vendor_address", "type": "string", "description": "Address of the vendor" },{ "name": "invoice_number", "type": "string", "description": "Invoice reference number" },{ "name": "date", "type": "date", "description": "Invoice issue date" },{ "name": "due_date", "type": "date", "description": "Payment due date" },{ "name": "line_items", "type": "array", "description": "List of billed items", "items": {"type": "object","fields": [{ "name": "description", "type": "string" },{ "name": "quantity", "type": "number" },{ "name": "unit_price", "type": "number" },{ "name": "amount", "type": "number" }]}},{ "name": "subtotal", "type": "number", "description": "Subtotal before tax" },{ "name": "tax", "type": "number", "description": "Tax amount" },{ "name": "total", "type": "number", "description": "Total amount due" },{ "name": "currency", "type": "string", "description": "Currency code (e.g. USD, EUR)" }]}
Code Examples
cURL
curl -X POST https://api-parse.conversiontools.io/v1/extract \-H "Authorization: Bearer YOUR_API_KEY" \-F "file=@invoice.pdf" \-F "schema_id=YOUR_INVOICE_SCHEMA_ID"
Python
import requestsAPI_KEY = "YOUR_API_KEY"headers = {"Authorization": f"Bearer {API_KEY}"}with open("invoice.pdf", "rb") as f:response = requests.post("https://api-parse.conversiontools.io/v1/extract",headers=headers,files={"file": f},data={"schema_id": "YOUR_INVOICE_SCHEMA_ID"},)data = response.json()print(f"Vendor: {data['data']['vendor_name']}")print(f"Total: {data['data']['currency']} {data['data']['total']}")for item in data["data"]["line_items"]:print(f" - {item['description']}: {item['amount']}")
Node.js
const fs = require("fs");const API_KEY = "YOUR_API_KEY";const headers = { Authorization: `Bearer ${API_KEY}` };const form = new FormData();form.append("file", fs.createReadStream("invoice.pdf"));form.append("schema_id", "YOUR_INVOICE_SCHEMA_ID");const response = await fetch("https://api-parse.conversiontools.io/v1/extract", {method: "POST",headers,body: form,});const data = await response.json();console.log(`Vendor: ${data.data.vendor_name}`);console.log(`Total: ${data.data.currency} ${data.data.total}`);data.data.line_items.forEach((item) => {console.log(` - ${item.description}: ${item.amount}`);});
Sample Output
{"success": true,"id": "ext_inv_001","data": {"vendor_name": "TechSupply Inc.","vendor_address": "123 Business Ave, Suite 400, San Francisco, CA 94105","invoice_number": "INV-2024-0847","date": "2024-01-15","due_date": "2024-02-14","line_items": [{"description": "Cloud Hosting - Standard Plan","quantity": 1,"unit_price": 299.00,"amount": 299.00},{"description": "SSL Certificate - Wildcard","quantity": 2,"unit_price": 49.99,"amount": 99.98},{"description": "Technical Support - Premium","quantity": 1,"unit_price": 150.00,"amount": 150.00}],"subtotal": 548.98,"tax": 49.41,"total": 598.39,"currency": "USD"},"pages_used": 1,"confidence": 0.97}
Tips for Invoice Extraction
- Multi-page invoices are supported — all pages are processed and data is combined automatically
- Currency is detected automatically; add a currency field to your schema to capture it explicitly
- Use the array type for line items to handle invoices with varying numbers of items
- For scanned invoices, ensure the image is at least 150 DPI for best results