DocParse turns PDFs, images, and DOCX files into clean JSON. Define the fields you need, upload the file, get structured data back—in any language, no templates required.
{
"vendor": "Acme Supply Co.",
"invoice_no": "INV-20294",
"bill_to": "Northwind Logistics",
"date": "2026-04-14",
"line_items": [
[ "Steel pallet", 12, 840.00 ],
[ "Strapping", 4, 162.00 ]
],
"total": 1842.50,
"currency": "USD"
}
Tell DocParse which fields you need. It reads the document and returns them as clean JSON, ready for your database, webhook, or Zap.
{
"vendor": "Acme Supply Co.",
"invoice_no": "INV-20294",
"bill_to": "Northwind Logistics",
"date": "2026-04-14",
"total": "$1,842.50"
}
You decide what to pull out: vendor names, totals, dates, line items, anything. DocParse fills your schema in seconds, with no templates, no training, and no fine-tuning.
DocParse reads layouts it has never seen before. You don't train it. You don't template it. You give it a document and tell it which fields you want.
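To make the field list concrete, here is a minimal Python sketch of how a caller might check a returned record against the fields it asked for. The field types (`string`, `number`, `date`) come from the API example on this page. The `check_record` helper and the assumption that dates come back as ISO 8601 strings such as `2026-04-14` are illustrative and not part of DocParse.

```python
from datetime import date

# Field definitions in the same shape as the createExtraction request body.
FIELDS = [
    {"name": "vendor", "type": "string"},
    {"name": "total", "type": "number"},
    {"name": "due_date", "type": "date"},
]


def _is_iso_date(value):
    """Return True if value is a string like '2026-04-14' (assumed date format)."""
    if not isinstance(value, str):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


# Hypothetical per-type checks; bool is excluded from "number" on purpose.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "date": _is_iso_date,
}


def check_record(record, fields=FIELDS):
    """Return a list of problems found in record; an empty list means it matches."""
    problems = []
    for field in fields:
        name, kind = field["name"], field["type"]
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not CHECKS[kind](record[name]):
            problems.append(f"{name}: expected {kind}, got {record[name]!r}")
    return problems
```

A check like this catches the common mismatch where an amount arrives as a formatted string such as `"$1,842.50"` while the field was declared as a number.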
PDFs, JPGs, PNGs, WEBP, and DOCX—up to 25 MB per file. Scans, photos, and digitally generated documents all go through the same pipeline.
Plain HTTP. Bearer-token auth with revocable keys. Signed outbound webhooks for async pipelines, and a native Zapier app so non-developers can ship too.
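Signed webhooks only help if the receiver checks the signature. The Python sketch below shows one common scheme: HMAC-SHA256 over the raw request body, with a hex-encoded digest. This page does not specify DocParse's signing algorithm, header name, or encoding, so all of those, along with the `verify_webhook` helper, are assumptions to confirm against the API reference.

```python
import hashlib
import hmac


def verify_webhook(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 hex signature over the raw body (assumed scheme)."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))
```

Whatever the actual scheme is, verify against the raw bytes of the request body as received, not a re-serialized copy of the parsed JSON, since re-serialization can change whitespace and key order.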
# 1. Create an extraction with the fields you want
curl https://api.docparse.io/api/v1/createExtraction \
-H "Authorization: Bearer $DOCPARSE_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Invoices",
"template": "invoice",
"fields": [
{ "name": "vendor", "type": "string" },
{ "name": "total", "type": "number" },
{ "name": "due_date", "type": "date" }
]
}'
# 2. Upload a file to it
curl https://api.docparse.io/api/v1/uploadFiles \
-H "Authorization: Bearer $DOCPARSE_KEY" \
-F "extractionId=ext_..." \
-F "files=@invoice.pdf"
Your documents stay yours. TLS everywhere, hashed API keys, signed webhooks, and a processing pipeline that never retains your data for model training.
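The first curl call above maps onto any HTTP client. As a rough Python equivalent using only the standard library, the sketch below builds the `createExtraction` request. The URL, headers, and body mirror the curl example; the `build_create_extraction` function name is made up for illustration. The response shape is not documented on this page, so the sketch stops at constructing the request.

```python
import json
import urllib.request

API_BASE = "https://api.docparse.io/api/v1"


def build_create_extraction(api_key, name, template, fields):
    """Build (but do not send) the POST request from step 1 of the curl example."""
    body = {"name": name, "template": template, "fields": fields}
    return urllib.request.Request(
        f"{API_BASE}/createExtraction",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_extraction(
    "YOUR_API_KEY",  # placeholder; use your own key
    "Invoices",
    "invoice",
    [
        {"name": "vendor", "type": "string"},
        {"name": "total", "type": "number"},
        {"name": "due_date", "type": "date"},
    ],
)
# To send it: urllib.request.urlopen(request), then read the response body.
```

The upload step uses multipart form data, which the standard library does not encode for you, so a third-party HTTP client is the usual choice there.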
100 pages free, every month. No credit card. Pay in USD or INR when you need more.