Receipt extraction

Every receipt. Categorized and reconciled.

Photo, scan, email PDF — DocParse pulls merchant, amount, tax, tip, items, and category. Wired into expense tools or your finance database.

Start extracting · See the API

100 pages free · No card · SOC 2 in progress

app.docparse.io / extractions / starbucks-04142026.jpg

Live

STARBUCKS

Mar 14, 2026 · 9:14 AM

Merchant: Starbucks #4421
Date: March 14, 2026
Subtotal: $11.40
Tax: $0.91
Tip: $2.30
Total: $14.61

JSON · CSV · Webhook · 98.6% confidence

{
  "merchant": "Starbucks #4421",
  "date": "2026-03-14",
  "category": "meals_travel",
  "subtotal": 11.40,
  "tax": 0.91,
  "tip": 2.30,
  "total": 14.61,
  "currency": "USD"
}

Extracted in 2.4s · 8 fields

Pipe extracted data into any of these — via Zapier or signed webhooks

Zapier

Google Drive

Gmail

Slack

Sheets

Notion

Airtable

Webhook

Why DocParse

Three reasons teams switch to us

Reads any receipt format

Photos, crumpled scans, email PDFs, hotel folios. Tax, tip, and items broken out separately every time.

99.6% merchant accuracy

Auto-categorized

Every receipt tagged: meals, travel, supplies, software, mileage. Maps to your GL codes or to common expense policies.

180+ category rules built in

Catches duplicates

A content hash plus a merchant-time-amount fingerprint catches resubmits and accidental double uploads before they clear.

0.02% duplicates slip through
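To make the two-signal idea concrete, here is a minimal sketch of duplicate detection that combines a content hash with a merchant-time-amount fingerprint. It is an illustration under stated assumptions, not DocParse's internal implementation: the normalization rules, the minute-level time rounding, the in-memory `DuplicateIndex` class, and the 365-day default window (chosen to mirror the 12-month lookback mentioned in the FAQ) are all hypothetical.

```python
import hashlib
from datetime import datetime, timedelta


def content_hash(file_bytes: bytes) -> str:
    """Hash of the raw upload; catches byte-identical resubmits."""
    return "c:" + hashlib.sha256(file_bytes).hexdigest()


def fingerprint(merchant: str, purchased_at: datetime, total_cents: int) -> str:
    """Hash of normalized merchant, purchase minute, and amount.

    Catches a second photo of the same receipt, where the bytes differ
    but the extracted fields match. Normalization here is an assumption.
    """
    normalized = " ".join(merchant.lower().split())
    minute = purchased_at.replace(second=0, microsecond=0).isoformat()
    raw = f"{normalized}|{minute}|{total_cents}"
    return "f:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()


class DuplicateIndex:
    """Hypothetical in-memory index; a real system would use a database."""

    def __init__(self, lookback_days: int = 365):
        self.lookback = timedelta(days=lookback_days)
        self._last_seen: dict[str, datetime] = {}

    def check_and_add(self, file_bytes: bytes, merchant: str,
                      purchased_at: datetime, total_cents: int,
                      submitted_at: datetime) -> bool:
        """Return True if either signal was seen inside the lookback window."""
        keys = (content_hash(file_bytes),
                fingerprint(merchant, purchased_at, total_cents))
        cutoff = submitted_at - self.lookback
        is_duplicate = any(
            key in self._last_seen and self._last_seen[key] >= cutoff
            for key in keys
        )
        for key in keys:
            self._last_seen[key] = submitted_at
        return is_duplicate
```

Amounts are passed as integer cents so that the fingerprint does not depend on floating-point formatting.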

How it works

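From raw receipts
to structured data, in four steps.

1. Define the fields
2. Upload the file: drop a document, paste a URL, or POST a file. Supported formats: PDF, PNG, JPG, TIFF, DOCX, HEIC, HTML, EML, XLSX.
3. Multi-modal AI reads it
4. Get structured data back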

The schema

Starter schema for receipts.
Tweakable in seconds.

The receipts template comes with a 10-field starter schema based on the most common fields teams pull from receipts. Add your own fields, mark which are required, and change types in the dashboard or via the REST API.

Receipts · default schema

Field          Type    Required  Accuracy
merchant       string  required  99.6%
date           string  required  99.4%
subtotal       number  required  99.5%
tax            number  optional  99.0%
tip            number  optional  98.4%
total          number  required  99.7%
currency       string  required  99.9%
category       string  required  98.2%
items          array   optional  96.8%
payment_last4  string  optional  99.1%

JSON Schema · TypeScript · Python

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Receipts",
  "type": "object",
  "required": [
    "merchant",
    "date",
    "subtotal",
    "total",
    "currency",
    "category"
  ],
  "properties": {
    "merchant": {
      "type": "string"
    },
    "date": {
      "type": "string"
    },
    "subtotal": {
      "type": "number"
    },
    "tax": {
      "type": "number"
    },
    "tip": {
      "type": "number"
    },
    "total": {
      "type": "number"
    },
    "currency": {
      "type": "string"
    },
    "category": {
      "type": "string"
    },
    "items": {
      "type": "array"
    },
    "payment_last4": {
      "type": "string"
    }
  }
}
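If you want to sanity-check extracted payloads against this starter schema on your side, a small hand-rolled check is enough for the flat structure above. The sketch below is not DocParse's generated Python output; the `validate_receipt` helper and its error strings are hypothetical, and it only covers the `required` list and the three primitive types the starter schema uses.

```python
# Starter schema, reduced to the parts this check uses.
RECEIPT_SCHEMA = {
    "required": ["merchant", "date", "subtotal", "total", "currency", "category"],
    "properties": {
        "merchant": {"type": "string"},
        "date": {"type": "string"},
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "tip": {"type": "number"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
        "category": {"type": "string"},
        "items": {"type": "array"},
        "payment_last4": {"type": "string"},
    },
}

# JSON Schema type names mapped to Python types.
_JSON_TYPES = {"string": str, "number": (int, float), "array": list}


def validate_receipt(data: dict, schema: dict = RECEIPT_SCHEMA) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name in schema["required"]:
        if name not in data:
            errors.append(f"missing required field: {name}")
    for name, value in data.items():
        spec = schema["properties"].get(name)
        if spec is None:
            continue  # unknown fields are allowed, as in JSON Schema's default
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python but is not a JSON number
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{name}: expected {spec['type']}, got {type(value).__name__}"
            )
    return errors
```

For schemas with nested objects or formats, a full JSON Schema validator library would be the better choice.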

What to expect

Accuracy, field by field.

Multi-modal models do the reading, and accuracy depends on document quality. The numbers below are illustrative ranges we've seen on receipts — run your own documents and compare against a small ground-truth set before you scale.

99.1% illustrative field-level accuracy ceiling

10 starter fields

Any language supported

25 MB max file size

Field     Accuracy
merchant  99.6%
date      99.4%
total     99.7%
subtotal  99.5%
tax       99.0%
tip       98.4%
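The advice above, to compare against a small ground-truth set before you scale, can be sketched as a per-field exact-match score. This is a minimal illustration: the `field_accuracy` helper is hypothetical, exact equality is an assumption (you may prefer tolerant matching for merchant names or dates), and the inputs are plain dictionaries you have labeled yourself.

```python
from collections import defaultdict


def field_accuracy(predictions: list[dict], ground_truth: list[dict],
                   fields: list[str]) -> dict[str, float]:
    """Share of documents where each extracted field exactly matches the label.

    A field is only scored on documents where the ground truth labels it,
    and fields with no labels at all are left out of the result.
    """
    correct = defaultdict(int)
    counted = defaultdict(int)
    for predicted, truth in zip(predictions, ground_truth):
        for field in fields:
            if field not in truth:
                continue
            counted[field] += 1
            if predicted.get(field) == truth[field]:
                correct[field] += 1
    return {f: correct[f] / counted[f] for f in fields if counted[f]}
```

Running this on even a few dozen hand-labeled receipts gives you numbers for your own document mix rather than the illustrative ranges in the table.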

One endpoint.
Every output you need.

# Extract with one POST
curl -X POST "https://api.docparse.io/v1/receipts" \
  -H "Authorization: Bearer $DOCPARSE_KEY" \
  -F file=@"starbucks-04142026.jpg" \
  -F schema="receipt" \
  -F webhook="https://api.acme.co/incoming"

# Returns:
{
  "status": "complete",
  "confidence": 0.987,
  "latency_ms": 2412,
  "data": { ... }
}
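Once the response comes back, a common next step is deciding whether to accept it automatically or send it to a person. The sketch below shows one way to do that with the `status` and `confidence` fields from the response above, plus an arithmetic check that subtotal, tax, and tip add up to the total. The `triage` function, the 0.95 threshold, and the "pending" / "review" / "accept" labels are assumptions for illustration, not part of the DocParse API.

```python
import json
from decimal import Decimal


def triage(response_body: str, min_confidence: float = 0.95) -> str:
    """Classify an extraction response as 'pending', 'review', or 'accept'."""
    # Parse money as Decimal so 11.40 + 0.91 + 2.30 compares exactly to 14.61.
    response = json.loads(response_body, parse_float=Decimal)
    if response.get("status") != "complete":
        return "pending"
    if float(response.get("confidence", 0)) < min_confidence:
        return "review"
    data = response.get("data", {})
    parts = sum(data.get(key, Decimal("0")) for key in ("subtotal", "tax", "tip"))
    if "total" in data and parts != data["total"]:
        return "review"  # line amounts do not reconcile with the total
    return "accept"
```

Tune the threshold against your own ground-truth set; a stricter value sends more receipts to manual review.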

Plain HTTP, no SDK lock-in

Bearer-token auth with revocable, SHA-256-hashed API keys. Call it from any language that can hit a REST endpoint — we publish docs and copy-pasteable snippets, not opinionated wrappers.

cURL · Python · Node.js · Go · Ruby · PHP · Java · .NET

Signed webhooks for async

Register an endpoint, set the events, and we POST signed deliveries (HMAC-SHA256, Standard Webhooks spec) as extractions finish. Every attempt is logged in the dashboard with response code, body, and timing.

Webhook delivery log · per-endpoint retries

The alternatives

Why teams switch from regex.

A look at how DocParse compares to the three things you've probably already tried.

Regex + scripts

Manual review (BPO)

Veryfi / Mindee

DocParse

Works on a layout it has never seen

partial

Handles handwriting and scans

partial

Custom fields without per-vendor setup

Multi-lingual out of the box

partial

REST API + signed webhooks + Zapier

partial

Pricing scales with pages, not seats

partial

Free tier — 100 pages on signup

partial

Time-to-first-extraction

Days

Weeks

5 minutes

Where the data goes

Reach the tools you already run.

DocParse ships two integration surfaces directly — REST API and signed webhooks — plus a native Zapier app that opens up everything else.

Zapier (Automation) · Webhooks (API) · REST API (API) · JSON export (Export) · CSV export (Export)

Via Zapier: Google Drive, Google Sheets, Gmail, Outlook, Slack, Dropbox, Airtable, Notion, HubSpot, Salesforce

Via webhooks: Make.com, n8n, Postgres

REST API · Signed webhooks (HMAC-SHA256) · Zapier to 6,000+ apps · JSON / CSV export

Common patterns

How teams use DocParse for receipts.

Illustrative scenarios drawn from teams piloting DocParse — names and figures are examples, not customer quotes.

“Veryfi miscategorized 12% of our receipts. DocParse is at 1.8%. Our finance team stopped doing the weekly cleanup pass entirely.”

Cole Patterson, Director, Finance · Tidemark

−10 pts miscategorization rate

“Photos in any orientation, lighting, or lens distortion: it just works. Field travelers no longer get rejected expenses.”

Beatriz Almeida, Travel Ops · Northwave

0 field rejections in Q1

“We integrated DocParse into our internal expense tool in an afternoon. The duplicate detection alone caught $48k of resubmits in year one.”

Henrik Olsen, CFO · Quartile

$48k duplicates caught in year one

Frequently asked

The questions teams ask before they sign up.

How well does it work on phone photos?

Trained on millions of phone-camera receipts. Handles glare, blur, rotation, partial occlusion, and crumpling across up to about 30% of the surface area.

Does it handle non-USD currency?

Yes. Currency is auto-detected from the symbol, ISO code, and merchant location. Amounts can optionally be converted to a base currency at the FX rate on the receipt date.

Can it pull individual line items?

Yes, optionally. Setting the schema flag "items: true" returns each row with description, quantity, unit price, and category.

Does it detect duplicates?

Yes. Content-hash and a merchant+time+amount fingerprint catch resubmits across the prior 12 months.

Stop chasing receipts.

Photo in, structured data out. Auto-categorized, deduped, and ready for your expense tool or GL.

Start extracting · Talk to sales

Free for first 100 pages · 5-minute setup · No credit card
