Salesforce Einstein got intent right 78% of the time. DocParse hits 97%, and it auto-extracts attachments. Our CSAT jumped four points.
Every email thread. Boiled down to JSON.
Customer support, order confirmations, intake forms, B2B inquiries. DocParse pulls the entities, intent, and action items from any email — including attachments.
{
"thread_id": "th_44821",
"customer": "Acme Industries",
"order_no": "PR-1148",
"intent": "status_request",
"sentiment": "resolved",
"action_items": ["dispatch_eta","notify_customer"],
"attachments": [ 2 ]
}
Three reasons teams switch to us
Threads, not just messages
DocParse stitches an email thread, including forwarded snippets and quoted replies, into one structured record.
Entities, intent, and sentiment
Order numbers, customer names, action items, and sentiment all extracted with confidence and source span.
Attachments handled inline
PDFs, images, and screenshots in attachments are extracted and merged with the email content into one structured record.
From raw emails
to structured data, in four steps.
Starter schema for emails.
Tweakable in seconds.
The emails template comes with a 10-field starter schema based on the most common fields teams pull from emails. Add your own fields, mark which are required, and change types in the dashboard or via the REST API.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Emails",
"type": "object",
"required": [
"thread_id",
"subject",
"participants",
"customer",
"intent",
"entities",
"language"
],
"properties": {
"thread_id": {
"type": "string"
},
"subject": {
"type": "string"
},
"participants": {
"type": "array"
},
"customer": {
"type": "string"
},
"intent": {
"type": "string"
},
"sentiment": {
"type": "string"
},
"entities": {
"type": "array"
},
"action_items": {
"type": "array"
},
"attachments": {
"type": "array"
},
"language": {
"type": "string"
}
}
}
Field-level accuracy.
Multi-modal models do the reading, and accuracy depends on document quality. The numbers below are illustrative ranges we've seen on emails — run your own documents and compare against a small ground-truth set before you scale.
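One way to run the ground-truth comparison suggested above is to score each field separately. The sketch below assumes extractions and hand-labelled records are plain dictionaries keyed by field name; the helper `field_accuracy` and the sample records are hypothetical and not part of DocParse.

```python
from collections import defaultdict


def field_accuracy(predictions, ground_truth, fields):
    """Return {field: share of documents where the extraction equals the label}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, truth in zip(predictions, ground_truth):
        for field in fields:
            if field not in truth:
                continue  # field was not labelled for this document
            total[field] += 1
            if pred.get(field) == truth[field]:
                correct[field] += 1
    # Only report fields that had at least one labelled example.
    return {f: correct[f] / total[f] for f in fields if total[f]}


# Hypothetical hand-labelled records and the matching extractions.
truth = [
    {"intent": "status_request", "customer": "Acme Industries"},
    {"intent": "refund_request", "customer": "Globex"},
]
preds = [
    {"intent": "status_request", "customer": "Acme Industries"},
    {"intent": "status_request", "customer": "Globex"},
]

scores = field_accuracy(preds, truth, ["intent", "customer"])
print(scores)
```

Exact-match scoring is the strictest choice. For free-text fields you may prefer to normalize case and whitespace before comparing.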
One endpoint.
Every output you need.
# Extract with one POST
curl -X POST "https://api.docparse.io/v1/emails" \
-H "Authorization: Bearer $DOCPARSE_KEY" \
-F file=@"support-ticket-44821.eml" \
-F schema="email-thread" \
-F webhook="https://api.acme.co/incoming"
# Returns:
{
"status": "complete",
"confidence": 0.987,
"latency_ms": 2412,
"data": { ... }
}
Plain HTTP, no SDK lock-in
Bearer-token auth with revocable, SHA-256-hashed API keys. Call it from any language that can hit a REST endpoint — we publish docs and copy-pasteable snippets, not opinionated wrappers.
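As a rough illustration of calling the endpoint without an SDK, the sketch below rebuilds the curl request using only the Python standard library. The URL and the form field names (`file`, `schema`, `webhook`) are copied from the curl example; the helper `build_request`, the sample message bytes, and the `application/octet-stream` part type are assumptions for illustration.

```python
import os
import urllib.request
import uuid

API_URL = "https://api.docparse.io/v1/emails"  # endpoint from the curl example


def build_request(eml_bytes, filename, schema, api_key, webhook=None):
    """Build (but do not send) a multipart/form-data POST mirroring the curl call."""
    boundary = uuid.uuid4().hex
    fields = {"schema": schema}
    if webhook:
        fields["webhook"] = webhook

    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The file part carries the raw .eml bytes.
    parts.append(
        (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            'Content-Type: application/octet-stream\r\n\r\n'
        ).encode()
        + eml_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_request(
    b"Subject: Order PR-1148\r\n\r\nWhere is my order?",
    "support-ticket-44821.eml",
    "email-thread",
    os.environ.get("DOCPARSE_KEY", "test-key"),
    webhook="https://api.acme.co/incoming",
)
# To send it: response = urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example inspectable offline; any HTTP client that supports multipart uploads would do the same job.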
Signed webhooks for async
Register an endpoint, set the events, and we POST signed deliveries (HMAC-SHA256, Standard Webhooks spec) as extractions finish. Every attempt is logged in the dashboard with response code, body, and timing.
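On the receiving side, a delivery can be checked before it is trusted. The sketch below follows the Standard Webhooks scheme as we understand it: the signed content is `id.timestamp.body`, the key is the base64 part of a `whsec_` secret, and the `webhook-signature` header holds one or more space-separated `v1,<base64>` values. That DocParse uses exactly these header names is an assumption drawn from the spec, and `verify_webhook` plus the sample secret and payload are hypothetical.

```python
import base64
import hashlib
import hmac
import time


def verify_webhook(secret, headers, body, tolerance_s=300, now=None):
    """Return True if the delivery carries a valid v1 HMAC-SHA256 signature.

    `headers` is assumed to be a dict with lowercase header names.
    """
    msg_id = headers["webhook-id"]
    timestamp = headers["webhook-timestamp"]
    now = time.time() if now is None else now
    try:
        # Reject stale or future-dated deliveries to limit replay attacks.
        if abs(now - int(timestamp)) > tolerance_s:
            return False
    except ValueError:
        return False

    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed = f"{msg_id}.{timestamp}.".encode() + body
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())

    # The header may list several signatures (for example during key rotation).
    for candidate in headers["webhook-signature"].split():
        version, _, sig = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(sig.encode(), expected):
            return True
    return False


# Self-check with a made-up secret and payload (not real DocParse values).
raw_key = b"example-signing-key"
secret = "whsec_" + base64.b64encode(raw_key).decode()
body = b'{"status": "complete"}'
ts = "1700000000"
to_sign = f"msg_1.{ts}.".encode() + body
sig = base64.b64encode(hmac.new(raw_key, to_sign, hashlib.sha256).digest()).decode()
headers = {
    "webhook-id": "msg_1",
    "webhook-timestamp": ts,
    "webhook-signature": f"v1,{sig}",
}
ok = verify_webhook(secret, headers, body, now=1700000000)
```

Verify against the raw request bytes, not a re-serialized JSON object, since any change in whitespace or key order changes the digest.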
Why teams switch from regex.
A look at how DocParse compares to the three things you've probably already tried.
Reach the tools you already run.
DocParse ships two integration surfaces directly — REST API and signed webhooks — plus a native Zapier app that opens up everything else.
How teams use DocParse for emails.
Illustrative scenarios drawn from teams piloting DocParse — names and figures are examples, not customer quotes.
Customers forward half a PDF in the body, screenshot a number, and ask for help. DocParse glues the whole thread into one structured record. Magic.
We auto-route 92% of inbound emails based on DocParse intent + entity tags. Our triage queue is down to 30 minutes from a full day.
The questions teams ask before they sign up.
Email to JSON. Threads to action.
Inbound email is unstructured by default. Make it the cleanest data source you have.