The Challenge
A national insurance provider handling 800,000 claims annually was drowning in paper. The average claim took 15 days to process from first notice of loss to resolution. Each claim required a human adjuster to manually review submitted documents — police reports, medical records, repair estimates, policy documents, photos — extract relevant information, cross-reference it against policy terms, check for inconsistencies, and make a determination. The process was labor-intensive, error-prone, and maddeningly slow for customers.
Claims volume was growing at 20% year over year, but staffing wasn't keeping pace. The company had 340 claims adjusters, each handling an average of 45 open claims simultaneously. Backlogs were building, quality was slipping, and customer satisfaction had declined for three consecutive quarters. The Net Promoter Score dropped from 42 to 29, and complaints to state insurance regulators had increased 35%.
The leadership team recognized that incremental process improvements wouldn't solve a structural problem. They needed to fundamentally rethink how claims were processed — not to eliminate adjusters, but to free them from routine document review so they could focus on complex claims that genuinely required human judgment, empathy, and negotiation skills.
Our Approach
Weeks 1-2: Claims Process Mapping & Document Analysis
We embedded with the claims team for two weeks, observing adjusters as they processed claims across auto, property, and liability lines. We mapped every step of the workflow, identified bottlenecks, and categorized claims by complexity. The critical insight: roughly 70% of claims were straightforward — the documentation was clear, the liability was unambiguous, and the payout fell within standard guidelines. These claims didn't require expert judgment; they required accurate document processing at speed.
We analyzed 50,000 historical claims to understand the document landscape: 23 distinct document types, varying quality (faxed forms, phone photos, scanned PDFs), and an average of 8.4 documents per claim. We built a document taxonomy and established extraction schemas for each type — what fields needed to be captured, what cross-references needed to be checked, and what red flags should trigger human review.
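An extraction schema of the kind described above can be sketched as follows. This is a minimal illustration, not the client's actual taxonomy: the document types, field names, and red-flag rules here are invented placeholders.

```python
from dataclasses import dataclass

@dataclass
class ExtractionSchema:
    doc_type: str
    required_fields: list   # fields the pipeline must capture
    cross_refs: list        # fields checked against other documents in the claim
    red_flags: list         # conditions that trigger human review

# Hypothetical schemas for two of the 23 document types
SCHEMAS = {
    "repair_estimate": ExtractionSchema(
        doc_type="repair_estimate",
        required_fields=["claimant_name", "vehicle_vin", "estimate_total", "shop_name"],
        cross_refs=["vehicle_vin", "claimant_name"],
        red_flags=["estimate_total > vehicle_value"],
    ),
    "police_report": ExtractionSchema(
        doc_type="police_report",
        required_fields=["incident_date", "location", "parties", "report_number"],
        cross_refs=["incident_date", "parties"],
        red_flags=["incident_date outside policy_period"],
    ),
}

def missing_fields(schema: ExtractionSchema, extracted: dict) -> list:
    """Return required fields absent (or null) in an extraction result."""
    return [f for f in schema.required_fields
            if f not in extracted or extracted[f] is None]
```

Checking each extraction against its schema is what lets downstream stages decide whether a document is complete enough to process automatically.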
Weeks 3-5: GenAI Pipeline Development
We built a multi-stage GenAI pipeline that processed claims end-to-end. The architecture had four core components:
- Document ingestion and classification: An OCR and layout analysis layer that handled diverse document formats, followed by a fine-tuned classifier that routed each document to the appropriate extraction pipeline
- Intelligent extraction: Claude-powered extraction chains that pulled structured data from unstructured documents — claimant information, incident details, damage descriptions, monetary amounts, dates, and policy references — with confidence scores for each field
- Validation and cross-referencing: Automated checks against policy terms, coverage limits, deductible schedules, and prior claim history. The system flagged inconsistencies — a repair estimate exceeding the vehicle's value, a claimed incident date outside the policy period, duplicate submissions
- Decision engine: A rules-and-ML hybrid that categorized claims into three tracks: auto-approve (straightforward, low-value, all validations passed), fast-track (needs brief human review of specific flagged items), and complex (requires full adjuster attention)
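The three-track routing in the decision engine can be sketched roughly as below. The thresholds (approval limit, flag count, confidence floor) are illustrative stand-ins, not the client's actual rules, which combined these checks with ML-scored risk signals.

```python
def route_claim(claim_value: float,
                validations_passed: bool,
                flagged_items: list,
                confidence_min: float,
                auto_approve_limit: float = 5000.0,
                confidence_floor: float = 0.95) -> str:
    """Assign a claim to one of three processing tracks.

    auto-approve: low-value, all validations passed, no flags,
                  every extracted field above the confidence floor.
    fast-track:   validations passed but a few items need a quick look.
    complex:      everything else goes to a full adjuster review.
    """
    if (validations_passed
            and not flagged_items
            and claim_value <= auto_approve_limit
            and confidence_min >= confidence_floor):
        return "auto-approve"
    if validations_passed and len(flagged_items) <= 2:
        return "fast-track"
    return "complex"
```

Keeping the routing rules explicit and auditable was important for regulatory review: every auto-approval decision can be traced back to the checks it passed.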
We used retrieval-augmented generation (RAG) with Pinecone to give the extraction and validation models access to the complete policy document library, state-specific regulations, and historical claim precedents. This let the system reason about edge cases using the same reference material that experienced adjusters relied on.
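The retrieval step behind that RAG setup reduces to similarity search over embedded reference material. The sketch below shows the idea with an in-memory list standing in for the Pinecone index; the snippet texts and embeddings are toy values.

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec: list, library: list, top_k: int = 3) -> list:
    """Return the top_k reference snippets most similar to the query.

    In production this lookup ran against a vector index (Pinecone);
    here `library` is a plain list of {"text", "embedding"} dicts.
    """
    ranked = sorted(library,
                    key=lambda item: cosine(query_vec, item["embedding"]),
                    reverse=True)
    return [item["text"] for item in ranked[:top_k]]
```

The retrieved policy clauses and precedents are then placed into the model's context, so validation reasoning is grounded in the same references adjusters use.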
Weeks 6-7: Human-in-the-Loop Integration
The system was designed around human oversight, not human replacement. We built a review interface where adjusters could see the AI's extraction results, validation checks, and recommended action alongside the original documents. For auto-approved claims, the system processed and paid without human intervention but flagged a random 5% sample for quality audit. For fast-track claims, adjusters saw pre-filled summaries with specific items highlighted for review, reducing their per-claim time from 45 minutes to 8 minutes.
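The random 5% audit sample can be implemented so the decision is deterministic per claim, which keeps audits reproducible and tamper-evident. A minimal sketch, with the claim-ID format and seed as assumptions:

```python
import random

def should_audit(claim_id: str, sample_rate: float = 0.05, seed: int = 0) -> bool:
    """Flag roughly `sample_rate` of auto-approved claims for quality audit.

    Seeding a private RNG with the claim ID makes the decision
    deterministic: re-running the pipeline flags the same claims.
    """
    rng = random.Random(f"{seed}:{claim_id}")
    return rng.random() < sample_rate
```

Deterministic sampling also means the audit set can be reconstructed later from claim IDs alone, without storing a separate flag.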
We implemented a feedback loop where adjuster corrections were captured and used to fine-tune the extraction models weekly. This created a virtuous cycle — the more the system processed, the more accurate it became, and the more claims moved from fast-track to auto-approve.
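The correction-capture side of that feedback loop amounts to diffing model output against adjuster edits and keeping only the fields that changed. A sketch under assumed data shapes (flat per-claim field dicts):

```python
def build_finetune_examples(extractions: dict, corrections: dict) -> list:
    """Pair model outputs with adjuster-corrected values as training examples.

    `extractions` maps claim_id -> {field: model_value};
    `corrections` maps claim_id -> {field: adjuster_value}.
    Only fields the adjuster actually changed become examples.
    """
    examples = []
    for claim_id, fields in corrections.items():
        original = extractions.get(claim_id, {})
        for field_name, corrected in fields.items():
            if original.get(field_name) != corrected:
                examples.append({
                    "claim_id": claim_id,
                    "field": field_name,
                    "model_output": original.get(field_name),
                    "label": corrected,
                })
    return examples
```

Batching these examples weekly gave the fine-tuning runs a steady stream of hard cases, the fields the model got wrong, rather than a random sample of mostly-correct output.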
Weeks 8-9: Pilot, Monitoring & Rollout
We piloted on auto claims first — the highest volume line with the most standardized documentation. The pilot processed 12,000 claims over two weeks with real-time accuracy monitoring. We tracked extraction accuracy, validation precision, auto-approval accuracy (verified by sampling), and processing time. After confirming performance met the agreed thresholds, we extended to property claims and then liability claims in sequence.
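The accuracy monitoring during the pilot reduces to comparing sampled extractions against adjuster-verified ground truth and checking the result against the agreed threshold. A minimal sketch, with the 95% threshold taken from the stated target:

```python
def extraction_accuracy(results: list) -> float:
    """Field-level accuracy over audited claims.

    `results` is a list of (extracted_value, ground_truth) pairs
    drawn from the verification sample.
    """
    if not results:
        return 0.0
    correct = sum(1 for predicted, truth in results if predicted == truth)
    return correct / len(results)

def meets_threshold(results: list, threshold: float = 0.95) -> bool:
    """Gate for extending the rollout to the next claims line."""
    return extraction_accuracy(results) >= threshold
```

Tracking this per field and per document type, not just in aggregate, is what reveals whether a specific document class (say, faxed police reports) is dragging accuracy down.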
Results
Within the first full quarter of production deployment across all claims lines:
- 60% faster processing — average turnaround dropped from 15 days to 6 days, with auto-approved claims resolving in under 24 hours
- 40% cost reduction — per-claim processing cost decreased from $142 to $85, driven by automation of routine document review and data entry
- 93% customer satisfaction score — up from 71%, primarily driven by faster payouts and proactive status updates
- 70% of straightforward claims auto-processed — these claims went from first notice of loss to payment without human intervention, freeing adjusters for complex work
- Extraction accuracy of 96.8% — exceeding the 95% target, with continuous improvement through the adjuster feedback loop
Arkyon didn't just automate our claims process — they redesigned it around the strengths of both AI and our people. Our adjusters now spend their time on claims that actually need their expertise, and our customers get faster, better outcomes. This is what responsible AI adoption looks like.
D.L. — Chief Claims Officer, National Insurance Provider
What Made This Work
- Human-in-the-loop by design — the system was built to augment adjusters, not replace them, which drove adoption and ensured quality on edge cases that AI alone would get wrong
- Tiered processing tracks — auto-approve, fast-track, and complex routing matched the right level of human oversight to each claim's actual complexity
- RAG for policy reasoning — giving the AI access to the full policy library and regulatory context meant it could handle nuanced coverage questions, not just simple data extraction
- Continuous learning loop — adjuster corrections fed back into model fine-tuning weekly, creating compounding accuracy improvements over time