QUANTIFYING VALUE
Benchmark Results
Not all AI models perform equally, especially on complex enterprise insurance work. Benchmarking is how we measure what actually delivers in underwriting and claims, so you can see where accuracy holds up and where rework starts. Field-level results for each workflow follow.
Built by Insurance Experts, for Insurance Experts
InsurGPT™ is the generative AI model built by insurance operators, for insurance operators. Trained on millions of insurance-specific data points, it understands the unique language and complexities of the insurance industry. Experience AI tailored to solve your challenges, delivering accuracy and intelligent automation like never before.

Roots Outperforms General Knowledge LLMs
98%+ Accuracy Guaranteed
Bevaya Benchmark Results
Claims extraction
Field-level accuracy on common claims data extraction. Bevaya's fine-tuned model is shown with and without a 0.9 confidence threshold.
| Field | GPT-4 (% accuracy) | Mistral 7B PE (% accuracy) | Bevaya FT (no threshold) | Bevaya FT (threshold > 0.9) |
|---|---|---|---|---|
| Claim Number | 52 | 32 | 68 | 98 |
| Claimant Name | 98 | 88 | 99 | 100 |
| Date of Report | 87 | 72 | 92 | 98 |
| Date of Service | 78 | 57 | 76 | 98 |
Last updated: December 2025
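To make the two Bevaya columns above easier to interpret, here is a minimal sketch of how a confidence threshold can change a reported accuracy figure. It is illustrative only: the page does not describe how the threshold is applied, so this assumes that predictions at or below the threshold are set aside (for example, routed to human review) and excluded from the accuracy calculation. The `Prediction` class, the `accuracy_with_threshold` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    value: str         # value the model extracted for a field
    truth: str         # ground-truth value for that field
    confidence: float  # hypothetical model confidence score in [0, 1]


def accuracy_with_threshold(
    preds: List[Prediction], threshold: Optional[float] = None
) -> Tuple[Optional[float], float]:
    """Return (accuracy, coverage) over the predictions that are kept.

    Assumption: a prediction is kept only if its confidence is strictly
    above the threshold; with no threshold, every prediction is kept.
    """
    kept = [p for p in preds if threshold is None or p.confidence > threshold]
    if not kept:
        return None, 0.0
    correct = sum(p.value == p.truth for p in kept)
    return correct / len(kept), len(kept) / len(preds)


# Made-up sample data, not taken from the benchmark.
sample = [
    Prediction("CLM-001", "CLM-001", 0.97),
    Prediction("CLM-002", "CLM-002", 0.95),
    Prediction("CLM-0O3", "CLM-003", 0.62),  # misread character, low confidence
    Prediction("CLM-004", "CLM-004", 0.91),
    Prediction("CLM-005", "CLM-006", 0.40),  # wrong value, low confidence
]

unfiltered = accuracy_with_threshold(sample)       # all five predictions scored
filtered = accuracy_with_threshold(sample, 0.9)    # only confidence > 0.9 scored
```

Under this assumption, a thresholded figure is best read alongside coverage (the share of predictions that clear the threshold), since raising the threshold trades automation volume for accuracy on what remains.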
Underwriting extraction
Overall accuracy across underwriting field extracts, compared with state-of-the-art frontier models.
| Field | GPT-4.1 (% accuracy) | GPT-5.0 (% accuracy) | Gemini 3.0 Pro (% accuracy) | Bevaya (% accuracy) |
|---|---|---|---|---|
| Overall Accuracy | 81 | 80 | 84 | 93 |
Last updated: December 2025
Loss run extraction
Field-level accuracy across loss run extraction, organized by data category.
| Field | GPT-4.1 (% accuracy) | GPT-5.0 (% accuracy) | Gemini 3.0 Pro (% accuracy) | Bevaya (% accuracy) |
|---|---|---|---|---|
| Accident Description | 88 | 53 | 81 | 94 |
| Accident State | 90 | 53 | 83 | 92 |
| Allocated Expense Reserves | 85 | 60 | 91 | 99 |
| Allocated Expenses Paid | 88 | 57 | 86 | 99 |
| Carrier | 75 | 56 | 74 | 93 |
| Claim Number | 83 | 54 | 76 | 93 |
| Claim Reported Date | 89 | 59 | 86 | 95 |
| Claimant Closed Date | 93 | 61 | 90 | 99 |
| Claimant Name | 88 | 58 | 84 | 97 |
| Date of Incident | 88 | 58 | 85 | 97 |
| Indemnity Paid | 66 | 53 | 81 | 98 |
| Indemnity Reserves | 80 | 53 | 84 | 98 |
| Line of Business | 91 | 60 | 86 | 98 |
| Medical Reserves | 86 | 60 | 91 | 98 |
| Nature of Injury | 53 | 40 | 72 | 90 |
| Paid Medical | 90 | 59 | 89 | 97 |
| Policy Number | 84 | 53 | 81 | 95 |
| Policy Year | 81 | 53 | 74 | 87 |
| Recoveries | 97 | 61 | 91 | 99 |
| Status | 70 | 46 | 69 | 96 |
| Total Incurred | 73 | 48 | 72 | 88 |
| Total Paid | 74 | 52 | 73 | 88 |
| Total Reserve | 69 | 54 | 83 | 93 |
| Type | 36 | 26 | 32 | 78 |
Last updated: December 2025
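For readers who want to run a comparable field-level check on their own documents, the following is a rough sketch of one way per-field accuracy can be computed. It assumes exact-match scoring against a labeled ground truth, which is an assumption: the page does not state the matching rule used in these benchmarks. The `field_accuracy` function and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def field_accuracy(
    predicted: List[Dict[str, str]], expected: List[Dict[str, str]]
) -> Dict[str, int]:
    """Percent of records, per field, where the extracted value matches exactly.

    Assumption: a missing or differing value counts as an error, and
    records are paired by position in the two lists.
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for pred_doc, true_doc in zip(predicted, expected):
        for field, true_value in true_doc.items():
            totals[field] += 1
            if pred_doc.get(field) == true_value:
                hits[field] += 1
    return {field: round(100 * hits[field] / totals[field]) for field in totals}


# Made-up records, not taken from the benchmark.
expected = [
    {"Carrier": "Acme Mutual", "Status": "Open"},
    {"Carrier": "Acme Mutual", "Status": "Closed"},
]
predicted = [
    {"Carrier": "Acme Mutual", "Status": "Open"},
    {"Carrier": "Acme", "Status": "Closed"},  # truncated carrier name
]

scores = field_accuracy(predicted, expected)
```

Exact matching is the strictest choice; a real evaluation might normalize dates, currency formats, or casing before comparing, which would change the resulting percentages.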
| Field | GPT-4.1 (% accuracy) | GPT-5.0 (% accuracy) | Gemini 3.0 Pro (% accuracy) | Bevaya (% accuracy) |
|---|---|---|---|---|
| Allocated Expense Reserves | 81 | 90 | 89 | 90 |
| Allocated Expenses Paid | 80 | 89 | 87 | 89 |
| Carrier | 73 | 72 | 75 | 90 |
| Experience Mod | 88 | 90 | 90 | 93 |
| Expiration Date | 72 | 65 | 74 | 81 |
| Inception Date | 81 | 74 | 82 | 82 |
| Indemnity Paid | 77 | 88 | 84 | 88 |
| Indemnity Reserves | 81 | 88 | 87 | 89 |
| Line of Business | 76 | 84 | 81 | 80 |
| Medical Reserves | 86 | 90 | 88 | 91 |
| Named Insured | 90 | 92 | 91 | 93 |
| Paid Medical | 84 | 87 | 84 | 89 |
| Policy Number | 72 | 77 | 81 | 86 |
| Policy Year | 86 | 86 | 88 | 92 |
| Recoveries | 86 | 91 | 89 | 92 |
| Total Claims | 84 | 88 | 83 | 87 |
| Total Closed Claims | 80 | 85 | 68 | 89 |
| Total Incurred | 75 | 78 | 79 | 83 |
| Total Open Claims | 89 | 92 | 85 | 91 |
| Total Paid | 69 | 72 | 73 | 76 |
| Total Reserve | 64 | 81 | 76 | 82 |
| Validation Date | 78 | 63 | 84 | 84 |
Last updated: December 2025
| Field | GPT-4.1 (% accuracy) | DeepSeek R1 (% accuracy) | Gemini 2.5 Flash (% accuracy) | Bevaya (% accuracy) |
|---|---|---|---|---|
| Indemnity Paid | 65 | 83 | 74 | 95 |
| Total Incurred | 66 | 52 | 71 | 74 |
| Recoveries | 52 | 64 | 71 | 100 |
| Paid Medical | 44 | 74 | 74 | 86 |
| Medical Reserves | 44 | 59 | 64 | 88 |
| Indemnity Reserves | 63 | 70 | 74 | 98 |
| Total Paid | 67 | 85 | 93 | 78 |
| Total Reserve | 59 | 69 | 81 | 97 |
| Allocated Expenses Paid | 86 | 84 | 93 | 93 |
| Allocated Expense Reserves | 76 | 55 | 100 | 100 |
Last updated: July 2025
| Field | GPT-4.1 (% accuracy) | DeepSeek R1 (% accuracy) | Gemini 2.5 Flash (% accuracy) | Bevaya (% accuracy) |
|---|---|---|---|---|
| Claimant Name | 80 | 90 | 94 | 94 |
| Date of Incident | 80 | 89 | 94 | 94 |
| Claim Reported Date | 83 | 89 | 82 | 93 |
| Claim Number | 45 | 43 | 61 | 94 |
| Line of Business | 76 | 95 | 95 | 100 |
| Carrier | 75 | 88 | 93 | 100 |
| Policy Number | 65 | 81 | 91 | 99 |
| Policy Year | 78 | 94 | 99 | 99 |
| Status | 57 | 88 | 68 | 99 |
| Accident State | 88 | 90 | 98 | 96 |
| Claimant Closed Date | 72 | 89 | 85 | 94 |
| Accident Description | 54 | 60 | 77 | 78 |
Last updated: July 2025
Loss run benchmarks are refreshed as new frontier models are released. December 2025 results reflect testing against GPT-4.1, GPT-5.0, and Gemini 3.0 Pro. July 2025 results reflect testing against GPT-4.1, DeepSeek R1, and Gemini 2.5 Flash.


