When I first started working on an OCR-based document processing system at Generali Central, the problem sounded deceptively simple:
“Can we automatically read documents and speed up claims processing?”
From a purely technical lens, this felt like a classic computer vision + NLP task. Train an OCR model, extract fields, validate accuracy, ship.
In reality, building an AI product inside a live insurance operation turned out to be far more about people, trust, and workflow design than model performance alone.
This post walks through the entire journey: from talking to users on the ground, to discovering why high accuracy wasn’t enough, to designing systems that employees actually trusted and adopted.
Like most engineers, I began with a bias:
If the model is accurate enough, adoption will follow.
Early experiments supported this belief. On historical documents, the OCR pipeline was achieving high accuracy across most structured fields.
On paper, it was a success.
But when we piloted the system with operations and claims teams, something strange happened: turnaround times barely moved, because staff were still checking documents by hand.
That’s when I realized: Accuracy alone does not reduce operational time.
Instead of tuning the model further, I stepped away from the code and started doing what I probably should’ve done earlier - talking to users.
I scheduled meetings with the claims processors and operations staff who handled these documents every day, and asked simple questions about how they used the system and where their time actually went.
The answers were eye-opening.
A few consistent themes emerged:
Even if 9 out of 10 fields were correct, the 1 uncertain field forced a full manual review (a quick back-of-the-envelope calculation below shows how fast this compounds).
And the OCR struggled most on precisely the fields that mattered most for a claims decision.
The system gave outputs, but no indication of how reliable they were.
So employees played it safe:
“If I don’t know how confident the system is, I’ll just verify everything.”
From their perspective, this was rational behavior.
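A rough back-of-the-envelope calculation shows why that first theme hurt so much. The per-field accuracy and field count below are illustrative placeholders, not our actual numbers:

```python
# Illustrative figures only; the real per-field accuracy and field counts differed.
per_field_accuracy = 0.95
fields_per_document = 10

# Probability that every field on a document is extracted correctly.
p_all_correct = per_field_accuracy ** fields_per_document
print(f"Documents with zero doubtful fields: {p_all_correct:.0%}")      # ~60%

# If one doubtful field forces a full manual review, everything else on
# that document gets re-checked too.
print(f"Documents still reviewed end to end: {1 - p_all_correct:.0%}")  # ~40%
```

Even very good per-field accuracy still leaves a large share of documents going through the same full review as before, which is exactly what the teams were describing.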
At this point, the problem was no longer:
“How do we improve OCR accuracy?”
It became:
“How do we reduce turnaround time without increasing risk?”
This shift changed every design decision that followed.
One of the highest-impact changes we made was deceptively simple.
Instead of treating OCR output as free-form text, we validated each extracted value against a vocabulary database of terms the business already used for that field, and flagged anything that didn’t match.
This immediately reduced obvious misreads and the manual corrections they triggered.
This wasn’t a new ML model — it was product thinking layered on top of AI.
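A minimal sketch of what that looked like in practice: snap each extracted value to the closest term in a field-specific vocabulary, and flag anything that doesn’t match instead of passing free-form text downstream. The field names, vocabulary entries, and similarity cutoff here are placeholders, not the production configuration:

```python
import difflib

# Placeholder vocabularies; the real lists were built with the claims and
# operations teams for each structured field.
FIELD_VOCAB = {
    "hospital_name": ["Apollo Hospital", "Fortis Hospital", "Manipal Hospital"],
    "claim_type": ["Cashless", "Reimbursement"],
}

def snap_to_vocab(field: str, raw_value: str, cutoff: float = 0.85):
    """Snap a raw OCR value to the closest known term for the field.

    Returns (value, matched). When nothing in the vocabulary is close enough,
    matched is False and the field can be flagged for review instead of
    silently passing through as free-form text.
    """
    matches = difflib.get_close_matches(raw_value, FIELD_VOCAB.get(field, []),
                                        n=1, cutoff=cutoff)
    return (matches[0], True) if matches else (raw_value, False)

# An OCR misread still resolves to the known term.
print(snap_to_vocab("hospital_name", "Apollo Hosp1tal"))  # ('Apollo Hospital', True)
print(snap_to_vocab("claim_type", "Cshless"))             # ('Cashless', True)
```

The point isn’t the fuzzy matching itself; it’s that the output space is constrained to values the business already recognises, which is what made the output checkable.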
The next major shift was adding confidence scores at the field level.
Instead of:
“Here is the extracted value.”
The system now said:
“Here is the value, and here is how confident I am.”
This small change had an outsized effect: reviewers could skim the fields the system was sure about and spend their attention only on the ones it flagged.
Crucially, we did not hide uncertainty. We made it explicit.
That transparency built trust faster than any model improvement could.
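Here’s a simplified sketch of how field-level confidence can drive the review screen: fields above a threshold are presented as ready to skim, fields below it are explicitly flagged for verification. The threshold, field names, and values are hypothetical, purely for illustration:

```python
from dataclasses import dataclass

# Illustrative cut-off; the real threshold was agreed with the claims team.
REVIEW_THRESHOLD = 0.90

@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # reported per field, never hidden from the reviewer

def split_for_review(fields):
    """Separate fields the reviewer can skim from fields they must verify."""
    auto_accepted = [f for f in fields if f.confidence >= REVIEW_THRESHOLD]
    needs_review = [f for f in fields if f.confidence < REVIEW_THRESHOLD]
    return auto_accepted, needs_review

# Hypothetical document: only the low-confidence field lands on the reviewer.
doc = [
    ExtractedField("policy_number", "PX-20391", 0.98),
    ExtractedField("claim_amount", "42,500", 0.97),
    ExtractedField("diagnosis", "Acute gastritis", 0.61),
]
_, flagged = split_for_review(doc)
print([f.name for f in flagged])  # ['diagnosis']
```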
We also mapped the entire claims workflow, end to end.
This helped us see where time was actually being lost in the process, rather than only where the model was inaccurate.
Success was no longer “OCR accuracy”. It was turnaround time: claims moving through the process faster, without increasing risk.
With these changes in place, the wins came from workflow design and trust rather than from the model itself.
None of this required chasing the last decimal of model accuracy.
This experience fundamentally changed how I think about AI systems.
AI accuracy is not the product: the product is the workflow improvement it enables.
Trust is a first-class feature: confidence scores, transparency, and explainability are not “nice-to-haves”.
Domain knowledge beats generic intelligence: a simple vocabulary database outperformed complex model tweaks.
User behavior defines success, not dashboards: if people don’t change how they work, your model hasn’t solved the problem.
Managing this OCR product at Generali Central taught me that building AI products is less about clever algorithms and more about thoughtful system design.
The most impactful changes didn’t come from the model — they came from listening to users, understanding their risk, and designing systems that respected how real work gets done.
That mindset now shapes how I approach every AI-driven product I build.