|
minute read
Customer Background & Business Challenge
Our customer is an insurance company operating a high-volume invoice fraud detection pipeline. The existing optical character recognition (OCR) based approach generated a significant number of false positives due to document variability and malformed invoices or bad scans. Our client needed a more accurate and resilient solution to improve data extraction quality and categorization.
Generative AI Solution & AWS Services Deployed
Invoices are processed using Textract for OCR followed by Bedrock to structure and normalize extracted invoice data using an LLM. Both calls are integrated into an existing industrialized pipeline for identifying potential fraudulent invoices. The solution is fully deployed in a production AWS Environment leveraging S3, Textract and Bedrock using Anthropic Models.
This implementation allows seamless integration into an existing pipeline. Further implementation will allow event-driven architecture to analyze invoices in near-real time leveraging S3, EventBridge, Textract and Bedrock. Amazon LLMs are considered to reduce costs as well.
Quantitative Business Metrics & Financial Impact
The quantitative business metrics used to assess the ROI of the application is the accuracy of the data extracted from the solution. For each field (Invoice id, phone, total VAT, Km, IBAN...) a measure was made to determine the accuracy of the previous method and the new one (Textract + LLM).