OCR Document Management Solution for Logistics Leader

Client

A provider of freight brokerage services

Industry

Logistics & Transportation

Services

Software Development

Tech

Google Cloud, Python, REST API, JMagick, SageMaker

Challenge

Our client — a US-based provider of logistics services — was at a breaking point. The influx of documents from thousands of shippers and carriers was overwhelming their outdated, inefficient document management system. Manual processes used for cumbersome and time-consuming tasks like document indexing were holding them back. As the number of documents continued to rise, the urgency to switch to automated technology only grew stronger. The company turned to ITRex for ML development and computer vision services to build a proprietary OCR document management solution that would automate their processes and improve productivity.

Our task was to:

● Conduct an in-depth examination of the client’s diverse data sources to identify and prioritize document types for incremental delivery, ensuring a focused and effective rollout

● Assess the variety of unique client formats to tailor the OCR solution

● Develop bespoke ML algorithms to extract the information the client needs from each recognized document

● Train and fine-tune the ML models to handle diverse document types and formats with high accuracy

● Design an intuitive and user-friendly interface for OCR tool management and monitoring, accessible to non-technical users

Solution

ITRex has developed a cutting-edge OCR document management solution, designed to seamlessly integrate into the client’s existing internal systems. This proprietary solution stands out with its user-friendly interface, accessible both to tech-savvy and non-technical users, and its ability to recognize an extensive range of document and data types with remarkable accuracy.

Key features and capabilities:

Versatile document recognition:

● Capable of processing nearly 20 document types from carriers and shippers, with some types generated by up to 6,000 unique customers using unique formats

● Achieves up to 90% recognition accuracy, enhanced through comprehensive training on a diverse dataset

Advanced ML for classification:

● Classifies document types and identifiers, with the ML model trained on hundreds of documents to scale to over a million documents

Efficient processing:

● Prioritizes documents for recognition based on length, optimizing processing time

● Implements distributed processing for increased fault tolerance and scalability, ensuring that no documents are lost in the recognition pipeline even as processing demand increases

● Includes document rotation for enhanced recognition efficiency

● Recognizes every field within a document, a significant improvement over the previous manual process, which captured only four fields

● Identifies poor-quality documents (e.g., damaged or torn)

● Generates dates and numbers in unified formats through post-processing

● Utilizes ML models to recognize handwritten text and signatures
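To make two of the items above more concrete, here is a minimal Python sketch of length-based prioritization and of post-processing dates and numbers into unified formats. It is an illustration under stated assumptions, not the project's actual code: the names RecognitionQueue, normalize_date, normalize_number and DATE_FORMATS are hypothetical, "length" is taken to mean page count with shorter documents going first, and dates are assumed to be US-style (month first) and unified to ISO 8601.

```python
import heapq
import re
from datetime import datetime
from itertools import count

# Hypothetical date layouts; a real deployment would list the layouts
# actually seen in its own documents. Month-first order is an assumption.
DATE_FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%Y-%m-%d")


class RecognitionQueue:
    """Shortest-first queue: documents with fewer pages are recognized first."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker: equal lengths keep arrival order

    def push(self, doc_id, page_count):
        heapq.heappush(self._heap, (page_count, next(self._order), doc_id))

    def pop(self):
        _, _, doc_id = heapq.heappop(self._heap)
        return doc_id

    def __len__(self):
        return len(self._heap)


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known layout matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_number(raw):
    """Drop currency symbols, spaces and thousands separators (US-style input)."""
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    try:
        return float(cleaned)
    except ValueError:
        return None


if __name__ == "__main__":
    queue = RecognitionQueue()
    queue.push("bill-of-lading-17", 12)
    queue.push("proof-of-delivery-3", 1)
    print(queue.pop())                     # the one-page document comes out first
    print(normalize_date("03/07/2024"))
    print(normalize_number("$1,234.50"))
```

Returning None for values that cannot be parsed, rather than guessing, leaves room to route those fields to manual review.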

Integration with communication platforms:

● Seamlessly integrates with messaging systems like Slack for real-time performance reporting
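As an illustration of what such reporting could look like, the sketch below builds a short performance summary and posts it to a Slack incoming webhook. The case study does not say how the integration was built, so the use of an incoming webhook, the function names build_report and post_report, and the metrics in the message are all assumptions.

```python
import json
import urllib.request


def build_report(processed, recognized, failed):
    """Build a Slack message payload summarizing one reporting window."""
    rate = (recognized / processed * 100) if processed else 0.0
    text = (
        f"OCR pipeline report: {processed} documents processed, "
        f"{recognized} recognized ({rate:.1f}%), {failed} failed"
    )
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return {"text": text}


def post_report(webhook_url, payload, timeout=10):
    """POST the payload as JSON to the webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only the payload is built here; posting requires a real webhook URL.
    print(build_report(processed=200, recognized=180, failed=20)["text"])
```

Keeping message construction separate from the network call makes the report format easy to check without contacting Slack.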

UI design:

● Easy to navigate for non-tech users

● Advanced features for more experienced users, allowing manual labeling (tagging) of unique document templates

● Includes a dedicated monitoring page to track the performance and efficiency of the AI/ML engine

Impact

● A dramatic reduction in operational costs through automation of manual processes that not only saves time but also minimizes the likelihood of human error

● Enhanced productivity with greater speed and accuracy of document processing, leading to a more streamlined workflow

● The solution has laid the foundations for implementing a comprehensive big data and reporting platform for deeper insights into operational efficiencies