How We Built an OCR System for Local ID Cards
Curious about how machine learning can revolutionize the processing of local identification documents? You're in the right place. At 12iD, we've developed an Optical Character Recognition (OCR) system specifically tailored for local ID cards, such as Nepali driving licenses and passports.
In this blog post, we highlight the work of Sakar Gimire, one of our Machine Learning Engineers. Sakar has been instrumental in developing and integrating AI solutions across our projects. Here, we walk you through how we built our OCR system, from its inception through the challenges we overcame, and show how this solution can improve digital identity verification.
Segmentation and Training
To start, we trained a model to recognize and segment ID cards from images. We gathered numerous images of different ID cards, annotated them by marking the exact areas where the ID cards are located in each image, and used this data to teach our model how to identify and isolate the ID cards in any given image.
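To make the annotation step concrete, here is a minimal sketch of how a marked card outline could be turned into a per-pixel training target. It assumes the annotations are stored as polygons of (x, y) corner points, which the post does not specify; the function names `point_in_polygon` and `polygon_to_mask` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(polygon: Sequence[Point], width: int, height: int) -> List[List[int]]:
    """Rasterize an annotated outline into a binary mask (1 = card, 0 = background)."""
    # Each pixel is tested at its center, hence the + 0.5 offsets.
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0 for col in range(width)]
        for row in range(height)
    ]


# Hypothetical annotation: a card occupying the middle of a tiny 5x4 image.
card_outline = [(1, 1), (4, 1), (4, 3), (1, 3)]
mask = polygon_to_mask(card_outline, width=5, height=4)
```

Pairs of (image, mask) built this way are the kind of supervision a segmentation model typically learns from; the actual annotation format and training setup used for our model may differ.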
Image Processing
Once an image is submitted to the system, the segmentation model detects the ID card and separates it from the rest of the image. This lets the later stages focus solely on the ID card, which simplifies the task of reading the information on it.
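The isolation step can be sketched as follows. This is an illustration only: it assumes the segmentation model outputs a binary mask with the same dimensions as the image, and it crops to the mask's bounding box without correcting rotation or perspective. The names `mask_bounding_box` and `crop_card` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]  # rows of pixel values (image) or 0/1 flags (mask)


def mask_bounding_box(mask: Grid) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the masked region; bottom/right are exclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("the mask is empty: no card was detected")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop_card(image: Grid, mask: Grid) -> Grid:
    """Keep only the part of the image covered by the mask's bounding box."""
    top, left, bottom, right = mask_bounding_box(mask)
    return [row[left:right] for row in image[top:bottom]]


# Tiny synthetic example: a 4x5 "image" whose pixel value encodes its position.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
card = crop_card(image, mask)
```

In a real pipeline the image would usually be an array from an imaging library and the crop might be followed by a perspective correction; plain lists are used here only to keep the sketch dependency-free.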
OCR and Classification
After isolating the ID card, we apply OCR technology to read the text on the card. This involves identifying and extracting text such as names, dates of birth, and license numbers. We then classify the card type (e.g., Nepali License, Nepali Passport, Bangladeshi License) based on the extracted text and the card's design.
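As a rough illustration of the text-based part of classification, the sketch below scores each card type by the share of its keywords found in the OCR output. The keyword lists, the `min_score` threshold, and the function name `classify_card` are all assumptions made for this example, and the design-based cues mentioned above are not modeled.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical keywords per card type; a production list would be tuned on real cards.
CARD_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "Nepali License": ("nepal", "driving", "license"),
    "Nepali Passport": ("nepal", "passport"),
    "Bangladeshi License": ("bangladesh", "driving", "license"),
}


def classify_card(ocr_text: str, min_score: float = 0.6) -> Optional[str]:
    """Return the card type whose keywords best match the OCR text, or None if none match well."""
    words = set(re.findall(r"[a-z]+", ocr_text.lower()))
    scores = {
        card_type: sum(keyword in words for keyword in keywords) / len(keywords)
        for card_type, keywords in CARD_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


card_type = classify_card("Government of Nepal Driving License D.L. No. 00-00-0000")
```

Returning `None` below the threshold gives the caller a way to reject images that do not look like any supported card instead of forcing a guess.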
Template Matching
To accurately locate specific data fields on the card, we created a template for each type of card. Each template records where important data fields (such as name and address) are typically located. By comparing the positions of text detected in the input image with the positions recorded in the template, we can predict where these fields lie on the input card.
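One simple way to realize this comparison is to match printed labels that appear both in the template and in the OCR output, fit a scale and offset per axis by least squares, and then project the template's field boxes onto the input image. The sketch below assumes exactly that model (no rotation or perspective), which may be simpler than what our system does; `fit_axis`, `project_fields`, and the sample coordinates are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def fit_axis(src: Sequence[float], dst: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of dst = scale * src + offset along one axis."""
    n = len(src)
    mean_src = sum(src) / n
    mean_dst = sum(dst) / n
    variance = sum((s - mean_src) ** 2 for s in src)
    if variance == 0:
        raise ValueError("anchors must differ along this axis to estimate a scale")
    covariance = sum((s - mean_src) * (d - mean_dst) for s, d in zip(src, dst))
    scale = covariance / variance
    return scale, mean_dst - scale * mean_src


def project_fields(
    template_anchors: Dict[str, Point],
    detected_anchors: Dict[str, Point],
    template_fields: Dict[str, Box],
) -> Dict[str, Box]:
    """Map template field boxes into input-image coordinates using shared anchor labels."""
    shared = sorted(set(template_anchors) & set(detected_anchors))
    if len(shared) < 2:
        raise ValueError("need at least two anchors found in both template and image")
    sx, tx = fit_axis([template_anchors[k][0] for k in shared],
                      [detected_anchors[k][0] for k in shared])
    sy, ty = fit_axis([template_anchors[k][1] for k in shared],
                      [detected_anchors[k][1] for k in shared])
    return {
        name: (sx * x0 + tx, sy * y0 + ty, sx * x1 + tx, sy * y1 + ty)
        for name, (x0, y0, x1, y1) in template_fields.items()
    }


# Label positions in the template and where OCR found the same labels in the input.
template_anchors = {"Name": (10, 20), "D.O.B": (10, 60), "Address": (110, 100)}
detected_anchors = {"Name": (25, 47), "D.O.B": (25, 127), "Address": (225, 207)}
template_fields = {"name_value": (50, 15, 150, 30)}

predicted = project_fields(template_anchors, detected_anchors, template_fields)
```

Using several anchors and a least-squares fit means a single misread label shifts the prediction only slightly instead of breaking it outright.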
Data Extraction
Using the predicted positions, we extract the relevant information from the card. For instance, we can determine the position of the name field and extract the text found in that area.
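The extraction step might look like the sketch below: given OCR words with their bounding boxes and a predicted field box, keep the words whose centers fall inside the box and join them left to right. The `Word` structure, the `extract_field` function, and the sample values are hypothetical, and the left-to-right ordering assumes a single-line field.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Word:
    """One OCR result: the recognized text and its bounding box in image coordinates."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2


def extract_field(words: Iterable[Word], box: Box) -> str:
    """Join the words whose centers lie inside the predicted field box."""
    bx0, by0, bx1, by1 = box
    inside = [
        w for w in words
        if bx0 <= w.center[0] <= bx1 and by0 <= w.center[1] <= by1
    ]
    inside.sort(key=lambda w: w.x0)  # single-line field: read left to right
    return " ".join(w.text for w in inside)


# Made-up OCR output for illustration.
words = [
    Word("Name:", 10, 10, 50, 20),
    Word("BAHADUR", 105, 11, 160, 21),
    Word("RAM", 60, 10, 100, 20),
    Word("1990-01-01", 60, 40, 130, 50),
]
name = extract_field(words, box=(55, 5, 170, 25))
```

Testing the word center rather than requiring the whole word to fit inside the box tolerates small errors in the predicted position.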
Handling Local Variations
One of the standout features of our system is its ability to handle the unique aspects of local cards. This includes different languages, fonts, and card layouts. We designed the system to be flexible and adaptable to these variations, ensuring accurate and reliable data extraction.
Overcoming Challenges
We faced several challenges, such as dealing with varying image quality and complex card layouts. To address these, we implemented robust techniques to improve image processing and ensure the accuracy of the extracted information.
Conclusion
In summary, our OCR system for local ID cards is a robust and adaptable solution that can accurately read and process various types of local identification documents. We are continually working on improving the system and look forward to incorporating more features and enhancements in the future.