
Talking about document recognition at Terricon Valley

Today, a variety of OCR scanning solutions make document recognition practical for almost any business or application. While some documents still have to be captured from paper, most are going digital.

 

However, organizations often face challenges in verifying the authenticity of documents, which can lead to legal and financial consequences. At Terricon Valley, Ainur Zhappasova, ML Engineer at Verigram, shared her experience of developing Veridoc, the company's document recognition solution. The event took place on March 20 and was designed as a place for speakers to share their experience with the audience.

 

ML engineers must develop algorithms that can read data quickly and accurately, even as ID document formats evolve. This is an ongoing challenge for them.

 

Verigram recognizes more than 5 million documents each month. This amount of data provides a solid foundation for improving Veridoc. The speaker presented Verigram's approaches to document recognition: computer vision and machine learning.

 

Document recognition methods

There are two classes of software for OCR data capture, and the choice between them depends on the complexity of the documents.

The first method is the classic one, which is simple and easy to implement. It works especially well for Kazakh ID documents. The second method, which relies on machine learning, is a good fit when there is a large amount of text in the margins. It may seem that machine learning slows text recognition down compared to the classic method, but this is not the case. Both approaches work well in the web and mobile versions of the solution.

Document recognition CV

OCR technology is a universal tool. It automates the manual retyping of data from documents, reduces manual input errors to zero, and integrates with any business process. For example, the Verigram solution instantly recognizes all international passports, driver's licenses and a number of national documents.

 

The result of an intelligent document recognition solution is the automatic export of data and images to a business process/workflow or any downstream system. This information is immediately available for action. Based on this information, the business can then take action to improve the customer experience or immediately stop the process if fraud is suspected.

 

Ainur was impressed by the number of students in the audience who were interested in the topic. She said: "More and more students are interested in studying this topic and it's great that we can share our expertise." In addition, the speaker highlighted the optical character recognition (OCR) process, which is an integral part of Veridoc. This technology is essential for extracting relevant information from documents and reduces the risk of human error during the verification process.

 

Ainur Zhappasova, ML Engineer

 

Engineers plan to improve current developments in glare detection and image blur detection. These improvements will be a great addition to the existing technologies and will reduce the risks associated with improper data extraction.
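To give a feel for what glare and blur checks can look like, here is a minimal sketch in Python. It is not Verigram's implementation, which the talk did not describe in detail. It uses two common heuristics: the variance of the Laplacian as a sharpness score, and the share of near-saturated pixels as a glare score. The function names and all threshold values are assumptions chosen for illustration and would need tuning on real document images.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Sharpness score: variance of a 4-neighbour Laplacian.

    Low values mean few sharp edges, which usually indicates blur.
    """
    g = gray.astype(np.float64)
    # Discrete Laplacian on interior pixels: up + down + left + right - 4 * centre.
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def glare_fraction(gray: np.ndarray, saturation_level: int = 250) -> float:
    """Glare score: fraction of pixels at or above saturation_level (assumed value)."""
    return float((gray >= saturation_level).mean())


def check_capture(
    gray: np.ndarray,
    blur_threshold: float = 100.0,   # assumed, illustrative threshold
    glare_threshold: float = 0.05,   # assumed: more than 5% saturated pixels
) -> dict:
    """Flag a grayscale (uint8) capture as blurry and/or affected by glare."""
    return {
        "blurry": laplacian_variance(gray) < blur_threshold,
        "glare": glare_fraction(gray) > glare_threshold,
    }


if __name__ == "__main__":
    # Synthetic examples: a high-contrast checkerboard and a flat overexposed frame.
    sharp = (np.indices((32, 32)).sum(axis=0) % 2 * 150 + 50).astype(np.uint8)
    overexposed = np.full((32, 32), 255, dtype=np.uint8)
    print(check_capture(sharp))
    print(check_capture(overexposed))
```

A check like this would typically run before text extraction, so that a poor capture can be rejected and retaken instead of producing incorrectly extracted data.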

 

The speaker thanked Terricon Valley for the opportunity to share her expertise with the audience. It was her first public speaking experience, and she was amazed by how well the event was organized. We hope to take part in similar events in the future.


 
