
Research And Application Of Optical Character Recognition For China Second-generation Identity Card

Posted on: 2021-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: F Li
Full Text: PDF
GTID: 2428330602476679
Subject: Software engineering
Abstract/Summary:
With the widespread use of the Internet, effective information authentication has become a key issue, and ID information authentication is part of it. Optical character recognition (OCR) of ID cards is one way to obtain ID information quickly. The first step of OCR on ID card images is pre-processing: good pre-processing leaves the ID card image with a clean layout that is easy to process in later stages. This thesis studies pre-processing of images taken by mobile phones in different environments, including image denoising, image enhancement, and image binarization, and derives an adaptive pre-processing pipeline from that work.

The next step is the detection and segmentation of text lines. This thesis divides the captured images into two classes: images with a complex background and images with a simple solid-color background. Deep-learning-based text line detection against a complex background relies on the GPU. To save computing resources and allow text detection to run on the CPU, this thesis also studies text line extraction against a simple background, which runs significantly faster on the CPU than text line extraction against a complex background. The disadvantage is that it restricts the background when the photo is taken. Both approaches can detect all lines of text in the ID card images. Text lines are segmented with the traditional projection method plus a post-processing step proposed in this thesis, which effectively avoids incorrect splitting of Chinese characters and segments the lines cleanly into single-character images.

The last step is the recognition of single-character images, for which convolutional neural networks are used. The training data is generated by code, and the validation data is obtained by segmenting real ID card images. Because adjacent digits tend to stick together during segmentation, single-character images composed of multiple digits are also generated so that multi-digit groups can be recognized. A total of 11,820 characters are used to generate images, and the large volume of generated single-character images is learned incrementally in batches. The network selected for training is ResNet-34. After the network is reduced in size and its parameters are tuned, incremental learning achieves a good recognition result on the validation set. Recognition tests on multiple ID cards are also accurate.
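The projection-based segmentation described above can be illustrated with a minimal sketch. It assumes a binarized text line stored as rows of 0/1 values, with 1 marking ink. The function names `column_runs` and `merge_narrow_runs`, and the `max_ratio` threshold, are hypothetical. The merge rule (join neighbouring spans while the joined width stays close to the line height, since Chinese characters are roughly square) is only one plausible stand-in for a post-processing step. It is not the method proposed in the thesis, which the abstract does not specify.

```python
def column_runs(binary_line):
    """Vertical projection: return (start, end) column spans that contain ink.

    binary_line is a list of rows, each a list of 0/1 values (1 = ink).
    The end index is exclusive.
    """
    # Ink pixels per column (the vertical projection profile).
    profile = [sum(col) for col in zip(*binary_line)]
    runs, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of inked columns begins
        elif count == 0 and start is not None:
            runs.append((start, x))        # a blank column closes the run
            start = None
    if start is not None:                  # run touching the right edge
        runs.append((start, len(profile)))
    return runs


def merge_narrow_runs(runs, line_height, max_ratio=1.1):
    """Hypothetical post-processing: re-join pieces of a split character.

    A span is merged into the previous one when the joined width does not
    exceed max_ratio * line_height (an assumed threshold).
    """
    merged = []
    for start, end in runs:
        if merged and end - merged[-1][0] <= max_ratio * line_height:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Synthetic 10x30 line: a left-right character split by a blank column,
    # followed by one solid character.
    inked = lambda x: 2 <= x < 6 or 7 <= x < 11 or 14 <= x < 24
    line = [[1 if inked(x) else 0 for x in range(30)] for _ in range(10)]
    runs = column_runs(line)
    print(runs)
    print(merge_narrow_runs(runs, line_height=len(line)))
```

A width-based merge like this would also join two narrow adjacent digits, which fits the abstract's choice to recognize multi-digit images as single classes rather than forcing every digit apart.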
Keywords/Search Tags: ID card recognition, optical character recognition, image pre-processing, text line detection, text line segmentation, convolutional neural networks, ResNet