Optical Character Recognition And Application Research Based On Machine Learning

Posted on:2019-07-19

Degree:Master

Type:Thesis

Country:China

Candidate:C H Yang

GTID:2428330572456407

Subject:Circuits and Systems

Abstract/Summary:

Optical character recognition (OCR) is an important branch of machine vision. It draws on pattern recognition, image processing, digital signal processing, artificial intelligence, and other disciplines. As an integrated technology, it has important application value in high-tech fields such as text information processing, office automation, machine translation, and real-time monitoring systems. In the 21st century, with the spread of smartphones equipped with high-definition cameras, OCR has taken on a new goal: more and more people use their phones to photograph the objects and scenes they see and to extract the text from the pictures. Recognition of text in natural scenes has therefore become a new research topic. Compared with scanned images, characters captured by a mobile phone are harder to recognize because of blurring, geometric distortion, uneven lighting, and complicated backgrounds. This thesis focuses on image preprocessing algorithms and character recognition algorithms, and applies them to detecting and recognizing ID card images captured by mobile phone. The work mainly covers the following aspects:

(1) Commonly used image preprocessing and character segmentation algorithms are described in detail, together with their respective advantages and disadvantages. Preprocessing consists of image binarization and geometric correction. For binarization, global and local thresholding are combined: the global threshold method extracts the outline of the image before geometric correction, and the local threshold method is applied to the character regions. Experimental results show that this combined method is fast and binarizes the character regions well. For geometric correction, an image distortion detection method based on the block Hough transform is proposed. Once the distortion angle has been detected, the tilt angle and perspective angle of the image are calculated and then corrected. Experimental results show that distortion angle detection based on the block Hough transform is fast and that the geometric correction is effective. For character segmentation, characters are first segmented by the vertical projection method; the segmentation is then checked using feedback from recognition, and wrongly divided characters are re-segmented. Experimental results show that touching characters and Chinese characters with a left-right structure are segmented correctly after checking and re-segmentation.

(2) The convolutional neural network is introduced in detail, and a data set of 6742 characters is constructed based on the characteristics of the characters on the ID card. Convolutional neural networks with different numbers of layers are then built, and the character recognition performance of each model is trained and tested iteratively. The experimental results show that the 4-layer convolutional neural network gives the best recognition results after 15000 training iterations.

(3) Finally, according to the characteristics of China's second-generation ID card, an ID card recognition system is built to detect and recognize the name, gender, address, and ID number on the card. Simulation experiments show that the system is fast and achieves a high recognition rate, with recognition accuracy reaching 99.6%.
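
The vertical projection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the `min_ink` parameter, and the toy image are assumptions made for illustration, and the recognition-feedback re-segmentation step is not shown. The sketch assumes a binarized image in which foreground (ink) pixels are 1 and background pixels are 0.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count foreground pixels (value 1) in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def segment_columns(binary: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, whose ink count is >= min_ink.

    min_ink is a hypothetical noise threshold, not a parameter from the thesis.
    """
    profile = vertical_projection(binary)
    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = x                  # entering a character region
        elif ink < min_ink and start is not None:
            spans.append((start, x))   # leaving a character region
            start = None
    if start is not None:              # region runs to the right edge
        spans.append((start, len(profile)))
    return spans


# Toy 3x10 binary image containing three separated "characters".
image = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
]

print(vertical_projection(image))
print(segment_columns(image))
```

Plain projection of this kind splits wherever an empty column appears, so it would cut a left-right structured Chinese character in two and would fail to separate touching characters, which is the case the abstract's checking and re-segmentation step is meant to handle.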

Keywords/Search Tags:

optical character recognition, Hough transform, geometric correction, machine learning, convolutional neural network

Related items

1	Design And Implementation Of Machine Understanding Mathematical Geometric Diagram System
2	Research On Object Detection Based On Hough Transform In Complex Scene
3	Research On Industrial Character Recognition Method Based On Convolutional Neural Network
4	The Research Of Optical Character Recognition Orient Digital Resource Aggregation Platform
5	Research On Algorithm And Application Of Deep Learning Based On Convolutional Neural Network
6	Research On Vehicle License Plate Automatic Recognition
7	Research And Implementation Of IC Chip Surface Character Recognition Algorithm
8	Research For Geometric Distortion Correction Technology Of Text Image
9	Research On Convolutional Neural Network-based Scene Text Detection And Multi-orientational Character Recognition
10	Character Recognition And Localization Of Engineering Drawing Component Based On High Order Convolutional Neural Network