Handwritten Chinese Text Recognition Based On Deep Convolution Model

Posted on: 2020-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: G C Shang
Full Text: PDF
GTID: 2428330599960446
Subject: Engineering
Abstract/Summary:
Text recognition covers both text position prediction and text content analysis. The handwritten texts studied in this thesis include handwritten digits and handwritten Chinese characters. Traditional handwritten digit recognition methods include support vector machines, nearest-neighbor classifiers and random forests, but handwritten digits have few texture features, effective information is difficult to extract, and the accuracy of these classifiers is not high. Traditional handwritten text recognition is mostly based on single characters. Recognition methods for whole text lines are rare, and their implementation is limited to a patchwork of image preprocessing, character segmentation, feature extraction and classifier design. In short, traditional text recognition methods and models generally use shallow features, fail to abstract common features from large-scale data, and give unsatisfactory results. In view of these problems, this thesis analyzes the difficulties and key technologies in handwritten text recognition, proposes solutions, and verifies them through experiments. The main research contents are as follows:

(1) A handwritten digit recognition method based on an improved VGG16 convolutional network is proposed. A learning rate annealing algorithm is integrated into the SGD optimizer to optimize the network's learning process. The recognition accuracy is improved to 99.98% on the enhanced MNIST dataset.

(2) The RRPN network is used to extract candidate regions for tilted text lines. The RRCNN network is used to detect the tilted text lines and regress their positions. Finally, a BLSTM network is integrated into the network to precisely locate the start and end positions of the text lines.

(3) To address the diverse styles and stroke adhesion of handwritten Chinese characters, a new end-to-end text line recognition method that requires no segmentation is proposed. The DCN network is used to extract the text line feature sequence, and the MultiBLSTM network is used to learn the spatial context information of the text. The text line feature sequences are classified by a CTC layer, and an N-gram language model is applied as a constraint to obtain the final text, which avoids the difficulties of image preprocessing and character segmentation. An accuracy of 92% is obtained on the handwritten text dataset HWDB2, which demonstrates the superiority of the model.

Finally, an application of this research to answer sheet recognition is presented, and ideas and solutions for the automatic recognition of answer sheets are proposed.
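The segmentation-free recognition step in (3) relies on a CTC layer turning per-frame label scores into a character sequence. The sketch below illustrates only the simplest form of that idea, greedy (best-path) CTC decoding: take the top label in each frame, merge consecutive repeats, then drop the blank label. It is not the thesis's implementation and does not include the N-gram language model; the function name `ctc_greedy_decode`, the blank index 0, and the two-character toy alphabet are all assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index at every time step.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    # Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove the blank label, which only separates characters.
    return [label for label in collapsed if label != blank]


# Toy input: 3 labels (0 = blank, 1 and 2 = characters), 6 frames.
alphabet = {1: "手", 2: "写"}  # hypothetical label-to-character map
scores = [
    [0.10, 0.80, 0.10],  # label 1
    [0.10, 0.70, 0.20],  # label 1 again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.10, 0.70],  # label 2
    [0.60, 0.20, 0.20],  # blank
    [0.10, 0.10, 0.80],  # label 2, kept because a blank separates it
]
decoded = ctc_greedy_decode(scores)
text = "".join(alphabet[i] for i in decoded)
```

Because repeats are merged before blanks are removed, a blank between two identical labels is what lets the same character appear twice in a row. A decoder that also applies a language model would typically replace the per-frame argmax with a beam search over candidate sequences.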
Keywords/Search Tags: Handwritten digit recognition, Learning rate annealing algorithm, Tilted text line positioning, Text line recognition without segmentation