Key Technologies Of Handwritten Character Recognition For Chinese Examination Paper

Posted on: 2021-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Zhang
GTID: 2428330602984000
Subject: Computer technology
Abstract/Summary:
Handwritten character recognition converts text in images into computer-editable text, which is of great significance for preserving and using text information. Because examination papers contain many kinds of characters, handwritten Chinese characters are structurally complex, candidates' handwriting is inconsistent, and educational assessment demands rigor, handwritten character recognition for education requires higher recognition accuracy. Character recognition technologies based on deep learning have made good progress, but research on handwritten characters in the field of education is still in its infancy. The purpose of this thesis is to research and implement key technologies of handwritten character recognition for Chinese examination papers, enabling the digital storage and use of answer card content, promoting intelligent marking, and further advancing the intelligent development of education.

To meet the higher accuracy requirement of the examination scenario, this thesis mainly adopts a single-character recognition model. After the answer card characters are segmented, different convolutional network models are designed, mainly a handwritten digit recognition model and handwritten Chinese character recognition models. Satisfactory recognition accuracy is achieved on 3768 character classes in two real test datasets. The main contributions of the thesis are:

(1) For the segmentation problem in single-character recognition, this thesis researches and implements handwritten Chinese character segmentation for examination papers. It designs a dynamic line segmentation method based on projection and uses a local minimum search algorithm to find the segmentation trajectory between adjacent text lines, segmenting the answer card into multiple images that each contain a single line of characters. For a single line of text, it implements over-segmentation based on the Viterbi algorithm: it constructs a hidden Markov model of the character image, finds non-linear segmentation paths, and then uses heuristic rules to delete redundant segmentation lines. An A* search is then carried out to find the combination of segmentation paths with the minimum cost, and the over-segmentation paths are merged accordingly, so that the text line is finally split into multiple images that each contain a single character.

(2) For handwritten digit recognition, a recognition model based on a convolutional neural network is designed to recognize candidates' examination numbers, seat numbers, and handwritten digits in various examination papers, achieving a recognition accuracy of 99% on a real test dataset.

(3) For handwritten Chinese character recognition, different recognition models based on convolutional neural networks are designed to achieve high-accuracy recognition of 3755 classes of handwritten Chinese characters, 12 classes of punctuation, and crossed-out characters on the examination answer card. They mainly include: an improved model based on AlexNet, which uses small convolution kernels suited to extracting handwritten character features, builds a multi-layer convolutional neural network structure, and randomly augments character images in various ways to enhance sample diversity; a recognition model based on binary classification, which introduces prior knowledge of character images on top of convolutional neural network learning and improves the accuracy of punctuation mark recognition by using the binary classification probability of Chinese punctuation; and a modified loss function model, which adds cosine and angular margins to the softmax loss function to accentuate the differences between character classes and increase the accuracy of character classification.

Finally, on two real test datasets, a recognition accuracy of 94% is achieved for 3768 character classes, which is better than the comparison methods.
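The modified loss function in contribution (3) can be illustrated with a small sketch. The abstract does not give the exact formula, so the form below is an assumption: the target-class logit is computed as scale * (cos(theta + ang_margin) - cos_margin), a common way of combining an additive angular margin with an additive cosine margin. The function name, the parameter names, and the default values are hypothetical and are not taken from the thesis.

```python
import math

def margin_softmax_loss(cosines, target, scale=30.0, cos_margin=0.1, ang_margin=0.2):
    """Cross-entropy over scaled cosine logits, with margins on the target class only.

    cosines: cos(theta_j) between the L2-normalized feature and each
             L2-normalized class weight vector (one value per class).
    target:  index of the true class.
    All defaults are illustrative assumptions, not values from the thesis.
    """
    logits = []
    for j, c in enumerate(cosines):
        if j == target:
            c = max(-1.0, min(1.0, c))          # guard acos against rounding
            theta = math.acos(c)
            # Additive angular margin (clamped at pi so the cosine keeps
            # decreasing), followed by the additive cosine margin.
            c = math.cos(min(math.pi, theta + ang_margin)) - cos_margin
        logits.append(scale * c)
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]

# With both margins set to zero this reduces to plain softmax loss on
# scaled cosines; positive margins make the same sample look harder.
plain = margin_softmax_loss([0.8, 0.3, -0.1], target=0, cos_margin=0.0, ang_margin=0.0)
with_margin = margin_softmax_loss([0.8, 0.3, -0.1], target=0)
```

Because the margins lower only the true-class logit, the network has to separate that class from the others by a larger angle before the loss becomes small, which is the intuition behind accentuating the differences between character classes.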
Keywords: deep learning, handwritten Chinese character segmentation, handwritten Chinese character recognition