| With the development of computer technology, informatization, digitization and intelligence have become the trend of the current era and of social development. In many mechanized production settings, machines can already replace humans in completing some complex tasks, which greatly simplifies people’s lives. In the field of examination in particular, intelligent marking based on handwritten text recognition has become an inevitable requirement for the development of examination and evaluation. By directly transcribing the scanned images of handwritten answers on the answer sheet into text, it facilitates subsequent storage and analysis and enables informatized, intelligent management of examinees’ papers. Although research on handwritten text recognition has made remarkable progress, paper handwritten text recognition still faces many challenges because of the diversity of handwritten characters, the lack of relevant datasets and the rigor of exams. As a result, research on deep-learning-based paper handwritten text recognition for the field of education is still immature. Considering these problems, this thesis focuses on English text and on the intelligent recognition of image and text data. It researches and implements the core technology of handwritten English character recognition for examination papers using deep learning methods and successfully applies it to handwritten English recognition in real test papers, which can promote the intelligence of the marking process and the long-term development of education. Specifically, because examinees write in different scenarios when answering, this thesis develops two paper handwritten English recognition methods, one based on the word unit and one based on the line unit, according to the length of the English sequence, and designs a separate recognition model for each. On this basis, a handwritten English recognition system is constructed for the examination
application. (1) For the English word recognition problem, this thesis proposes a word-based paper handwritten English recognition model. First, a specially designed visual feature encoder is proposed to capture the rich temporal and contextual information in the visual representation. Second, to explicitly model the rich semantic information carried in words, a semantic modeling module guided by a language prior is proposed. Within this module, a semantic prediction module infers the latent global semantic information in the visual features, and a semantic supervision module then obtains a semantically enhanced global representation under the prior supervision of a pre-trained language model. Third, a semantic-guided decoder is proposed to effectively use and integrate the features of both the visual and semantic modalities. Finally, to correct errors in the model’s initial prediction, a dictionary-based error correction post-processing method is proposed, which reconciles the linguistic and visual information and improves recognition accuracy. (2) For the English line recognition problem, this thesis proposes a line-based paper handwritten English recognition model. First, to avoid the sequential delay of recurrent neural networks, a fully convolutional recognition architecture is proposed. To improve the model’s efficiency, an efficient convolution is proposed to replace standard convolution, which effectively reduces the number of parameters and the computational overhead. Second, to extract features thoroughly and in depth, a stacked block architecture is proposed that obtains a more efficient and robust visual representation through a novel gate module. Third, to make the most of the limited datasets and achieve accurate recognition, a statistic decoder is proposed to predict the number of characters and help the Connectionist Temporal Classification (CTC) decoder constrain the
transcription process. Finally, a statistic loss function is proposed which, combined with the CTC loss, yields more reliable recognition results. (3) To evaluate and verify the two recognition models proposed in this thesis, extensive experiments are performed on the two corresponding datasets. The results show that the proposed methods achieve advanced performance in real test scenarios and confirm the effectiveness of each module in the models. (4) On the basis of this model verification, this thesis constructs a handwritten English recognition system for examination applications. By calling the two recognition models, the system transcribes the scanned images of translation and composition questions in uploaded English papers into text, so that the answers can be examined visually. |
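
Two of the steps named in the abstract, CTC decoding and dictionary error correction, can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The CTC part shows only the standard greedy collapse rule (merge repeated labels, then drop blanks) and omits the statistic decoder's character-count constraint. The correction part assumes nearest-neighbour lookup by Levenshtein distance, which the abstract does not confirm. The names `ctc_greedy_collapse`, `edit_distance`, `correct_with_lexicon`, the blank index `0` and the threshold `max_distance` are hypothetical.

```python
def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse a per-frame best path: merge repeated labels, then drop blanks."""
    out = []
    prev = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


def edit_distance(a, b):
    """Levenshtein distance between two strings, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct_with_lexicon(word, lexicon, max_distance=2):
    """Return the closest lexicon entry within max_distance, else the word itself.

    Assumption: correction is a nearest-neighbour search over a word list.
    The threshold is illustrative, not taken from the thesis.
    """
    if not lexicon or word in lexicon:
        return word
    best = min(lexicon, key=lambda w: edit_distance(word, w))
    return best if edit_distance(word, best) <= max_distance else word
```

For example, with the hypothetical lexicon `["language", "luggage", "image"]`, the misrecognized word `langage` is one insertion away from `language` and is corrected to it, while a string far from every entry is returned unchanged. A real system would probably also weigh the recognizer's confidence before overriding a prediction.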