
Research On Offline Handwritten Mathematical Expression Recognition Algorithm Based On Encoder-Decoder

Posted on: 2022-06-13  Degree: Master  Type: Thesis
Country: China  Candidate: J Hu  Full Text: PDF
GTID: 2530306326973579  Subject: Computer Science and Technology
Abstract/Summary:
Mathematical expressions are widely used in many fields, such as scientific research, economics, and statistics. In these fields, typesetting systems (e.g., LaTeX) and expression-editing software (e.g., MathType) are used to enter expressions into electronic devices. These methods, however, require the user to master a large number of syntax rules. Recently, another input method has emerged: users write the mathematical expression by hand on a writing device. This makes it easier to edit scientific documents in which mathematical expressions are heavily used. As a result, the need for automatic recognition of handwritten mathematical expressions is increasingly pressing, and the development of smartphones and other handwriting input devices is also driving research in this area. Although existing recognition approaches can recognize mathematical expressions fairly well, they still have some deficiencies. This thesis addresses these deficiencies, and its main contributions are as follows:

(1) We propose a string decoder with dual attention modules to deal with the attention drift that occurs during decoding in existing models. A coverage attention module introduces historical alignment information, while a position attention module introduces decoding position information. A dynamic fusion module is added to fuse the outputs of the two attention modules adaptively. Experimental results show that the proposed decoder structure effectively alleviates attention drift and improves the recognition performance of the model.

(2) We add a center mask detection module that introduces the center mask as additional supervision. This addresses the problem that a model which learns alignment and classification together may learn wrong classifications when it fails to learn the alignment. Since the whole model is trained jointly, the module allows the encoder to encode better deep features. Moreover, the decoder's attention mechanism can be guided to attend to all existing mathematical symbols and to focus as much as possible on the center of each symbol. Compared with current mainstream recognition methods, the proposed method achieves a higher recognition rate on both the CROHME2014 and CROHME2016 test sets.

(3) We propose a new data augmentation method to deal with the lack of training data for handwritten mathematical expressions. It generates new expression samples by randomly replacing symbols. Applying this data augmentation method to different models improves the recognition performance of each model.
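The symbol-replacement augmentation in contribution (3) is not specified in detail in the abstract. The following is only a minimal sketch of the general idea, working on the LaTeX label sequence alone. The symbol classes, the `replace_prob` parameter, and the function name `augment` are illustrative assumptions, not the thesis's actual procedure. A real pipeline would also have to replace the corresponding handwritten symbol in the image, which is omitted here.

```python
import random

# Hypothetical symbol classes (an assumption for illustration): a token is
# only swapped for another token of the same class, so the structure of the
# expression is left unchanged.
SYMBOL_CLASSES = [
    ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
    ["a", "b", "c", "x", "y", "z"],
    ["\\alpha", "\\beta", "\\gamma", "\\theta"],
    ["+", "-"],
]
CLASS_OF = {tok: cls for cls in SYMBOL_CLASSES for tok in cls}


def augment(tokens, replace_prob=0.3, rng=None):
    """Return a copy of `tokens` with some symbols randomly replaced.

    Tokens that belong to no class (braces, \\frac, ...) are kept as-is.
    """
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        cls = CLASS_OF.get(tok)
        if cls is not None and rng.random() < replace_prob:
            # Pick a different symbol from the same class.
            out.append(rng.choice([t for t in cls if t != tok]))
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    label = ["\\frac", "{", "x", "+", "1", "}", "{", "2", "}"]
    print(" ".join(augment(label, rng=random.Random(0))))
```

Restricting replacement to same-class symbols is one simple way to keep the generated label syntactically valid. Whether the thesis applies such a restriction is not stated in the abstract.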
Keywords/Search Tags: Encoder-Decoder, Handwritten Expression Recognition, Attention Mechanism, Data Augmentation