
Research On Key Technologies Of Printed Mathematical Expression Recognition

Posted on: 2021-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: J P Chen
GTID: 2428330611499770
Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, with the rapid development of computer technology and the spread of the Internet, electronic texts have become one of the main ways people obtain information. However, some electronic materials are stored as images, which makes them difficult to retrieve and reuse. Thanks to advances in printed text recognition, most text stored in image form can be converted into an editable text format. Mathematical expressions, however, have complex two-dimensional structures and flexible notation, so they cannot yet be converted accurately and completely. Conventional mathematical expression recognition methods are divided into three stages: character segmentation, character recognition and structure analysis. Errors made in one stage are often passed on to the next, which degrades overall recognition performance. To address this problem, this dissertation proposes a method for recognizing printed mathematical expressions based on global information. The method takes full account of the interaction between character segmentation, character recognition and structure analysis, and uses the context information and syntax information of the formula to recognize the expression. For character segmentation, a hybrid method based on a merging strategy is proposed, which effectively solves the over-segmentation of characters made up of multiple connected components. For character recognition, an improved LeNet-5 network model is presented. The network structure is adjusted for the specific problem of mathematical symbol recognition, and the parameter settings of the network are also optimized, yielding a mathematical character recognition model with fast training, a high recognition rate and strong generalization ability. For structural analysis, this dissertation analyzes the geometric characteristics of mathematical formulas and uses the central deflection angle as a structural feature to build a discriminant model for the spatial relationships between characters, which is then used to analyze the geometric spatial relationships within a mathematical expression. On the grammatical and semantic side, a two-dimensional context-free grammar is developed that can express the structural relationships of the most common mathematical expressions. Through the grammar, the original recognition problem is transformed into the problem of constructing the most probable structural analysis tree, which fuses information from character segmentation, character recognition, geometric structure analysis and grammar analysis. The result with the highest recognition probability can therefore be obtained according to the context information, and the recognition accuracy of printed mathematical expressions is effectively improved.
Keywords/Search Tags: mathematical expression recognition, convolutional neural network, structural analysis, two-dimensional context-free grammars
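
The abstract names the central deflection angle as the structural feature for deciding how two characters relate spatially, but does not define it. The sketch below is one plausible reading, not the dissertation's model: it assumes the angle is measured between the horizontal and the line joining the centers of two bounding boxes. The `Box` type, the function names and the 15 and 75 degree thresholds are all hypothetical, chosen only for illustration.

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned bounding box in image coordinates (y grows downward)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)


def deflection_angle(base: Box, other: Box) -> float:
    """Angle in degrees between the horizontal and the line from the
    center of `base` to the center of `other`. Positive values mean
    `other` sits higher on the page than `base`."""
    bx, by = base.center
    ox, oy = other.center
    # Negate the vertical offset because image y grows downward.
    return math.degrees(math.atan2(-(oy - by), ox - bx))


def classify_relation(base: Box, other: Box,
                      flat: float = 15.0, steep: float = 75.0) -> str:
    """Map the angle to a coarse spatial relation.

    Assumes symbols are visited left to right, so `other` is never to the
    left of `base`. The thresholds are illustrative, not taken from the
    dissertation.
    """
    angle = deflection_angle(base, other)
    if abs(angle) > 90.0:
        raise ValueError("other lies to the left of base")
    if abs(angle) <= flat:
        return "horizontal"
    if flat < angle < steep:
        return "superscript"
    if -steep < angle < -flat:
        return "subscript"
    return "above" if angle > 0 else "below"


if __name__ == "__main__":
    x = Box(0, 20, 20, 20)          # a base symbol such as "x"
    exponent = Box(22, 5, 10, 10)   # a small symbol up and to the right
    print(classify_relation(x, exponent))
```

A real discriminant model would probably also use relative size and baseline position, and would output probabilities rather than hard labels so that the grammar-based parse can weigh competing interpretations.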