
Research On Recognition Technologies Of Printed Mathematical Expression

Posted on: 2015-02-13  Degree: Master  Type: Thesis
Country: China  Candidate: Y H Zong  Full Text: PDF
GTID: 2268330422471742  Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer technology and the wide spread of networks, digitizing books and literature has become an important task. Current OCR technology can efficiently recognize Chinese and English characters and numbers, so most printed information can be digitized through OCR. The recognition of mathematical expressions, however, does not yet give satisfactory results. The reason is that, compared with ordinary text, a mathematical expression has a two-dimensional structure, and the logical relationships between its characters are difficult to determine. Analyzing and recognizing a mathematical expression therefore requires not only segmenting and recognizing individual characters but also analyzing the overall structure of the expression.

This thesis studies and implements the main stages of mathematical expression recognition. An input expression image passes through preprocessing, character segmentation, character recognition, and structure analysis, and the result is finally translated into LaTeX text. The specific work of this thesis is as follows:

1) Image preprocessing. To meet the needs of subsequent processing, redundancy and various kinds of interference in the image are eliminated through image filtering, binarization, image thinning, etc.

2) Character segmentation. On the basis of existing algorithms, the thesis combines the connected-region method with the projection method. Eventually, every single character is extracted and its spatial coordinates are determined.

3) Character recognition. This thesis uses a template matching method based on feature extraction to recognize every character. Character recognition consists of character feature extraction and comparison against a character database.

4) Structure analysis. This is the key process of mathematical expression recognition. This thesis proposes a method called Partitioned Tree Transformation. First, the method classifies the mathematical expression and divides it into several sub-modules; then each module is processed. On the basis of character segmentation and character recognition, the thesis obtains all the structural information of the mathematical expression by analyzing the spatial relations between characters.

5) Expression display. The description methods for mathematical expressions are introduced first, and LaTeX is then used as the language to display the expressions.
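To make the idea of reading structure from the spatial relations of characters more concrete, the following is a minimal sketch in Python. It is not the Partitioned Tree Transformation method of the thesis, whose details the abstract does not give. It is a much simpler left-to-right heuristic that handles only superscripts and subscripts. The `Symbol` class, the `relation` and `to_latex` functions, and the thresholds `tol` and `0.8` are all hypothetical names and values chosen for illustration. The sketch assumes that character segmentation and recognition have already produced a label and a bounding box for each character.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Symbol:
    """A recognized character with its bounding box (hypothetical structure)."""
    label: str  # recognized character, e.g. "x"
    x: float    # left edge of the bounding box
    y: float    # top edge (image coordinates: y grows downward)
    w: float    # box width
    h: float    # box height

    @property
    def cy(self) -> float:
        """Vertical center of the bounding box."""
        return self.y + self.h / 2.0


def relation(base: Symbol, sym: Symbol, tol: float = 0.25) -> str:
    """Classify sym relative to base as 'sup', 'sub', or 'inline'.

    Assumed heuristic: a clearly smaller symbol whose center lies well
    above (below) the center of the base is a superscript (subscript).
    """
    offset = (sym.cy - base.cy) / base.h  # negative means higher in the image
    smaller = sym.h < 0.8 * base.h
    if smaller and offset < -tol:
        return "sup"
    if smaller and offset > tol:
        return "sub"
    return "inline"


def to_latex(symbols: List[Symbol]) -> str:
    """Emit a LaTeX string for a single-line expression with scripts."""
    ordered = sorted(symbols, key=lambda s: s.x)  # read left to right
    out: List[str] = []
    base: Optional[Symbol] = None  # most recent symbol on the main baseline
    mode = "inline"                # script group currently open, if any
    for sym in ordered:
        rel = "inline" if base is None else relation(base, sym)
        if rel != mode:
            if mode != "inline":
                out.append("}")    # close the open script group
            if rel == "sup":
                out.append("^{")
            elif rel == "sub":
                out.append("_{")
            mode = rel
        out.append(sym.label)
        if rel == "inline":
            base = sym             # only baseline symbols become the new base
    if mode != "inline":
        out.append("}")
    return "".join(out)


if __name__ == "__main__":
    expr = [
        Symbol("x", 0, 10, 10, 10),
        Symbol("2", 11, 4, 6, 6),   # small and raised: superscript of x
        Symbol("+", 20, 11, 8, 8),
        Symbol("y", 30, 10, 10, 10),
        Symbol("i", 41, 17, 5, 6),  # small and lowered: subscript of y
    ]
    print(to_latex(expr))
```

With the sample boxes above, the script prints `x^{2}+y_{i}`. Fractions, radicals, matrices, and nested scripts need a tree-structured analysis rather than this flat scan, which is the kind of problem the structure analysis stage of the thesis addresses.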
Keywords/Search Tags: Mathematical Expression, Character Recognition, Structure Analysis, LaTeX