Research on Recognition of Typeset Mathematical Expressions

Posted on: 2006-07-31 | Degree: Master | Type: Thesis
Country: China | Candidate: F H Li | Full Text: PDF
GTID: 2168360155450342 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, converting printed information sources into electronic form has become important. Scientific documents contain many mathematical expressions, which are difficult to enter into a computer by hand, so OCR technology is important for recognizing the expressions in these documents. Although OCR achieves satisfactory recognition results for Chinese and English characters and for digits, it is far less effective at recognizing mathematical expressions. Because mathematical expressions are two-dimensional and their symbols can carry a variety of meanings, segmentation and structural analysis are very difficult. Recognition of mathematical expressions has therefore become a hot topic in OCR research.

Recognition of typeset mathematical expressions is a complex process that can be divided into three steps: extraction of mathematical expressions, analysis and recognition of mathematical expressions, and structure reconstruction of mathematical expressions. The analysis and recognition step strongly affects structure reconstruction and largely determines the efficiency of mathematical expression recognition, so it is crucial to a mathematical expression recognition system.

This thesis presents a quantitative study of the analysis and recognition phase, as follows. In symbol recognition, a symbol segmentation algorithm based on connected areas is put forward, which improves segmentation accuracy, and a segmentation method based on feedback from the symbol recognition results is applied to successfully segment touching (conglutinated) symbols in mathematical expressions.
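The abstract does not give the details of the connected-area segmentation algorithm, so the following is only a generic sketch of the underlying idea: label the 8-connected foreground regions of a binarized image and return one bounding box per region. The function name `connected_components`, the list-of-rows image format, and the choice of 8-connectivity are assumptions made for illustration, not the thesis's method.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the 8-connected
    foreground regions in a binary image given as a list of rows of 0/1.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    # Order candidate symbols left to right, then top to bottom.
    boxes.sort(key=lambda b: (b[1], b[0]))
    return boxes
```

Plain connected-component labeling has two well-known limits in this setting: a symbol made of several strokes (such as "=" or "i") yields more than one box, and touching symbols yield a single box. The second case is the one the abstract addresses with recognition feedback.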
In structural analysis, we analyze the layout of the mathematical symbols with a method that combines the "top-down" and "bottom-up" approaches, building on the results of symbol recognition. In experiments, this method adapted well to the structure of typeset mathematical expressions.
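The abstract does not specify how spatial relations between symbols are decided. As a rough illustration of the kind of test a bottom-up pass might apply to two recognized symbols, the sketch below classifies one bounding box relative to another by comparing vertical centers. The function name `relation`, the box format `(top, left, bottom, right)`, and the 0.25 threshold are all assumptions chosen for the example, not values from the thesis.

```python
def relation(base, other, threshold=0.25):
    """Classify `other` relative to `base` as 'superscript', 'subscript',
    or 'inline'. Boxes are (top, left, bottom, right), with y growing
    downward as in image coordinates.

    Hypothetical rule for illustration; the threshold is an assumption.
    """
    b_top, _, b_bottom, _ = base
    o_top, _, o_bottom, _ = other
    b_height = b_bottom - b_top
    b_center = (b_top + b_bottom) / 2
    o_center = (o_top + o_bottom) / 2
    # Vertical offset of the centers, normalized by the base symbol height.
    offset = (o_center - b_center) / b_height if b_height else 0.0
    if offset < -threshold:
        return "superscript"
    if offset > threshold:
        return "subscript"
    return "inline"


base = (10, 0, 30, 10)
print(relation(base, (5, 12, 15, 18)))   # box centered above the base center
print(relation(base, (25, 12, 35, 18)))  # box centered below the base center
```

A top-down pass would work in the other direction, for example by first splitting the expression at dominant operators such as fraction bars and then analyzing each part recursively.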
Keywords/Search Tags: OCR, Mathematical formula recognition, Symbol recognition, Structural analysis, Formula structure reconstruction