
A Method Of Locating Mathematical Expressions In Chinese Printed Document

Posted on: 2010-11-21
Degree: Master
Type: Thesis
Country: China
Candidate: X F Chang
Full Text: PDF
GTID: 2178360302461963
Subject: Computer application technology
Abstract/Summary:
A recognition system for printed mathematical expressions consists of four main components: expression location, symbol recognition, structural analysis, and expression reconstruction. Location is the first step of recognition and is the focus of this thesis. Mathematical expressions in scientific and technical documents fall into two categories: isolated expressions and embedded expressions. Targeting the characteristics of Chinese documents, the thesis presents a location approach that combines a decision tree with a BP (back-propagation) neural network and handles the two categories separately. For isolated expressions, it analyzes the properties of text lines with ID3, builds a decision tree, and applies the generated rules to locate the isolated expressions in a document. For embedded expressions, it extracts features from the text lines that remain once the isolated expressions are excluded and trains a BP neural network to locate them. Experiments show that the approach locates mathematical expressions in printed Chinese documents with high accuracy, fault tolerance, and speed.
Keywords/Search Tags:OCR, Mathematical expressions recognition, Mathematical expressions locating, Decision tree, BP neural network
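
The abstract names ID3 as the way the decision tree for isolated expressions is built but does not list the text-line features used. As a rough illustration only, the sketch below shows the information-gain criterion that ID3 uses to choose the attribute to split on. The feature names (`centered`, `has_cjk`), the labels, and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """ID3 splitting step: pick the feature with the highest gain."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical text-line features; NOT the features used in the thesis.
rows = [
    {"centered": True, "has_cjk": False},
    {"centered": True, "has_cjk": False},
    {"centered": False, "has_cjk": True},
    {"centered": False, "has_cjk": True},
    {"centered": True, "has_cjk": True},
    {"centered": False, "has_cjk": False},
]
labels = ["formula", "formula", "text", "text", "text", "formula"]

root = best_feature(rows, labels, ["centered", "has_cjk"])
```

In this toy data, splitting on `has_cjk` separates the two classes completely, so ID3 would place it at the root of the tree. A full ID3 implementation would recurse on each subset until the labels are pure or no features remain.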