The Study Of Mathematical Formula Extraction With The Script Identification

Posted on: 2012-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: B Chen
Full Text: PDF
GTID: 2298330338995366
Subject: Computer application technology
Abstract/Summary:
Formula extraction is a key step in printed mathematical formula recognition. An extraction algorithm designed for one language performs worse when it is applied to another, because of the differences between the two languages. It is therefore necessary to identify the language of a document before extraction in order to obtain better results.

The spacing between adjacent connected components (connected domains) fluctuates more in an English text line than in a Chinese one, and mathematical symbols have little effect on this measure. The spacings between adjacent connected components in each text line are therefore computed to classify English and Chinese scientific documents that contain mathematical formulas, and an extraction algorithm suited to the identified language is then selected to extract the formulas. A fast correction function is also designed, so that errors in the extraction results caused by the image or by the algorithm can be corrected quickly under the guidance of the user. The results show that the designed algorithm is effective.
Keywords/Search Tags: Printed mathematical formula recognition, Formula extraction, Discrimination between English and Chinese, Spacing between connected domains, Fast correction
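
The spacing-based language classification described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the use of the coefficient of variation as the measure of fluctuation, and the threshold value of 0.5 are all assumptions made for illustration. Each connected component in a text line is represented only by its horizontal extent, a hypothetical (x_left, x_right) pair.

```python
from statistics import mean, pstdev


def adjacent_gaps(boxes):
    """Horizontal gaps between neighbouring component boxes in one text line.

    boxes: iterable of (x_left, x_right) pairs (hypothetical representation).
    Overlapping components yield a gap of 0.
    """
    ordered = sorted(boxes)
    gaps = []
    for (_, right), (next_left, _) in zip(ordered, ordered[1:]):
        gaps.append(max(0, next_left - right))
    return gaps


def gap_variation(boxes):
    """Coefficient of variation of the gaps (assumed fluctuation measure)."""
    gaps = adjacent_gaps(boxes)
    if len(gaps) < 2:
        return 0.0
    avg = mean(gaps)
    if avg == 0:
        return 0.0
    return pstdev(gaps) / avg


def classify_line(boxes, threshold=0.5):
    """Label a line 'english' when its gaps fluctuate strongly, else 'chinese'.

    The threshold is an illustrative assumption, not a value from the thesis.
    """
    return "english" if gap_variation(boxes) > threshold else "chinese"


# Evenly spaced blocks, as in a Chinese text line.
chinese_like = [(0, 20), (24, 44), (48, 68), (72, 92)]
# Narrow letter gaps with one wider word gap, as in an English text line.
english_like = [(0, 8), (10, 18), (20, 28), (38, 46), (48, 56)]

print(classify_line(chinese_like))
print(classify_line(english_like))
```

In practice the threshold would have to be tuned on labelled text lines, and a document-level decision could be taken by voting over the labels of its lines.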