Formula extraction is a key step in the recognition of printed mathematical formulas. An extraction algorithm designed for one language performs worse when applied to another, because of the differences between the two languages. It is therefore necessary to identify the language of a document before extraction in order to obtain better results. The spacing between adjacent connected components in a text line fluctuates more in English text than in Chinese text, and this property is little affected by mathematical symbols. The spacings between adjacent connected components in each text line are therefore computed and used to classify scientific documents containing mathematical formulas as English or Chinese. An extraction algorithm suited to the identified language is then selected to extract the formulas. A fast correction function is also designed, so that errors in the extraction results caused by the image or by the algorithm can be corrected quickly under the user's guidance. The results show the effectiveness of the proposed algorithm.
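
The spacing-based language classification described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the connected components of one text line are already available as horizontal extents `(x_left, x_right)`, it measures spacing fluctuation with the coefficient of variation, and the names `gap_variation` and `classify_line` and the threshold of `0.5` are hypothetical choices made only for illustration.

```python
from statistics import mean, pstdev


def gap_variation(boxes):
    """Return the coefficient of variation of the gaps between adjacent
    connected components in one text line.

    boxes: list of (x_left, x_right) pixel extents, one per component.
    The fluctuation measure is an assumption of this sketch.
    """
    ordered = sorted(boxes)  # left-to-right order
    # Gap between the right edge of one component and the left edge of the
    # next; overlapping components are treated as having zero gap.
    gaps = [max(0, nxt[0] - cur[1]) for cur, nxt in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return 0.0
    avg = mean(gaps)
    if avg == 0:
        return 0.0
    return pstdev(gaps) / avg


def classify_line(boxes, threshold=0.5):
    """Label a text line 'english' if its gap fluctuation exceeds the
    threshold, otherwise 'chinese'. The threshold is illustrative only."""
    return "english" if gap_variation(boxes) > threshold else "chinese"


# Synthetic line with small gaps inside words and large gaps between words.
english_like = [(0, 8), (10, 18), (20, 28), (38, 46), (48, 56), (66, 74)]
# Synthetic line with nearly uniform gaps between characters.
chinese_like = [(0, 20), (24, 44), (48, 68), (73, 93), (97, 117)]

print(classify_line(english_like))
print(classify_line(chinese_like))
```

In practice a threshold would have to be tuned on labelled text lines, and a decision for a whole document could be taken by majority vote over its lines; both steps are assumptions that go beyond what the abstract states.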