Research And Implementation On Printed Mathematical Formula Recognition

Posted on: 2013-02-21  Degree: Master  Type: Thesis
Country: China  Candidate: Z Q Yu  Full Text: PDF
GTID: 2218330371960712  Subject: Computer software and theory
Abstract/Summary:
OCR technology is developing rapidly and is now accurate enough to be used for producing electronic books. Although OCR recognizes digits and words well, its results on mathematical formulas are not ideal, because a formula has a complex structure and the logical relationships between its characters are also complex. Recognizing a formula correctly requires not only recognizing each individual mathematical symbol but also accurately analyzing the structure of the formula. This thesis designs a mathematical formula recognition system that recognizes printed mathematical formulas. The system takes an image as input and, after image preprocessing, character segmentation, character recognition, structure analysis, and other processing steps, outputs the result as text in TXT format. Character segmentation combines a repeated (cyclic) projection method with a region segmentation method: first, vertical and horizontal projections are applied in alternation to divide the formula into character sub-blocks; second, the region method separates the character blocks that projection cannot split. This hybrid method performs better than either method alone. For touching (adhesion) characters, the thesis proposes a segmentation method based on rectangular frames, which uses the character widths and heights stored in the character library to separate the touching characters and adds a validation step after segmentation, thereby reducing the probability of segmentation errors. For structure analysis, it proposes a method that combines character-block encoding with the construction of a bifurcation tree: the blocks are coded in order according to the vertical and horizontal cuts made during formula segmentation, which lays the foundation for structure analysis.
In the structure analysis phase, the codes and the bifurcation tree are used to analyze and reconstruct the character blocks. The thesis analyzes a variety of methods used in each phase of mathematical expression recognition, implements them in code, and reports the results. The experiments show that the approach effectively separates and recognizes both individual characters and touching characters and, after recognition and structural analysis, outputs the formula as text.
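As an illustration of the alternating vertical and horizontal projection step described in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes a binarized image stored as a NumPy array in which 1 marks ink and 0 marks background; the function names `_runs` and `xy_cut` are hypothetical. The region-based fallback and the touching-character splitting mentioned in the abstract are not covered here.

```python
import numpy as np

def _runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries in a 1-D profile."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i                      # a run of ink begins
        elif not v and start is not None:
            runs.append((start, i))        # the run ends at a blank gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def xy_cut(img, top=0, left=0, vertical=True, failed=False):
    """Split a binary image into blocks by alternating projection cuts.

    vertical=True projects onto the x-axis (cuts between columns);
    vertical=False projects onto the y-axis (cuts between rows).
    `failed` records that the other direction already found no gap.
    Returns (top, left, bottom, right) boxes in cut order.
    """
    profile = img.sum(axis=0) if vertical else img.sum(axis=1)
    runs = _runs(profile)
    boxes = []
    for s, e in runs:
        sub = img[:, s:e] if vertical else img[s:e, :]
        t, l = (top, left + s) if vertical else (top + s, left)
        if len(runs) == 1 and failed:
            # No gap in either direction: this block is a leaf.
            boxes.append((t, l, t + sub.shape[0], l + sub.shape[1]))
        else:
            # Try the other direction; remember whether this one failed to split.
            boxes += xy_cut(sub, t, l, not vertical, failed=(len(runs) == 1))
    return boxes
```

Called on a small fraction-like layout (numerator, bar, and denominator on the left, one symbol on the right), `xy_cut` first cuts between the two columns of blocks and then cuts the left column into three rows. The order in which the cuts occur is the kind of information that the block-encoding step in the abstract relies on.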
Keywords/Search Tags:formula recognition, adhesion character, segmentation coding, bifurcation tree