
Orientation And Analysis Of Mathematical Expressions In Document Images

Posted on: 2008-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: B Chen
Full Text: PDF
GTID: 2178360218450478
Subject: Communication and Information System
Abstract/Summary:
Because electronic documents are convenient to revise, retrieve and transmit, real-time conversion of traditional paper documents into electronic versions on mobile office terminals is becoming increasingly common. This conversion requires two processes: document image segmentation and character recognition. Document images usually contain many kinds of elements, such as words, images, graphics, forms and mathematical expressions. The main difficulties in electronic conversion and document reuse lie in analyzing, recognizing and rebuilding the mathematical expressions, so efficient algorithms are needed for this task. The contributions of this thesis are as follows.

Because foreground pixels in different regions of a document image are self-correlated, this thesis proposes a page segmentation algorithm based on microstructures. First, the foreground pixels of the document image are classified into different microstructure sets with a fast scanning algorithm, and elements such as halftones and forms are classified using the correlations between the microstructures. Then, the remaining character structures are merged according to modified rules. The skew angle of the document image is estimated from the largest merged character region with the least squares algorithm, and the image is then corrected by that angle. Finally, the de-skewed document image is segmented into text lines by means of the horizontal projection profile in combination with the microstructures.

Ordinary text lines differ greatly from lines containing mathematical expressions because of the two-dimensional structure of the expressions. In this thesis, independent expression lines are separated from ordinary text lines according to these differences.
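The skew estimation and text-line segmentation steps can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say which points are fitted or how the microstructures refine the profile. The sketch assumes the skew is the slope of an ordinary least-squares line through hypothetical reference points (for example, bottom points of components in the largest merged character region), and that a text line is any run of rows with at least one foreground pixel. The function names `estimate_skew_degrees` and `split_text_lines` are illustrative only.

```python
import math


def estimate_skew_degrees(points):
    """Fit y = a*x + b by ordinary least squares and return atan(a) in degrees.

    `points` is a list of (x, y) pixel coordinates. Which points to use is an
    assumption here; the abstract only says the largest merged region is used.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points must span more than one x position")
    return math.degrees(math.atan(sxy / sxx))


def split_text_lines(image):
    """Segment a binary image (rows of 0/1) into text lines.

    The horizontal projection profile counts foreground pixels per row; each
    maximal run of non-empty rows is reported as (first_row, last_row).
    """
    profile = [sum(row) for row in image]
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # a text line begins
        elif count == 0 and start is not None:
            lines.append((start, y - 1))   # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines
```

In image coordinates the y axis points downward, so the sign of the returned angle depends on that convention. A real page would also need a noise threshold on the profile rather than a strict zero test.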
Then the inline expressions contained in the classified text lines are located according to the relationships between the connected components and the upper and lower baselines, after which they are segmented using the maximum projection gap method. Lastly, the mathematical expressions are analyzed with the microstructure and projection methods. Experimental results show that the proposed algorithm is effective, stable and flexible. Its most prominent advantage is that the document elements are classified and decomposed one by one through structural classification, which can improve recognition performance; OCR on the decomposed elements therefore benefits as well.
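One plausible reading of the maximum projection gap step is sketched below. The abstract does not define the method, so this assumes it means: take the vertical projection profile of a text line, find the widest run of empty columns lying between foreground columns, and treat that run as the boundary between an inline expression and the surrounding text. The function name `widest_gap_split` is hypothetical.

```python
def widest_gap_split(line_image):
    """Return (gap_start, gap_end) of the widest interior run of empty columns.

    `line_image` is a binary text-line image given as rows of 0/1 values.
    Empty margins to the left and right of the ink are ignored. Returns None
    when the line has no foreground or no interior gap. Ties keep the
    leftmost gap.
    """
    width = len(line_image[0])
    # Vertical projection profile: foreground pixel count per column.
    profile = [sum(row[x] for row in line_image) for x in range(width)]
    filled = [x for x, count in enumerate(profile) if count > 0]
    if not filled:
        return None

    best, run_start = None, None
    # Scan only between the first and last foreground columns.
    for x in range(filled[0], filled[-1] + 1):
        if profile[x] == 0:
            if run_start is None:
                run_start = x              # a gap begins
        elif run_start is not None:
            run = (run_start, x - 1)       # a gap ends at the previous column
            if best is None or run[1] - run[0] > best[1] - best[0]:
                best = run
            run_start = None
    return best
```

Splitting at the midpoint of the returned gap would give the two segments. Repeating the search on each segment would handle lines with several expressions, although whether the thesis proceeds this way is not stated.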
Keywords/Search Tags: Document elements, Microstructure, Image segmentation, Baseline, Expression location, Connected components