Document Image Segmentation Methods Based on Block Extraction and Binarization

Posted on: 2012-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Qu
Full Text: PDF
GTID: 2178330335451021
Subject: Software engineering
Abstract/Summary:
With the rapid development of computer and communication technology, modern society has gradually entered the information age. Traditional carriers for recording and storing information (such as paper) can no longer meet the needs of daily life and work, in which people produce large numbers of documents and large amounts of information as they communicate. Converting these documents into electronic form for communication and storage has become imperative. OCR systems are now the main tool for entering this information electronically, but for some of the more complex document images the information is difficult to recognize accurately and directly, so the document needs a series of processing steps before it is entered. Image segmentation theory, an important part of digital image processing, has therefore become an active research focus. Document image segmentation is an important research topic within image processing; it is the link between document image pre-processing and high-level character recognition. As early as the 1980s, much of the literature proposed different approaches to page segmentation depending on the complexity of the document. Unlike in other areas, the underlying theory differs from algorithm to algorithm, so the methods used also differ widely. Most methods target one particular class of documents, or a few kinds of pages with obvious features, and it is difficult to find a segmentation method that works well for all types of document images. The relatively effective and commonly used methods for document image segmentation and classification include thresholding, geometric analysis, and other categories. This thesis combines block extraction with binarization for document image segmentation.
First, blocks with an independent background intensity are extracted from the complex document page, and the gray values within each block are used to determine that block's own background gray value. The remaining part of the page, which was not extracted, is binarized with a classical binarization method, the Otsu global threshold. Because the method combines several segmentation techniques, it greatly improves segmentation accuracy while preserving speed, and the resulting binarized document images are very satisfactory. To test the method, 300 suitable document images were selected and binarized, and the results were compared with those of other methods. The results show that the proposed method achieves the highest accuracy among the compared methods and runs very fast. Its advantages are especially clear on document page images with complex structure.
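The Otsu global threshold step mentioned above can be illustrated with a minimal sketch. It covers only that step and is not the thesis's implementation; the function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes 8-bit gray values supplied as a flat list of integers, with pixels above the threshold mapped to white.

```python
def otsu_threshold(pixels):
    """Return the gray level t in 0..255 that maximizes between-class variance.

    Assumption for this sketch: pixels <= t form one class, pixels > t the other.
    """
    # Build a 256-bin histogram of gray values.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class is still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map each pixel to 255 if it is above the threshold t, else to 0."""
    return [255 if p > t else 0 for p in pixels]


if __name__ == "__main__":
    # Toy data: 50 dark pixels and 50 bright pixels.
    sample = [10] * 50 + [200] * 50
    t = otsu_threshold(sample)
    print(t, binarize(sample, t).count(255))
```

A single global threshold like this suits regions with a uniform background, which is consistent with the abstract applying it only to the part of the page left over after blocks with their own background intensity have been extracted.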
Keywords/Search Tags:Segmentation, binarization, block extraction, Otsu global threshold method