
Studies In Page Segmentation And Classification Technologies For Document Images

Posted on: 2005-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Li
GTID: 2168360125966302
Subject: Communication and Information System
Abstract/Summary:
Page segmentation is one of the most important research subjects in page layout analysis and plays a key role in the retrieval and storage of page documents. Its main purpose is to divide a digital document image into text regions and non-text regions, so that the text regions can be recognized by an OCR (Optical Character Recognition) system and converted into electronic form. Page segmentation is also useful for optical character recognition, image compression, and image storage systems. Many studies have been carried out on this subject and considerable progress has been made. However, owing to the complexity of document pages, each algorithm is valid only for the kind of page layout it was designed for.
Two segmentation algorithms, motivated by those presented in the literature, are proposed in this thesis.
The first proposed algorithm is based on the Gaussian mixture model, which is used to describe the distribution of different texture regions. Based on this model, each pixel is classified by maximizing the likelihood. This approach not only trains its parameters faster but also handles page images that contain both text and halftone.
The second proposed algorithm is based on pattern-list analysis and is an improved version of the one that appeared in Optical Engineering, Vol. 39(3), pp. 724-734, March 2000. This algorithm transforms a binary page image into a pattern-list and then classifies each pattern by its features. The improvements are as follows. First, in the pattern classification step, only one feature (MAXBRL) is used, and a similarly near-perfect classification is achieved. Second, in the context classification step, only the few patterns that intersect large halftone patterns are classified again. These improvements make the whole algorithm faster and easier to implement, while it still exhibits excellent performance on complex document images.
In addition, a method for extracting text from forms is proposed in the thesis and shown to be very effective.
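The maximum-likelihood classification step of the first algorithm can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say which texture features are used, how many mixture components each region type has, or how the parameters are trained. The two-dimensional features, the class names `text` and `halftone`, the diagonal covariances, and all numeric parameters below are assumptions chosen only for illustration.

```python
import numpy as np


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of each row of x under a diagonal-covariance GMM.

    x: (n, d) features; weights: (k,); means and variances: (k, d).
    """
    x = np.asarray(x, dtype=float)[:, None, :]                  # (n, 1, d)
    means = np.asarray(means, dtype=float)[None, :, :]          # (1, k, d)
    variances = np.asarray(variances, dtype=float)[None, :, :]  # (1, k, d)
    # Log density of each Gaussian component, summed over dimensions.
    log_comp = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances,
        axis=2,
    )                                                           # (n, k)
    log_weighted = log_comp + np.log(np.asarray(weights, dtype=float))[None, :]
    # Log-sum-exp over components, shifted by the maximum for stability.
    m = log_weighted.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_weighted - m).sum(axis=1, keepdims=True)))[:, 0]


def classify_ml(features, class_models):
    """Label each feature vector with the class whose GMM likelihood is highest."""
    names = list(class_models)
    scores = np.stack(
        [gmm_log_likelihood(features, *class_models[name]) for name in names],
        axis=1,
    )
    return [names[i] for i in scores.argmax(axis=1)]


# Hypothetical, hand-set models: (weights, means, variances) per region type.
models = {
    "text": ([0.5, 0.5], [[0.2, 0.8], [0.3, 0.7]], [[0.01, 0.01], [0.01, 0.01]]),
    "halftone": ([1.0], [[0.6, 0.2]], [[0.02, 0.02]]),
}

# Two hypothetical per-pixel feature vectors.
features = np.array([[0.25, 0.75], [0.6, 0.2]])
labels = classify_ml(features, models)
print(labels)
```

In practice the mixture parameters would be estimated from labeled training regions rather than set by hand, and the per-pixel labels would be reshaped back to the image grid to form text and non-text regions.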
Keywords/Search Tags: Page Segmentation and Classification, Maximum Likelihood, Gaussian Mixture Model, Pattern-list Analysis