
Research On Algorithms Of Document Image Processing And Form Image Identification

Posted on: 2011-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2178330332978670
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of communication and information processing technology, more and more paper documents are converted into document images and transmitted via the internet, satellite, and fax. Document images have therefore become an important source of information. However, most existing document image processing systems handle only specific types of documents and cannot meet the requirements of generality and real-time operation, so the study of document image analysis and processing is of great value. This thesis proposes algorithms for document image recognition, document image analysis, and form document image recognition. The main work is as follows:
1. Document images are compared with ordinary continuous-tone images in terms of their data distribution features, both in the spatial domain and in the Radon transform domain. Based on this comparison, a document image classification and retrieval algorithm is constructed that uses the picture information measure and the Radon transform. Experimental results show that the algorithm greatly reduces the misclassification rate.
2. An improved skew detection algorithm for document images is given, based on a multi-resolution Hough transform that applies a different Hough transform accuracy to each image resolution. Experimental results show that the algorithm detects the skew of document images effectively and accurately.
3. For layout analysis, an algorithm based on connected component analysis is proposed to segment images effectively. Spatial domain features and an SVM are then used to classify the resulting regions as text, form, graphics, or image. Experiments confirm the accuracy of the proposed algorithm.
4. The thesis analyzes the classification of document images in depth and proposes a form document image recognition algorithm based on the line projection and point features of forms. Document image segmentation and form recognition are combined: first, the document image is segmented into regions; then, each region is classified as a form region or a non-form region by extracting its line projection and point features. Experiments demonstrate that the algorithm recognizes form document images accurately and effectively.
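As a rough illustration of the line projection and point features mentioned in item 4, the following Python sketch labels a binary region as a form when its row and column projections contain several near-full ruling lines that actually cross. It is not the thesis's algorithm, whose details the abstract does not give: the function names (run_centers, form_features, is_form_region) and the thresholds (min_fill, min_lines, min_points) are hypothetical choices made only for this example, and the test regions are synthetic.

```python
import numpy as np


def run_centers(mask):
    """Centre index of every run of consecutive True values in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]  # ends are exclusive
    return (starts + ends - 1) // 2


def form_features(region, min_fill=0.8):
    """Return (horizontal lines, vertical lines, crossing points) of a region.

    region is a 2-D array in which nonzero means ink. min_fill is an assumed
    threshold: a row or column counts as a ruling line when at least that
    fraction of its pixels is ink.
    """
    region = np.asarray(region, dtype=bool)
    height, width = region.shape
    # Line projection: fraction of ink in every row and every column.
    row_fill = region.sum(axis=1) / width
    col_fill = region.sum(axis=0) / height
    rows = run_centers(row_fill >= min_fill)
    cols = run_centers(col_fill >= min_fill)
    # Point feature: ink pixels where a horizontal and a vertical line meet.
    crossings = int(region[np.ix_(rows, cols)].sum())
    return len(rows), len(cols), crossings


def is_form_region(region, min_fill=0.8, min_lines=2, min_points=4):
    """Assumed decision rule: enough ruling lines in both directions that cross."""
    n_h, n_v, n_pts = form_features(region, min_fill)
    return n_h >= min_lines and n_v >= min_lines and n_pts >= min_points


# Synthetic form region: 4 horizontal and 3 vertical ruling lines.
form = np.zeros((60, 80), dtype=np.uint8)
form[[0, 20, 40, 59], :] = 1
form[:, [0, 40, 79]] = 1

# Synthetic text-like region: dotted strokes that never fill a whole row.
text = np.zeros((60, 80), dtype=np.uint8)
text[10:14, 5:75:2] = 1
text[30:34, 5:75:2] = 1

print(form_features(form), is_form_region(form))
print(form_features(text), is_form_region(text))
```

A real system would need to tolerate broken or skewed ruling lines, for example by correcting skew first and by using a fill threshold tuned on sample data; the fixed values above are placeholders.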
Keywords/Search Tags: document images, Radon Transform, mathematical morphology, skew detection, Hough Transform, layout analysis, form image recognition