Research On Script Identification Of Printed Document Images

Posted on: 2008-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: X C Lu
GTID: 2178360242972334
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of network communication and multimedia technology, document images are now widely used in many fields. How to extract document images effectively from massive distributed information systems and convert them into electronic documents, that is, the automatic analysis of document images, has become a pressing issue. As a major sub-discipline of automatic document image analysis, automatic script identification has become an active research topic. This thesis studies script identification for printed document images. The major work is as follows:

1. A dynamic-threshold text line projection algorithm is proposed. Compared with the standard text line projection algorithm, it reduces the influence of font stroke width and raises the recognition rate for Chinese script and alphabetic scripts (English and Russian).

2. Texture features are extracted from document images using multi-channel Gabor filters, and script identification is performed with a classifier built on a support vector machine (SVM). The method is robust to document images affected by noise, skew, and print defects.

3. Building on script identification based on the wavelet transform, an improved algorithm is constructed that uses wavelet logarithmic energy features. Experimental results show satisfactory recognition rates for Chinese, Japanese, Korean, English, Russian, and Arabic.

4. A script identification algorithm based on a fractal model is proposed. The algorithm treats document images as multi-fractal sets, constructs a multi-fractal model to compute the multi-fractal dimensions, and uses those dimensions for script identification. It remains effective when document images differ in script size and text line spacing.
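The abstract does not define the dynamic threshold used in the text line projection algorithm, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes a binary image stored as a list of rows (1 = ink, 0 = background) and assumes the threshold is a fraction of the mean ink count over non-empty rows, so that it scales with stroke width instead of being a fixed pixel count. The names `row_projection`, `segment_lines`, and `ratio` are hypothetical.

```python
from typing import List, Tuple


def row_projection(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(profile: List[int], ratio: float = 0.5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) row ranges that look like text lines.

    Assumption for illustration: the threshold is `ratio` times the mean
    ink count of the non-empty rows, so thicker strokes raise it.
    """
    inked = [count for count in profile if count > 0]
    if not inked:
        return []
    threshold = ratio * sum(inked) / len(inked)

    lines: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y                      # a text line begins here
        elif count <= threshold and start is not None:
            lines.append((start, y - 1))   # the line ended on the previous row
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],   # a thin stroke tail, below the threshold
        [0, 0, 0, 0],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
    ]
    print(segment_lines(row_projection(page)))
```

With a fixed threshold of zero, the thin tail in the fourth row would be attached to the first text line; the relative threshold drops it, which is one plausible reading of how a dynamic threshold reduces sensitivity to stroke width.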
Keywords/Search Tags: document images, script identification, projection, Gabor transform, wavelet transform, fractal dimensions