
Research On Text Recognition In Natural Scene And Realization

Posted on: 2011-07-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Wu
GTID: 1118360332957944
Subject: Artificial Intelligence and information processing
Abstract/Summary:
Text in an image is a vital clue to the image's content. Natural scenes contain rich text carrying important information, which can be very valuable for understanding the scene. An automatic tool that recognizes text in natural scene images is therefore of great value for image retrieval, image analysis, and scene understanding.

Although traditional document analysis technology has made great achievements, it cannot deal effectively with text in natural scenes. Text images from natural scenes differ markedly from traditional documents: color, brightness, and contrast are inconsistent; the background in which the text is embedded is complex and changeable; the text image may be deformed, incomplete, blurred, or broken; and strong noise is present. These differences pose many challenges for text recognition in natural scenes.

This dissertation studies techniques for text recognition in natural scenes, with emphasis on image distortion correction, image segmentation against complex backgrounds, and low-quality character recognition. Details are as follows:

1. Text images of natural scenes often exhibit strong perspective distortion, which seriously degrades overall recognition performance. This dissertation studies the perspective distortion of such images and presents a distortion rectification method based on the vanishing point, designed for scene images that contain only a few text lines, lack paragraph information, or have incomplete document edges. The method consists of two procedures:

1) Detection of the vanishing point: The vanishing point is the intersection of the text baselines, which can be found by detecting the corresponding line in parameter space.
First, endpoints are extracted from the text characters using mathematical morphology operators, and these tip points are then classified according to the line they lie on using the nearest-neighbor method. Second, each text baseline is obtained from the classified tip points by the least-squares method. Finally, the RANSAC estimation method is used to select the optimal set of baselines, from which the line corresponding to the vanishing point is fitted.

2) Distortion recovery: The deformation parameters of the image, which are contained in the homography matrix, can be calculated from the position of the vanishing point, and the front-view image is then obtained by applying the homography matrix. Because the method derives the deformation parameters from the text characters themselves, without relying on the edges of the text or on paragraph formatting, it can handle scene text images. The experimental results show that the recognition rate on deformed scene images improves significantly after correction with this method.

2. Character segmentation is a key step for recognition. Text images in natural scenes often have complex backgrounds, which makes it difficult to separate the characters from the background. A character segmentation method based on spectral clustering is presented. The main difference between the usual spectral method and our method is that the similarity matrix is constructed by quantizing the color space, which greatly reduces the time complexity of solving the eigenvalue system. The detailed steps are as follows:

1) Constructing the similarity matrix: First, the original image is transformed into HSV space and quantized. Second, the similarity function is defined based on the color, texture, and distance of pixels. Finally, the similarity matrix is constructed with the quantized color bins as the elements of the matrix.
2) Solving the eigen-system: A standard eigen-system is established using the Laplacian matrix corresponding to the similarity matrix, and it is solved to obtain the smallest eigenvalue and its eigenvector.

3) Image segmentation: First, the eigenvector corresponding to the smallest eigenvalue is divided into two classes, and an indicator vector is built according to the division. Second, the similarity matrix is divided into two parts according to the indicator vector. Finally, the original image is segmented according to the matrix. A large number of scene text images were tested, and the experimental results show that the method is superior to other methods in the literature.

3. The characteristics of natural scene images mean that scene text is of low quality. Existing methods cannot handle issues such as deformation, strong noise, and low-resolution characters. This dissertation proposes a Chinese character recognition method based on an improved Gabor wavelet transform. The method makes use of the frequency selectivity of the Gabor function and constructs a suitable wavelet transform to extract features of Chinese characters. An improved Gabor wavelet is then proposed based on scale overlap and direction pre-classification. Scale overlap strengthens the selectivity of the Gabor filters for stroke width, and direction pre-classification makes the direction selectivity more accurate. The wavelet parameters are optimized with full consideration of the multi-peak distribution of character stroke width and direction, and a highly robust feature of Chinese characters is thus obtained. Tests on HCL2000 and a low-resolution character library show that the method performs well and can deal effectively with low-quality Chinese characters.

Finally, a system for text recognition in natural scenes is constructed from the methods presented in this dissertation. Experimental results show the validity of the system.
Because the dissertation places few restrictions on the input image, it can be regarded as a useful exploration toward a practical application system. In particular, the spectral clustering method based on color space quantization may offer some ideas for image segmentation, and the improved Gabor wavelet feature supplements existing transform-coefficient features of characters.
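
As an illustration of the segmentation idea in item 2, the following is a minimal sketch and not the dissertation's implementation. It builds the similarity matrix over occupied color bins instead of over pixels, forms the Laplacian, and splits the bins in two by the sign of an eigenvector. Several choices are assumptions made for the example: the function names `quantize_hsv` and `segment_two_way`, the 8x4x4 bin layout, the Gaussian similarity with width `sigma`, and the use of color alone (the abstract also mentions texture and pixel distance). The sketch takes the eigenvector of the second-smallest eigenvalue, because the smallest eigenvalue of an unnormalized Laplacian is always 0 with a constant eigenvector; the abstract does not say how the dissertation handles this.

```python
import numpy as np

def quantize_hsv(hsv, bins=(8, 4, 4)):
    """Map HSV pixels (each channel in [0, 1]) to one integer color-bin index."""
    idx = [np.minimum((hsv[..., k] * bins[k]).astype(int), bins[k] - 1)
           for k in range(3)]
    return (idx[0] * bins[1] + idx[1]) * bins[2] + idx[2]

def segment_two_way(hsv, bins=(8, 4, 4), sigma=0.5):
    """Return a boolean pixel mask from a spectral bipartition of color bins."""
    flat_bins = quantize_hsv(hsv, bins).ravel()
    used, inverse = np.unique(flat_bins, return_inverse=True)
    n = len(used)
    if n < 2:
        # Only one color bin is occupied, so there is nothing to split.
        return np.zeros(hsv.shape[:2], dtype=bool)

    # HSV-cone coordinates (assumption), so that hue wraps around correctly.
    h, s, v = (hsv[..., k].ravel() for k in range(3))
    coords = np.stack([s * np.cos(2 * np.pi * h),
                       s * np.sin(2 * np.pi * h), v], axis=1)

    # Mean color of every occupied bin.
    counts = np.bincount(inverse, minlength=n)
    means = np.stack([np.bincount(inverse, weights=coords[:, k], minlength=n)
                      for k in range(3)], axis=1) / counts[:, None]

    # Similarity matrix is n x n (bins), not pixels x pixels.
    d2 = ((means[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian and its eigen-decomposition (ascending order).
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)

    # Split bins by the sign of the second eigenvector, then map back to pixels.
    indicator = eigvecs[:, 1] > 0
    return indicator[inverse].reshape(hsv.shape[:2])
```

Because the eigen-problem is sized by the number of occupied bins (at most 128 with the layout assumed here) and not by the number of pixels, its cost does not grow with image resolution, which is the saving the abstract attributes to color quantization.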
Keywords/Search Tags:Natural scene, Text recognition, Text image rectification, Text image segmentation, Gabor wavelet transform
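
The vanishing-point detection described in item 1 of the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's method: tip points are taken as already grouped per text line, each baseline is fitted with ordinary least squares (so baselines are assumed not to be vertical), and the consensus step works directly on line intersections in the image plane, whereas the abstract fits a line in parameter space. The function names, the distance tolerance `tol`, and the iteration count are hypothetical.

```python
import numpy as np

def fit_baseline(points):
    """Least-squares baseline y = m*x + b, returned as (a, b, c) with
    a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    pts = np.asarray(points, dtype=float)
    m, b = np.polyfit(pts[:, 0], pts[:, 1], 1)
    return np.array([m, -1.0, b]) / np.hypot(m, 1.0)

def intersect_least_squares(lines):
    """Point minimizing the summed squared distance to all given lines."""
    lines = np.asarray(lines)
    point, *_ = np.linalg.lstsq(lines[:, :2], -lines[:, 2], rcond=None)
    return point

def vanishing_point_ransac(lines, tol=2.0, iterations=200, seed=0):
    """Keep the largest set of baselines that meet near one point,
    then refit the point on that set."""
    lines = np.asarray(lines)
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(iterations):
        i, j = rng.choice(len(lines), size=2, replace=False)
        p = np.cross(lines[i], lines[j])  # homogeneous intersection point
        if abs(p[2]) < 1e-12:
            continue  # the two lines are parallel in the image
        candidate = p[:2] / p[2]
        # Perpendicular distance from the candidate to every baseline.
        dist = np.abs(lines[:, :2] @ candidate + lines[:, 2])
        inliers = dist < tol
        if best is None or inliers.sum() > best.sum():
            best = inliers
    if best is None:
        return None, None
    return intersect_least_squares(lines[best]), best
```

The returned point would then feed the distortion-recovery step; computing the homography from it is not covered by this sketch.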