Research On Text Extraction In Scene Image

Posted on: 2011-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chang
GTID: 2178360305474536
Subject: Computer application technology
Abstract/Summary:
Text in images conveys important information about image or video content. If it can be extracted automatically, it is useful for information retrieval, digital libraries, Web search, intelligent transportation, and other fields. This thesis focuses on scene text, whose font size and color are unconstrained and which may be affected by illumination conditions and background textures. To address these problems, we worked with the ICARD image library, studied scene text extraction methods based on the color and edge features of scene text, and developed a simple scene text extraction system. The main contributions of this research are:
(1) Focusing on the horizontal and vertical edge features of text in scene images, we apply a color Sobel edge operator to the image. To obtain clear, closed edges, we then apply the Canny operator to the edge image as a second edge-extraction pass and superimpose the two results. Edge filling, connected-component analysis, heuristic rules, and morphological filtering then yield the text in the image.
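The edge-based steps in contribution (1) can be sketched roughly as follows. This is a minimal plain-Python illustration, not the thesis's VC++/OpenCV implementation: it covers only the color Sobel pass, thresholding, connected-component analysis, and a simple heuristic filter, and it omits the Canny pass, edge filling, and morphological filtering. The function names, the threshold `edge_thresh`, and the size and aspect-ratio limits are assumptions made for the example.

```python
from collections import deque


def color_sobel(img):
    """img: list of rows of (r, g, b) tuples. Returns a gradient-magnitude map
    (maximum over the three channels); border pixels are left at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            best = 0.0
            for c in range(3):
                win = [[img[y + dy][x + dx][c] for dx in (-1, 0, 1)]
                       for dy in (-1, 0, 1)]
                # Horizontal and vertical Sobel responses on this channel.
                gx = (win[0][2] + 2 * win[1][2] + win[2][2]) - \
                     (win[0][0] + 2 * win[1][0] + win[2][0])
                gy = (win[2][0] + 2 * win[2][1] + win[2][2]) - \
                     (win[0][0] + 2 * win[0][1] + win[0][2])
                best = max(best, (gx * gx + gy * gy) ** 0.5)
            mag[y][x] = best
    return mag


def connected_components(mask):
    """Return (x0, y0, x1, y1, pixel_count) for each 8-connected region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            x0 = x1 = sx
            y0 = y1 = sy
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1, count))
    return boxes


def text_candidates(img, edge_thresh=100.0, min_size=3, max_aspect=10.0):
    """Threshold the edge map and keep regions whose bounding box passes
    hypothetical size and aspect-ratio heuristics."""
    mag = color_sobel(img)
    mask = [[m >= edge_thresh for m in row] for row in mag]
    keep = []
    for x0, y0, x1, y1, _count in connected_components(mask):
        bw, bh = x1 - x0 + 1, y1 - y0 + 1
        if bw < min_size or bh < min_size:
            continue  # too small to be a character
        if max(bw, bh) / min(bw, bh) > max_aspect:
            continue  # too elongated, likely a line or border
        keep.append((x0, y0, x1, y1))
    return keep
```

Taking the maximum response over the three channels is one common way to build a color Sobel operator; the thesis may combine the channels differently.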
Experiments show that this edge-based text extraction method is not always effective: for images with more detail and more complex backgrounds, it often fails to extract the text.
(2) For color images with complex backgrounds, we use a hill-climbing algorithm in the CIE Lab space to determine the optimal number of clusters and the cluster centers, and then cluster with the K-means algorithm, which achieves good segmentation results.
(3) We propose a natural scene text extraction method based on K-means clustering combined with edge detection. The improved K-means clustering algorithm segments the text regions; the segmented result is decomposed into binary sub-images, whose connected regions are labeled and analyzed to obtain candidate character areas; the binary edge image is then used to filter the candidate text areas, yielding the extracted text characters. The results show that this method can extract text characters from images with complex backgrounds, lighting effects, and textures.
(4) Using the VC++ programming environment and OpenCV, we developed a simple scene text extraction system with a modular functional structure, a friendly interface, and stable operating performance. Through its image input, image processing, and image output modules, it applies the edge-based text extraction algorithm and the clustering-combined-with-edge text extraction algorithm to extract the text in images.
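The clustering step in contribution (2) can be illustrated with a small sketch. It assumes the pixels have already been converted to CIE Lab triples (the conversion is omitted), builds a coarse 3-D color histogram, climbs from every non-empty bin to a local maximum, and uses the resulting peaks as both the number of clusters and the initial K-means centers. The names `hill_climb_peaks` and `kmeans`, the bin size, and the iteration limit are assumptions for the example; the thesis's improved K-means algorithm is not reproduced here.

```python
from collections import Counter
from itertools import product


def hill_climb_peaks(pixels, bin_size=16.0):
    """Return seed colours at the local maxima of a coarse 3-D histogram."""
    hist = Counter(tuple(int(v // bin_size) for v in p) for p in pixels)
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    peaks = set()
    for start in hist:
        cur = start
        while True:
            neighbours = [tuple(c + o for c, o in zip(cur, d)) for d in offsets]
            best = max(neighbours, key=lambda b: hist.get(b, 0))
            if hist.get(best, 0) <= hist[cur]:
                break  # no strictly higher neighbour: cur is a peak
            cur = best
        peaks.add(cur)
    # Use the centre of each peak bin as the initial cluster centre.
    return [tuple((b + 0.5) * bin_size for b in peak) for peak in sorted(peaks)]


def kmeans(pixels, centers, max_iter=20):
    """Plain Lloyd iterations; labels come from the last assignment step."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in pixels
        ]
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's centre
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels
```

A possible usage, with `lab_pixels` standing for a hypothetical list of Lab triples: call `seeds = hill_climb_peaks(lab_pixels)` and then `centers, labels = kmeans(lab_pixels, seeds)`. Adjacent bins with equal counts each count as a peak in this sketch, so a practical version would likely merge nearby peaks before clustering.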
Keywords/Search Tags: Scene text, edge features, CIE Lab, hill-climbing algorithm, K-means clustering, complex background