
Video Text Extraction Technology Research And Application

Posted on: 2011-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: F X Li
GTID: 2208330332977351
Subject: Software engineering
Abstract/Summary:
Text in videos is a powerful source of high-level semantics. If it could be detected, segmented, and recognized automatically, it would be valuable for indexing and retrieving the explosively growing volume of digital video. Traditional character extraction methods were developed specifically for scanned images and cannot effectively extract text from videos, especially text on complex backgrounds. New methods are therefore needed.

Extracting text from videos poses five challenges: (1) how to localize text that may lie on a complex background; (2) how to binarize different kinds of text images; (3) how to segment characters on complex backgrounds; (4) how to segment merged characters; (5) how to recognize degraded characters. To address these problems, this dissertation makes the following contributions:

1. A novel method for binarizing different kinds of text images is proposed. The method is based on fusing several binary images. First, the locally adaptive seed-fill method, the locally adaptive thresholding method, and the stroke-model-based method are each used to produce a binary image. The final binary image is then obtained by fusing these three binary images. Compared with other methods, the proposed method greatly improves character recognition accuracy.

2. For character segmentation, the characteristics of the text image are analyzed and a novel heuristic method based on character recognition is proposed. The proposed method can segment merged characters and characters on complex backgrounds, and it can also remove "noise" components from the segments, which overcomes the drawbacks of the conventional heuristic method.

3. To recognize degraded characters precisely, a novel character recognition method based on a fused image is proposed. Compared with the binary image or the gray image alone, the fused image preserves the useful information in the character strokes while removing the noisy information from complex backgrounds. The proposed method first fuses the binary image and the gray image of the character. The character recognition engine then produces several candidates from the fused image. Finally, a post-processing approach based on a statistical language model selects the best character sequence for the text image. The proposed method greatly improves character recognition accuracy.

4. A novel method for localizing text in images and videos is proposed. For images, text regions are first detected based on the character stroke model. The regions are then decomposed vertically and horizontally, using edge maps of the image, to obtain candidate text boxes. Finally, a text box verification step based on character recognition is applied to reduce false alarms. For videos, the image-based text localization is applied only to every nth frame; the localized text boxes are then tracked backward and forward in time through all frames that contain the respective text box.

The methods proposed in this dissertation solve some of the problems of text extraction in videos.
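The abstract does not say how the three binary images in the first contribution are fused. As a purely illustrative sketch, one simple possibility is a per-pixel majority vote: a pixel is kept as text only if more than half of the binarization results mark it as text. The function name fuse_binary_images, the list-of-lists image representation, and the majority-vote rule itself are all assumptions made for this example, not the dissertation's actual fusion rule.

```python
from typing import List

# Assumed representation: rows of 0/1 pixels, where 1 marks a text-stroke pixel.
BinaryImage = List[List[int]]


def fuse_binary_images(images: List[BinaryImage]) -> BinaryImage:
    """Fuse equally sized binary images by per-pixel majority vote.

    Hypothetical fusion rule: the source only states that three binary
    images are fused, not how.
    """
    if not images or not images[0]:
        raise ValueError("at least one non-empty binary image is required")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all binary images must have the same shape")

    half = len(images) / 2  # a pixel needs strictly more than half the votes
    fused: BinaryImage = []
    for y in range(height):
        row = []
        for x in range(width):
            votes = sum(img[y][x] for img in images)
            row.append(1 if votes > half else 0)
        fused.append(row)
    return fused


# Toy 2x3 outputs standing in for the seed-fill, adaptive-thresholding,
# and stroke-model binarizations.
seed_fill = [[1, 1, 0], [0, 1, 0]]
adaptive = [[1, 0, 0], [0, 1, 1]]
stroke = [[1, 1, 0], [0, 0, 1]]
fused = fuse_binary_images([seed_fill, adaptive, stroke])
print(fused)
```

With three inputs, a majority vote suppresses any pixel that only one method marks as text, which is one way a fusion step could reduce the isolated errors of an individual binarization method.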
Keywords/Search Tags: Text localization, text image binarization, character segmentation, character recognition, integrated segmentation and recognition