Research On Text Extraction In Digital Video

Posted on: 2012-10-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Wang
Full Text: PDF
GTID: 1118330338465673
Subject: Physical oceanography
Abstract/Summary:
The rapid growth of video resources has created an urgent demand for efficient video information classification and retrieval systems that help users find videos or video clips of interest in huge amounts of unstructured video data. Among the relevant techniques, text extraction has become an important research topic because the text in frames is closely related to the video content. In addition, many mobile devices are now equipped with high-performance cameras, so images and videos containing text can be captured easily when needed. If this text can be detected automatically, many practical applications (e.g. translation, assistive services for blind people, machine vision and intelligent traffic systems) can be offered to users. However, the text embedded in video frames varies in size, style, direction and arrangement, and often has low contrast and a complex background, which makes text extraction very difficult.

This dissertation focuses on the key problems of video text segmentation, including text localization in a single video frame, multi-frame video text tracking, video text enhancement, and applications of video text segmentation (news video story segmentation and text detection for road sign systems). The main contributions are as follows:

An edge detection approach combining gray-scale mathematical morphology with the wavelet transform is proposed for coarse filtering. It fuses the edge information obtained by the two methods, combining their advantages so that noise is suppressed effectively while continuous, clear edges are preserved. Next, a density-based region growing method joins these edge pixels into text regions. Finally, an algorithm based on binary particle swarm optimization is presented that simultaneously optimizes feature selection and the parameters of the SVM used to identify true text among the candidates. Experimental results show that this approach detects text lines quickly and robustly under various conditions.

A video text tracking and text extraction method for complex backgrounds is proposed. Based on corner detection using a curvature function, a point matching method is introduced to track text objects, in which a modified Hausdorff distance is used to find and register the corresponding text block across video frames. The algorithm avoids detecting text in every video frame, which greatly improves system efficiency. Next, a multi-frame foreground/background recognition algorithm is proposed to extract text strokes for optical character recognition (OCR). The efficiency and robustness of the point matching method for video text tracking and of the text extraction algorithm are demonstrated by thorough experiments on TV serials and movies.

A novel automatic news story segmentation scheme based on video, audio and text information is proposed. First, the shot boundaries of the news video are detected; then the topic-caption frames are identified to obtain segmentation cues, using the text detection and tracking algorithm from the previous chapter. Next, using a Gaussian mixture model and KL divergence, every video shot is classified as an announcer or non-announcer shot by voice recognition. Finally, news story units are segmented using knowledge of the typical structure of news programs.

A fast and robust approach for extracting text on road signs based on color and stroke is proposed. First, a novel color model derived from the Karhunen-Loeve (KL) transform is applied to find all possible road sign candidates. Then, an affine transformation is applied to rectify each road sign so that it appears perpendicular to the camera's optical axis, which improves the accuracy of detecting the text embedded in road signs. Finally, mathematical morphology and region growing algorithms are used to obtain a cleaner binary image, which is sent to OCR software. Experimental results demonstrate the robustness and efficiency of the proposed algorithm.
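
The tracking step in the abstract registers a text block across frames by matching corner points with a modified Hausdorff distance. The abstract does not say which modification is used, so the sketch below assumes one common variant, the mean of nearest-neighbour distances taken in both directions. The function names, the integer shift search and the sample corner coordinates are all hypothetical and serve only to illustrate the idea; this is not the dissertation's implementation.

```python
import math


def directed_mean_distance(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def modified_hausdorff(a, b):
    """Assumed variant: the larger of the two directed mean distances."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_mean_distance(a, b), directed_mean_distance(b, a))


def best_shift(ref_corners, frame_corners, search=4):
    """Hypothetical helper: try every integer shift (dx, dy) within
    +/- search pixels and return (distance, dx, dy) for the best one."""
    best = None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            shifted = [(x + dx, y + dy) for x, y in ref_corners]
            d = modified_hausdorff(shifted, frame_corners)
            if best is None or d < best[0]:
                best = (d, dx, dy)
    return best


# Illustrative corner points of a text block in one frame (made-up values)
corners = [(10, 20), (40, 20), (10, 32), (40, 32)]
# The same block moved by (+3, -2) pixels in a later frame
moved = [(x + 3, y - 2) for x, y in corners]

print(best_shift(corners, moved))
```

Because the distance averages nearest-neighbour distances instead of taking their maximum, a single spurious corner shifts the score only slightly, which is one reason such variants are often preferred for matching noisy point sets.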
Keywords/Search Tags: Text localization, Video text tracking, Corner feature matching, Story segmentation, Vehicle navigation