
The Research And Implementation Of Content-Based News Video Retrieval System

Posted on: 2010-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Liang
Full Text: PDF
GTID: 2178360272997034
Subject: Computer application technology
Abstract/Summary:
In recent years, computer processing power has grown and the forms of data on the internet have diversified. Information is now delivered not only as plain text but also as multimedia containing images, Flash and video, and video data has become an important resource on the internet. CBVR (Content-Based Video Retrieval) has been developing since the 1990s. Unlike text-based retrieval, it retrieves the object itself rather than a description of the object: it extracts clues from the media data based on the visual and spatio-temporal features of the video, and uses these clues to retrieve similar media data from the database.
Research abroad in this field began earlier, so some experimental prototype video retrieval systems have already been developed. Domestic research started later and the technology lags behind. As a result there are few large-scale application systems, and the demand for video processing in fields such as video-on-demand, medical treatment and military affairs cannot be met. The situation has, however, attracted wide attention from domestic researchers.
This paper studies algorithms for shot detection, announcer recognition and caption recognition using the DirectShow tools, and makes a preliminary exploration of video structuring and semantic processing. The main work is as follows:
1. DirectShow-based shot detection filter
DirectShow is part of the Microsoft DirectX SDK; it provides complete solutions for media file playback, audio capture and high-performance multimedia applications on the Windows platform.
This paper develops a filter with a dedicated function. The filter has a video input pin, a video output pin and a text output pin, and encapsulates the shot detection and announcer recognition algorithms. It can be used as a plug-and-play component in video retrieval systems and other related systems.
2. News video shot detection and flashlight detection
There are many shot detection algorithms, for example pixel-based, histogram-based, block-based and edge-based algorithms, DCT-based algorithms for compressed video, fuzzy clustering algorithms and learning-based algorithms.
Recall and precision are important measures of detection performance: the higher they are, the more effective the algorithm.
Building on a study of numerous shot detection algorithms, this paper uses a sliding window algorithm to detect shot cuts and flashlights (camera flashes) in news video. The sliding window is a window of length 2R+1 that slides along the frame sequence, with the position under test at its centre. Local frame differences are used, under different test conditions, to decide whether a shot cut is present and what kind of cut it is. A flashlight is characterized by a short duration and a large brightness change, while the people in the scene change only slightly because the duration is so short.
Shot cut detection achieves a recall of 99.83%, which means that almost every cut is detected. Its precision is 97.40%; false detections are caused mainly by camera shake and by the sudden appearance or disappearance of captions. Flashlight detection achieves a recall of 70.11%, which has no practical effect on the final detection result, and a precision of 84.72%; the causes of false detections are the same.
3. News unit split
A news unit is a group of consecutive, semantically related shots that tells a complete story; it may contain one scene or several. A whole news programme is made up of news units.
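Returning briefly to the sliding-window detector of section 2, the idea can be sketched as follows. The abstract does not give the actual test conditions, so the rule used here is an assumption: a position is reported as a cut when its frame difference exceeds a floor (`min_diff`) and is at least `ratio` times the largest other difference in the 2R+1 window. The function names, parameters and default values are all hypothetical.

```python
from typing import List, Sequence


def frame_difference(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute difference between two equally sized grayscale frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_cuts(diffs: Sequence[float], radius: int = 2,
                ratio: float = 3.0, min_diff: float = 10.0) -> List[int]:
    """Return indices i at which diffs[i] looks like a hard cut.

    diffs[i] is the difference between frame i and frame i + 1.
    The test condition is an assumed one, not the thesis's own.
    """
    if radius < 1:
        raise ValueError("radius must be at least 1")
    values = list(diffs)
    cuts = []
    for i in range(radius, len(values) - radius):
        window = values[i - radius:i + radius + 1]      # 2R + 1 values
        centre = window[radius]                         # position under test
        others = window[:radius] + window[radius + 1:]  # rest of the window
        if centre >= min_diff and centre >= ratio * max(others):
            cuts.append(i)
    return cuts


if __name__ == "__main__":
    diffs = [1, 2, 1, 40, 2, 1, 1, 35, 30, 1, 1]
    print(detect_cuts(diffs))
```

Under this assumed rule an isolated peak is reported, whereas two large differences close together are not, because neither dominates its window. Telling a flashlight apart from other such events would need the additional brightness test described in the text.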
Viewers usually want to retrieve only one or a few units from a news video, so splitting the video into news units is valuable.
Statistical analysis reveals several features of news units: each begins with an announcer shot; there is a silent segment between consecutive news units; and each unit shows its main content as a caption in the bottom quarter of the screen. These features suggest three methods for splitting news units: audio detection, caption detection and announcer detection.
We use a template-based announcer detection algorithm. The announcer region, the caption region and the logo region may change, but the studio background stays stable for a long time, and the camera position and illumination conditions are also fixed. We select three fixed blocks in the background as the template for detecting announcers. Experiments show a recall of 96.60% and a precision of 100%.
4. News caption recognition
By source, video semantic extraction can be classified into three types: knowledge-based extraction, manual interactive extraction and external-information extraction.
This paper extracts semantics by recognizing the caption in the announcer shot. Based on the characteristics of Jilin University News, we find that every announcer shot contains a caption that summarizes the main content of the news unit. We therefore recognize the caption in the announcer shot and describe the unit with the semantic information extracted.
First we capture one frame from the announcer shot with the DirectShow method GetCurrentImage. Then we preprocess it: grayscale conversion, interpolation, filtering and thresholding. Finally we send the frame to the OCR library of Microsoft Office Document Imaging for recognition.
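The preprocessing chain named above (grayscale, interpolation, filtering, thresholding) might look roughly like the sketch below. The abstract does not say which variant of each step is used, so the choices here are assumptions: BT.601 luma weights, nearest-neighbour upscaling, a 3x3 median filter and a fixed threshold. All function names and defaults are hypothetical, and the OCR call itself is left out.

```python
from typing import List, Tuple

Gray = List[List[int]]
Rgb = List[List[Tuple[int, int, int]]]


def to_grayscale(rgb: Rgb) -> Gray:
    """Convert RGB pixels to luma using the ITU-R BT.601 weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def upscale_nearest(img: Gray, factor: int) -> Gray:
    """Enlarge by an integer factor with nearest-neighbour interpolation."""
    return [[img[y // factor][x // factor]
             for x in range(len(img[0]) * factor)]
            for y in range(len(img) * factor)]


def median3x3(img: Gray) -> Gray:
    """3x3 median filter; coordinates outside the image are clamped."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbours = sorted(
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            row.append(neighbours[4])  # middle of the 9 sorted values
        out.append(row)
    return out


def threshold(img: Gray, level: int = 128) -> Gray:
    """Binarize: pixels at or above level become 255, the rest 0."""
    return [[255 if p >= level else 0 for p in row] for row in img]


def preprocess_caption(rgb: Rgb, factor: int = 2, level: int = 128) -> Gray:
    """Grayscale, upscale, denoise, then binarize a caption frame."""
    gray = to_grayscale(rgb)
    return threshold(median3x3(upscale_nearest(gray, factor)), level)
```

In practice the threshold level and the upscaling factor would be tuned to the caption font and the OCR engine.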
Experiments show that the recognition rate reaches 90%.
So far we have completed shot cut and flashlight detection, announcer recognition and caption recognition, and achieved preliminary video structuring and semantic processing. However, owing to limited time, the difficulty of the project and the author's level of expertise, the system is still some way from a real news video search engine, and several aspects remain to be improved.
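For reference, the template-based announcer detection summarized in section 3 can be illustrated with a minimal sketch. It assumes that a frame counts as an announcer shot when each of the three fixed background blocks stays within a tolerance of the stored template, measured by mean absolute difference. The matching measure, the tolerance and all names here are assumptions, not details taken from the thesis.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]    # grayscale frame as rows of pixels
Block = Tuple[int, int, int, int]  # (top, left, height, width)


def crop(frame: Frame, block: Block) -> List[int]:
    """Flatten the pixels of one rectangular block."""
    top, left, height, width = block
    return [frame[y][x]
            for y in range(top, top + height)
            for x in range(left, left + width)]


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


class AnnouncerTemplate:
    """Background blocks taken from a reference studio frame."""

    def __init__(self, reference: Frame, blocks: Sequence[Block],
                 tolerance: float = 12.0) -> None:
        self.blocks = list(blocks)
        self.patches = [crop(reference, b) for b in self.blocks]
        self.tolerance = tolerance  # assumed value, to be tuned

    def matches(self, frame: Frame) -> bool:
        """True if every background block is close to the template."""
        return all(mean_abs_diff(crop(frame, b), p) <= self.tolerance
                   for b, p in zip(self.blocks, self.patches))
```

Requiring all three blocks to match, rather than any one of them, favours precision over recall.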
Keywords/Search Tags: CBVR, DirectShow, shot detection, announcer detection, caption recognition