
Research On Some Key Techniques Of Video Retrieval Based On Content

Posted on: 2013-01-16  Degree: Doctor  Type: Dissertation
Country: China  Candidate: S S Lei  Full Text: PDF
GTID: 1118330371490765  Subject: Circuits and Systems
Abstract/Summary:
A video is a continuous time series of image frames, an image stream with no inherent data structure. If a video is viewed as a book without a table of contents or index, then each image frame is equivalent to one page of that book. Lacking such a catalog, people cannot browse or retrieve video efficiently. With the widespread use of video information and the dramatic increase in the number of videos, effective organization and management of video has become an important and challenging research topic. This dissertation studies in depth several key technologies in content-based video retrieval, including shot boundary detection, key-frame extraction, low-level image feature selection, and image semantic recognition. The main innovations are summarized as follows:

In shot boundary detection, existing methods locate shot borders by calculating the difference between two adjacent frames, but adjacent-frame differences are sensitive to flashes and to object and camera motion. This dissertation presents a shot detection method based on a distance separability criterion, which determines shot borders by calculating the difference between two video clips within a sliding window. The method suppresses the effects of flashes and object/camera motion, and can distinguish flashes from cut transitions and object/camera motion from gradual transitions.

Existing key-frame extraction methods lack spatio-temporal analysis of the video, so it is difficult for them to determine the number and locations of key frames for the video as a whole. This dissertation presents two key-frame extraction methods. The first method splits a shot into several sub-shots with similar visual content, then applies spatial and temporal analysis to identify key frames according to the rate of change of the video content. This method reflects the dynamic characteristics of the video.
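The sliding-window shot detection idea described above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes each frame has already been reduced to a single scalar feature (for example, mean luminance), and it uses a Fisher-style ratio (squared distance between clip means over within-clip scatter) as a stand-in for the distance separability criterion. The names `separability`, `detect_cuts`, `half_window`, and `threshold` are hypothetical.

```python
from statistics import fmean


def separability(left, right, eps=1e-6):
    """Fisher-style ratio between two clips of scalar frame features.

    Assumption: squared distance between the clip means divided by the
    pooled within-clip scatter. `eps` avoids division by zero when both
    clips are perfectly static.
    """
    m_left, m_right = fmean(left), fmean(right)
    within = (
        sum((x - m_left) ** 2 for x in left)
        + sum((x - m_right) ** 2 for x in right)
    ) / (len(left) + len(right))
    return (m_left - m_right) ** 2 / (within + eps)


def detect_cuts(features, half_window=4, threshold=10.0):
    """Return indices t where the clip before t and the clip starting at t
    are well separated (a local maximum of the score above `threshold`)."""
    scores = {
        t: separability(features[t - half_window:t], features[t:t + half_window])
        for t in range(half_window, len(features) - half_window + 1)
    }
    return [
        t
        for t, s in scores.items()
        if s >= threshold
        and s >= scores.get(t - 1, 0.0)
        and s > scores.get(t + 1, 0.0)
    ]


# Toy sequence (made-up values): 10 dark frames, a one-frame flash,
# 9 more dark frames, then a new, brighter shot starting at index 20.
frames = [10.0] * 10 + [200.0] + [10.0] * 9 + [120.0] * 10
cuts = detect_cuts(frames)
print(cuts)
```

On this toy sequence the sketch reports one boundary, at index 20. The flash at index 10 produces a larger adjacent-frame difference than the real cut, yet its clip-level score stays far below the threshold, because a single bright frame raises the within-clip scatter more than it shifts the clip mean.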
The second method constructs a space-time slice to extract the spatio-temporal information of the original video, and then uses K-means clustering and a set of rules to extract key frames. Both methods incorporate spatio-temporal analysis, and their extraction results are consistent with human visual perception.

Key frames serve two roles: constructing video summaries and providing an index to video clips. Existing methods are in practice oriented toward video summarization; when their results are used for video indexing, they are too redundant, which lowers retrieval efficiency. This dissertation proposes that different extraction strategies should be applied to different applications, and uses camera movement and lens performance techniques to extract key frames for video indexing. It presents a hierarchical camera motion classification algorithm based on the motion direction histogram; then, on the basis of this qualitative analysis of camera motion, lens performance techniques are applied to extract key frames. Experimental results show that the method captures the main video information and provides a concise index structure for later retrieval.

Existing feature extraction methods cannot connect knowledge systems with classification, cannot discover and reason about relationships among individual data items, and cannot effectively handle inconsistent or incomplete information to uncover implied knowledge and reveal underlying regularities. In this dissertation, knowledge reduction based on rough set theory is applied to image semantic feature extraction.
On the premise that the knowledge in the table is not affected, an attribute decision table is constructed and its attributes are reduced, yielding an effective low-level feature set that lays the foundation for image semantic recognition.

This dissertation also investigates the classification performance of the support vector machine (SVM) in the semantic recognition of landscape images. Because SVM classification performance is determined by the kernel function and its parameters, the dissertation analyzes how different kernel functions and parameter optimization algorithms affect semantic recognition performance on landscape images. Finally, on the basis of the effective low-level feature set, an optimized SVM is used to obtain higher recognition accuracy.
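The attribute decision table and attribute reduction described above can be sketched in a few lines. This is a generic rough-set illustration, not the dissertation's procedure: it assumes the low-level features have already been discretized, it measures "knowledge preserved" by the size of the positive region, and it uses a simple greedy backward elimination that finds one reduct (a table may have several). The table contents and the names `positive_region_size` and `reduce_attributes` are hypothetical.

```python
from collections import defaultdict


def positive_region_size(rows, attrs, decision):
    """Count rows whose equivalence class under `attrs` maps to exactly
    one decision value (the size of the rough-set positive region)."""
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    return sum(len(d) for d in classes.values() if len(set(d)) == 1)


def reduce_attributes(rows, attrs, decision):
    """Greedy backward elimination: drop an attribute whenever the
    positive region stays as large as with the full attribute set."""
    target = positive_region_size(rows, attrs, decision)
    kept = list(attrs)
    for a in attrs:
        trial = [x for x in kept if x != a]
        if positive_region_size(rows, trial, decision) == target:
            kept = trial
    return kept


# Hypothetical decision table: discretized low-level features of
# landscape image regions (conditions) and a semantic label (decision).
table = [
    {"color": "blue",  "texture": "smooth", "edges": "low",  "label": "sky"},
    {"color": "blue",  "texture": "rough",  "edges": "low",  "label": "water"},
    {"color": "green", "texture": "rough",  "edges": "high", "label": "forest"},
    {"color": "green", "texture": "smooth", "edges": "low",  "label": "grass"},
    {"color": "blue",  "texture": "smooth", "edges": "high", "label": "sky"},
    {"color": "green", "texture": "rough",  "edges": "low",  "label": "forest"},
]
reduct = reduce_attributes(table, ["color", "texture", "edges"], "label")
print(reduct)
```

In this made-up table, `edges` can be dropped because `color` and `texture` alone still determine every label, while removing either of those two would make some rows indistinguishable despite different labels.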
Keywords/Search Tags: video retrieval, shot boundary detection, key-frame extraction, feature selection, semantic recognition