
Research Of Model-Based Video Shot Segmentation And Key-frame Extraction Algorithm

Posted on: 2008-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: X M Li
GTID: 2178360212996923
Subject: Communication and Information System
Abstract/Summary:
With the development of computer science and network technology, people's demand for information has grown rapidly, and multimedia information has become one of the most important data sources. Among all kinds of multimedia data, video accounts for a growing share, and digital video is now used in every aspect of daily life. Large amounts of video are being produced, so how to manage and index video databases efficiently and quickly has become one of the most important questions people face. Traditional indexing methods leave people facing challenges in information retrieval, information representation and related tasks. At present, video can be indexed and browsed only through text. The variety of information that video carries makes retrieval in the usual way nearly impossible, so Content-Based Video Retrieval (CBVR) technology is becoming more and more important.

CBVR describes video content and extracts video features automatically, without human participation. It is an integrated technology that draws on recognition technology, artificial intelligence, database management, man-machine interfaces, information retrieval and other fields, so that the indexing algorithm and system structure can be designed to be reliable and efficient and the man-machine interface to be friendly.

At present, apart from the recognition and description of image color, texture, shape and spatial relations, research on CBVR focuses mainly on video shot segmentation, feature extraction and description (including visual, color, texture and shape features, motion information, object information and so on), key-frame extraction and video structural analysis. The central goal of video structural analysis is to extract the main content of the video and then build a structural description from the bottom up.
To this end, we must segment the video, extract its features, organize its content and so on.

In this thesis, we study shot segmentation, a key technology of video retrieval, using video editing theory, and then propose a novel key-frame extraction algorithm that combines the difference between frames within a shot with the squared difference between each frame and the mean luminance of the shot.

We first systematically summarize the main existing shot segmentation algorithms, then analyze some common editing techniques (cuts, fades and dissolves) and extract different features according to the corresponding edit model. After analyzing and verifying all potential shot boundaries, we describe the content of each shot by its key-frames.

A cut involves no editing at the shot boundary, so it has no edit model, and we must use information from the shots themselves to detect it. We improve the histogram-difference algorithm, a classical algorithm for cut detection. First, we propose a new adaptive threshold method based on a local window, in place of a fixed threshold or an adaptive threshold computed from the mean or the variance of the histogram. Our algorithm determines the threshold automatically by comparing the maximum histogram difference with the mean histogram difference within the sliding window.
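
The local-window comparison described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the window half-width `half_window` and the factor `ratio` are assumptions chosen for illustration, and the abstract does not state the exact decision rule. Here a frame pair is flagged as a cut when its histogram difference is the maximum in its window and exceeds `ratio` times the mean of the other differences in that window.

```python
from typing import List, Sequence


def histogram_difference(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Sum of bin-wise absolute differences between two frame histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(diffs: Sequence[float],
                half_window: int = 2,     # assumed window half-width
                ratio: float = 3.0) -> List[int]:  # assumed max-to-mean factor
    """Return indices i where diffs[i] (frame i vs. frame i+1) looks like a cut.

    diffs[i] must be the maximum of its local window and must exceed
    `ratio` times the mean of the remaining differences in that window.
    """
    cuts = []
    for i, d in enumerate(diffs):
        lo = max(0, i - half_window)
        hi = min(len(diffs), i + half_window + 1)
        window = diffs[lo:hi]
        others = [v for j, v in enumerate(window, start=lo) if j != i]
        if not others:
            continue  # a single difference gives no local context
        mean_others = sum(others) / len(others)
        if d == max(window) and d > ratio * mean_others:
            cuts.append(i)
    return cuts
```

Because the threshold is derived from the neighbouring differences, the same settings can be used for a quiet scene and a busy one, which is the motivation for a local window over a single global threshold.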
For cut detection, we also add a constant to the histogram difference, which makes the threshold easier to choose, and we account for the effect of this constant.

To detect fades, we take the first-order difference of the mean luminance as a suitable feature, because during an ideal fade the mean luminance of the frames changes linearly, so its first-order difference is constant; meanwhile, there are large negative peaks at the beginning of a fade-out and at the end of a fade-in. First, we locate all potential shot boundaries by searching for monochrome frames in the video. Then we look for negative peaks of the second-order difference of the luminance variance around each monochrome frame. Because negative peaks can also be caused by object motion, we smooth the first-order difference curve with a median filter before checking that the first-order difference is constant. Finally, we constrain the variance of the first frame of a fade-out and the last frame of a fade-in to exclude errors caused by dark scenes.

To detect dissolves, we again consider the first-order difference of the mean luminance. As with fades, the first-order difference of the mean luminance changes linearly and changes sign during a dissolve. So we first constrain the duration of a dissolve to find all candidate dissolves, and then search for the zero-crossing point to detect them. We also smooth the first-order difference of the mean luminance with a median filter, and then compensate for the incorrect start and end frames introduced by the smoothing.
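
The fade and dissolve detectors both operate on the first-order difference of the per-frame mean luminance, smoothed with a median filter. The sketch below shows only these building blocks under stated assumptions: the helper names and the filter half-width are hypothetical, and the duration, variance and compensation steps from the text are not modelled.

```python
from statistics import median
from typing import List, Sequence


def first_difference(values: Sequence[float]) -> List[float]:
    """First-order difference: values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]


def median_smooth(values: Sequence[float], half_window: int = 1) -> List[float]:
    """Median filter; the window is truncated at both ends of the sequence."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


def zero_crossings(values: Sequence[float]) -> List[int]:
    """Indices i where the sign changes between values[i] and values[i + 1]."""
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]


# Assumed toy data: mean luminance falls linearly, then rises linearly.
mean_luma = [100, 90, 80, 70, 80, 90, 100]
smoothed_diff = median_smooth(first_difference(mean_luma))
crossings = zero_crossings(smoothed_diff)
```

In this toy sequence the difference is constant on each side of the turning point, and the sign change is located by `zero_crossings`. A single-frame spike, such as one caused by object motion, is removed by `median_smooth` before the sign test is applied.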
Finally, for dissolves, we constrain the luminance variance to reduce the effect of noise and object motion.

In the last step of shot segmentation, we filter out camera flashes, objects moving in front of the camera and other instantaneous noise.

For key-frame extraction, we first compare the frames in a shot pixel by pixel to select candidate key-frames, and then combine this with the squared difference between each frame and the mean luminance of the shot to refine the selection, so that the algorithm determines automatically how many key-frames to use and which frames are most suitable. The algorithm describes each shot accurately and comprehensively.

The simulation results indicate that our shot boundary detection method is simple, achieves high recall and precision, and is robust across various kinds of video. It can also be combined with the simple histogram algorithm to detect both cuts and gradual shot transitions, which makes it practical. Finally, the proposed key-frame algorithm summarizes and describes the video content comprehensively.
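
The two measures used by the key-frame algorithm can be sketched as follows. This is an illustration under assumptions, not the thesis's method: frames are taken to be small grids of luminance values, the "difference between each frame and the mean luminance of the shot" is interpreted as a difference of mean luminances, and all function names are hypothetical. The abstract does not say how the two measures are combined, so no selection rule is shown.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # rows of luminance values (assumed layout)


def mean_luminance(frame: Frame) -> float:
    """Average luminance over all pixels of one frame."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def pixelwise_difference(a: Frame, b: Frame) -> float:
    """Mean absolute luminance difference between two same-sized frames."""
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def squared_deviation_from_shot_mean(frames: Sequence[Frame]) -> List[float]:
    """Squared difference between each frame's mean luminance and the shot mean."""
    means = [mean_luminance(f) for f in frames]
    shot_mean = sum(means) / len(means)
    return [(m - shot_mean) ** 2 for m in means]


# Assumed toy shot of three uniform 2x2 frames.
shot = [
    [[10, 10], [10, 10]],
    [[20, 20], [20, 20]],
    [[30, 30], [30, 30]],
]
deviations = squared_deviation_from_shot_mean(shot)
```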
Keywords/Search Tags: Video Retrieval, Video Edit Model, Shot Segmentation, Key-frame Extraction