Scene Segmentation For Movie Video

Posted on: 2006-08-16, Degree: Master, Type: Thesis
Country: China, Candidate: H T Ding, Full Text: PDF
GTID: 2168360155453084, Subject: Communication and Information System
Abstract/Summary:
With the rapid development of computer network and communications technology, multimedia, and video in particular, has become an increasingly indispensable part of everyday work and life. The difficulty of processing video information gave rise to new techniques for content-based video analysis and retrieval. Video segmentation is the first step towards content-based video analysis. The shot is the basic unit of video, and related shots are grouped into a higher-level unit termed a scene. Taking movie video as its subject and building on recent advances in this field, this paper explores methods of scene segmentation systematically.

First, we introduce some fundamentals of digital video, mainly color spaces and the transforms between them, as well as MPEG concepts. Common color spaces in digital video are RGB, YCbCr and HSV. Since the HSV color space approximates the way human eyes sense and interpret color, it is widely used in content-based video applications. MPEG is a popular international video compression standard, and we introduce the related background in this paper.

Shot boundary detection is the first step towards content-based video analysis. Approaches to shot segmentation can be classified as those that operate in the MPEG compressed domain and those that operate in the uncompressed domain, or as those that detect cuts and those that detect gradual transitions. Many algorithms operate in the uncompressed domain, and the best known (and simplest) among them are based on comparing histogram differences and pixel differences. For methods in the compressed domain, the video need not be decoded fully; instead, shot boundaries are determined from compressed-domain features such as DCT coefficients, macroblock type information, and motion vectors, which usually requires only partial decoding.
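
The histogram-difference comparison mentioned above for the uncompressed domain can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function names, the 16-bin gray-level histogram, the flat list-of-pixels frame format, and the fixed threshold of 0.5 are all assumptions made for illustration.

```python
from typing import List, Sequence


def gray_histogram(frame: Sequence[int], bins: int = 16) -> List[float]:
    """Normalized histogram of 8-bit gray levels.

    `frame` is assumed to be a non-empty flat sequence of pixel values in 0..255.
    """
    counts = [0] * bins
    for value in frame:
        counts[value * bins // 256] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum of absolute bin differences: 0 for identical, 2 for disjoint histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames: Sequence[Sequence[int]],
                threshold: float = 0.5, bins: int = 16) -> List[int]:
    """Return every index i where a cut is declared between frame i-1 and frame i."""
    hists = [gray_histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]


# Toy input: three dark frames followed by two bright frames.
dark = [20] * 64
bright = [220] * 64
frames = [dark, dark, dark, bright, bright]
print(detect_cuts(frames))  # [3]
```

A single fixed threshold like this only catches abrupt cuts; gradual transitions produce small frame-to-frame differences and would need a different rule.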
One of the difficulties in shot boundary detection lies in detecting gradual shot transitions, that is, in distinguishing the (very small) changes brought about by gradual transitions from those arising from camera and object motion within a shot. We explore and implement a method for detecting shot boundaries that uses macroblock type information in the MPEG compressed domain. The idea is that at a shot boundary the content changes, so motion compensation across the boundary is not effective. Since the B frames immediately before the shot boundary are not similar to the reference frame after the boundary, most of their macroblocks are forward compensated; since the B frames immediately after the shot boundary are not similar to the reference frame before the boundary, most of their macroblocks are backward compensated. Likewise, most macroblocks in the first P frame after the shot boundary are intra coded, because its reference frame lies before the boundary. By counting the predominant macroblock types in every frame, the shot boundary can be determined accurately. As the experiments demonstrate, this method is fast and computationally efficient.

Key frames (or representative frames) can greatly reduce the computational load of shot similarity comparison. Key frames can be selected statically, that is, from one or several fixed positions within a shot, or dynamically, that is, by choosing the frames that best capture the changing content of a shot. In this paper we select the first and last frames of each shot as key frames for the sake of simplicity.

For scene segmentation, we introduce the definition of a scene and its two features (visual similarity and temporal locality), then discuss several scene segmentation algorithms, and finally present our proposed algorithm. A scene is a series of shots that are temporally local and semantically related.
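
Returning briefly to the macroblock-type idea described above, the counting step might look roughly like the following sketch. The `FrameStats` record, the 0.7 dominance ratio, and the three decision rules are hypothetical simplifications, not the rules used in the thesis. The sketch also assumes that the per-frame macroblock counts have already been extracted from the bitstream and are given in display order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameStats:
    """Hypothetical per-frame macroblock counts, listed in display order."""
    frame_type: str          # 'I', 'P' or 'B'
    intra: int = 0
    forward: int = 0
    backward: int = 0
    bidirectional: int = 0


def dominant_type(stats: FrameStats, ratio: float = 0.7) -> str:
    """Name of the macroblock type covering at least `ratio` of the frame, else 'mixed'."""
    counts = {"intra": stats.intra, "forward": stats.forward,
              "backward": stats.backward, "bidirectional": stats.bidirectional}
    total = sum(counts.values())
    if total == 0:
        return "mixed"
    name, count = max(counts.items(), key=lambda kv: kv[1])
    return name if count / total >= ratio else "mixed"


def detect_boundaries(frames: List[FrameStats], ratio: float = 0.7) -> List[int]:
    """Return indices i such that a boundary is declared just before frame i."""
    labels = [dominant_type(f, ratio) for f in frames]
    cuts = []
    for i in range(1, len(frames)):
        cur, prev = frames[i], frames[i - 1]
        prev_backward_b = prev.frame_type == "B" and labels[i - 1] == "backward"
        prev_forward_b = prev.frame_type == "B" and labels[i - 1] == "forward"
        if cur.frame_type == "B" and labels[i] == "backward" and not prev_backward_b:
            cuts.append(i)   # first B frame that can only use the later reference
        elif cur.frame_type in ("I", "P") and prev_forward_b:
            cuts.append(i)   # preceding B frames could only use the earlier reference
        elif cur.frame_type == "P" and labels[i] == "intra" and prev.frame_type != "B":
            cuts.append(i)   # no B frames in between: P frame cannot be predicted
    return cuts


# Toy sequence I B B P B B P with the boundary between frames 4 and 5.
sequence = [
    FrameStats("I", intra=100),
    FrameStats("B", forward=10, bidirectional=90),
    FrameStats("B", forward=15, bidirectional=85),
    FrameStats("P", intra=5, forward=95),
    FrameStats("B", forward=92, bidirectional=8),
    FrameStats("B", backward=90, bidirectional=10),
    FrameStats("P", intra=88, forward=12),
]
print(detect_boundaries(sequence))  # [5]
```

Because only counts are compared, no pixel data has to be reconstructed, which is consistent with the speed claim above. A practical detector would likely need extra checks to avoid false alarms from strong motion or occlusion.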
Two shots belonging to the same scene are close in time and likely to be visually similar; therefore, temporal locality and visual similarity are two common criteria for determining inter-shot similarity. In most scene segmentation algorithms, shot similarity is computed first, and the related shots are then grouped into scenes using clustering methods. These methods include time-constrained clustering, time-adaptive grouping, overlapping shot links, and shot neighborhood coherence. Note that existing methods using overlapping shot links use only local color features to compute shot similarity. Motion information is also an important feature of a shot, in addition to color information. Considering this, we propose a new scene segmentation algorithm that uses both motion and visual features of shots. In this method, we represent shot visual similarity as the intersection of the shots' global HSV histograms (computed from their key frames), and shot similarity as the weighted sum of shot visual similarity and shot motion similarity. The motion content of a shot is computed as the accumulated sum of the absolute differences between histograms of consecutive frames within the shot. We then use the overlapping links method to merge related shots into scenes. Since perfect segmentation is impossible, and over-segmentation is generally preferable to under-segmentation (because over-segmented scenes can be merged, whereas under-segmented scenes are difficult, if at all possible, to recover), we set the clustering parameters so that an over-segmentation is produced. Finally, we merge the over-segmented scenes. The experimental results show the correctness of our approach. Given the complexity of scene segmentation, considering visual features alone is not enough, and future work will combine audio and visual features to improve the segmentation results. Since most video is stored or transmitted in the compressed form, it makes...
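
The shot similarity and overlapping-links grouping described in this abstract can be sketched as follows. The abstract does not give the exact formulas, so several parts are assumptions: taking the best match over key-frame pairs, the motion similarity formula, the weight of 0.7, the window of 3 shots, and the threshold of 0.6. The `Shot` record and all function names are hypothetical, and the final merging of over-segmented scenes is not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence

Histogram = Sequence[float]  # assumed normalized so that the bins sum to 1


@dataclass
class Shot:
    key_histograms: List[Histogram]  # e.g. histograms of the first and last frame
    motion: float                    # accumulated histogram change within the shot


def intersection(h1: Histogram, h2: Histogram) -> float:
    """Histogram intersection: 1 for identical normalized histograms, 0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def motion_content(frame_histograms: Sequence[Histogram]) -> float:
    """Accumulated sum of absolute differences between consecutive frame histograms."""
    return sum(
        sum(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(frame_histograms, frame_histograms[1:])
    )


def shot_similarity(s1: Shot, s2: Shot, weight: float = 0.7) -> float:
    """Weighted sum of visual and motion similarity (both formulas are assumptions)."""
    visual = max(intersection(a, b)
                 for a in s1.key_histograms for b in s2.key_histograms)
    largest = max(s1.motion, s2.motion)
    motion = 1.0 if largest == 0 else 1.0 - abs(s1.motion - s2.motion) / largest
    return weight * visual + (1.0 - weight) * motion


def group_scenes(shots: List[Shot], window: int = 3,
                 threshold: float = 0.6) -> List[List[int]]:
    """Overlapping links: start a new scene where no link spans the gap.

    Assumes `shots` is non-empty.
    """
    n = len(shots)
    scenes, start, end = [], 0, 0    # `end` = farthest shot reached by any link so far
    for i in range(n):
        if i > end:                  # no earlier shot links to shot i or beyond
            scenes.append(list(range(start, i)))
            start = end = i
        # Link shot i to the farthest sufficiently similar shot inside the window.
        for j in range(min(n - 1, i + window), i, -1):
            if shot_similarity(shots[i], shots[j]) >= threshold:
                end = max(end, j)
                break
    scenes.append(list(range(start, n)))
    return scenes


# Toy input: an A/C/A/C dialogue pattern followed by two shots of a new setting D.
A, C, D = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
shots = [Shot([h], motion=0.2) for h in (A, C, A, C, D, D)]
print(group_scenes(shots))  # [[0, 1, 2, 3], [4, 5]]
```

In the toy input the A and C shots are visually dissimilar, yet the links 0 to 2 and 1 to 3 overlap, so all four shots fall into one scene. This is the property that lets overlapping links group alternating shots such as dialogues.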
Keywords/Search Tags: scene segmentation, shot boundary detection, DC image, MPEG, compressed domain, shot similarity