Font Size: a A A

Deep Learning Based Video Shot Detection And Object Segmentation

Posted on:2019-01-31Degree:MasterType:Thesis
Country:ChinaCandidate:J W XuFull Text:PDF
GTID:2428330590467426Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
In the age of Big Data,more and more videos are waited to be processed,analyzed and mined.However,plenty of videos have no proper annotations or no annotations at all,which brings an enormous challenge for people to search and process video segments they are interested in.Thus it is essential to develop a series of high efficient video structural analysis techniques.What's more,video shot detection and video object segmentation are fundamental and significant steps for video structural analysis.In terms of video shot detection,we extend some alternative solutions as follows: we first propose a high efficient video shot detection framework and point out several essential components of the framework(preprocessing,feature extraction and shot detection algorithms).We also give some recommendations on designing corresponding components.Then we propose a deep learning based video shot detection algorithm according to the framework.In proposed algorithm,we use a bisection comparison based method which is recommended in the framework for preprocessing.Through preprocessing we could filter large-scale frames which belong to no shots and obtain candidate video segments containing possible shot boundaries.We leverage AlexNet to extract deep features and select the most representative feature called fc-6 for subsequent operations.In detection of cut transitions,we consider both general similarity in the segment and similarity of continuous frames.We further define a standard to detect whether the similarity changes abruptly,which help to detect cut transitions accurately.In detection of gradual transitions,we obtain a general pattern of gradual transitions called ”inverted-triangle pattern” through experiments and analyses.We further propose several pattern matching standards for it,which promise the high efficiency and stability in detection of gradual transitions.Experiments illustrate that proposed deep learning based shot detection algorithm outperforms state-of-the-art methods significantly.In terms of video object segmentation,we extend an alternative solution as follows: we propose a two-stream deep encoder-decoder architecture for fully automatic video object segmentation.From our perspectives,both intra-frame segmentation and inter-frame segmentation are dispensable for video object segmentation,thus we proposed to leverage two stream networks to segment single frame and corresponding inter-frame moving information respectively.Both two streams have the same encoder-decoder architecture,the only difference is the input: the former is the frame in video,while the latter is the corresponding optical flow RGB color map.The encoder part mainly aims to efficiently process an image to obtain a coarse segmentation result——low resolution,explicit object location,ambiguous edges and details.The decoder part is to gradually refine the coarse segmentation using structural features from the encoder part,where the edges and details could be refined progressively.Through the deep encoder-decoder part a dense segmentation is obtained whose size is the same as that of input image.Finally segmentations from two streams can be integrated and promoted to a much better result.Experiments show that proposed algorithm have competitive performance against state-of-the-art algorithms,leading to high efficient video object segmentation.
Keywords/Search Tags:Shot detection, object segmentation, convolutional neural networks, intra-frame segmentation, inter-frame segmentation
PDF Full Text Request
Related items