Stereoscopic Video Feature Extraction And Classification

Posted on: 2020-09-22 | Degree: Master | Type: Thesis
Country: China | Candidate: Y N Zhong
GTID: 2518306452972439 | Subject: Software engineering
Abstract/Summary:
With the rapid development of multimedia, stereoscopic 3D video analysis has become a hot topic in computer vision. 3D films, which imitate the human binocular vision system, have brought creative changes to the film industry. However, little research has focused on stereoscopic video feature extraction and classification. This thesis studies stereoscopic video and aims to answer whether a stereoscopic 3D film is suitable for children to watch. We first extract features from animated and non-animated stereoscopic 3D videos, and then propose a machine learning-based algorithm for classifying stereoscopic 3D video type. The main contributions are as follows:

(1) Because stereoscopic 3D films contain a huge number of frames, no large stereoscopic 3D film database exists. To address this, we select a set of animated and non-animated stereoscopic 3D films and build a high-definition, subtitle-free, left-right format stereoscopic 3D film dataset named 3Dfilms. Each film is first split into video shots using a scene detection algorithm. The frames in each shot are then preprocessed into left and right views. All experiments in this thesis are based on the 3Dfilms dataset.

(2) Image features have important applications in image quality evaluation, image retrieval, and image classification, so this thesis studies multi-class feature extraction for stereoscopic video. First, we extract image and video features from each stereoscopic 3D video, including image aesthetic features, distortion features, and video features. Second, because stereoscopic 3D video imitates the human visual system and uses disparity to create the sense of depth, depth information is quantified as disparity features. In addition, according to the human visual attention mechanism, moving objects in the foreground region are more likely to attract attention, so the foreground region has a stronger impact on the viewer than the background region. Unlike previous research, which considers only global features extracted from the whole frame, this thesis proposes extracting local disparity features based on foreground and background segmentation. Specifically, we segment the moving foreground objects and obtain a motion saliency map for every frame. We then binarize the motion saliency map using the Otsu thresholding method, and use the binarized map to extract local disparity features for the foreground and the background.

(3) We propose a machine learning-based method for classifying stereoscopic videos as animated or non-animated. The random forest classification algorithm is used to train the stereoscopic 3D video type classification model on the image features, video features, and disparity features. The experimental results show that, whether image features, video features, or disparity features are used, classification accuracy increases as the number of consecutive video shots increases. When image, video, and disparity features are used together, classification accuracy is higher than when only one type of feature is used. With all three types of features combined, classification accuracy reaches 92% using features extracted from a single shot and increases to 98% using features extracted from two consecutive shots.

In summary, the multi-class feature extraction in this thesis is effective, and the classification models trained with machine learning achieve high accuracy. Since non-animated stereoscopic 3D films are usually not suitable for children, the classification result can serve as a guide to whether a stereoscopic 3D film is suitable for children to watch.
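The local disparity step in (2) can be illustrated with a minimal sketch. It assumes that a per-frame motion saliency map and a disparity map are already available as NumPy arrays of the same shape. The function names `otsu_threshold` and `local_disparity_features`, and the choice of mean and standard deviation as the disparity statistics, are illustrative assumptions; the abstract does not specify which statistics the thesis computes.

```python
import numpy as np


def otsu_threshold(saliency: np.ndarray, bins: int = 256) -> float:
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist, edges = np.histogram(saliency.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0

    # Pixel counts of the low class (bins 0..k) and high class (bins k+1..).
    weight_low = np.cumsum(hist).astype(float)
    weight_high = weight_low[-1] - weight_low

    # Class means, computed only where both classes are non-empty.
    cum_sum = np.cumsum(hist * centers)
    valid = (weight_low > 0) & (weight_high > 0)
    mean_low = np.zeros_like(centers)
    mean_high = np.zeros_like(centers)
    mean_low[valid] = cum_sum[valid] / weight_low[valid]
    mean_high[valid] = (cum_sum[-1] - cum_sum[valid]) / weight_high[valid]

    between = weight_low * weight_high * (mean_low - mean_high) ** 2
    k = int(np.argmax(between))
    # Upper edge of bin k: values at or above it fall in the high class.
    return float(edges[k + 1])


def local_disparity_features(disparity: np.ndarray, saliency: np.ndarray) -> dict:
    """Mean and std of disparity inside and outside the binarized saliency mask."""
    mask = saliency >= otsu_threshold(saliency)  # True = foreground (assumed)

    def stats(values: np.ndarray) -> tuple:
        if values.size == 0:
            return 0.0, 0.0
        return float(values.mean()), float(values.std())

    fg_mean, fg_std = stats(disparity[mask])
    bg_mean, bg_std = stats(disparity[~mask])
    return {"fg_mean": fg_mean, "fg_std": fg_std,
            "bg_mean": bg_mean, "bg_std": bg_std}


# Synthetic frame: a salient 4x4 block with larger disparity than its surroundings.
saliency = np.zeros((8, 8))
saliency[2:6, 2:6] = 1.0
disparity = np.full((8, 8), 2.0)
disparity[2:6, 2:6] = 10.0
features = local_disparity_features(disparity, saliency)
```

In a pipeline like the one described, such per-frame values would presumably be aggregated over a shot and concatenated with the image and video features before training the random forest; that aggregation is not shown here.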
Keywords/Search Tags: Image and video features, Depth information, Stereoscopic 3D films, Foreground segmentation, Machine learning