
Video Semantic Analysis Based On Deep Network With Multi-Graph Regularized Auto-Encoder

Posted on: 2020-04-19 | Degree: Master | Type: Thesis
Country: China | Candidate: J Y Fang | Full Text: PDF
GTID: 2428330596496913 | Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the rapid development of Internet and multimedia technology has made it more convenient for users to access video data and transmit information over the network. It has also caused rapid growth in the volume of multimedia data, such as videos and images, on the network. Video is one of the most important and information-rich data sources on the Internet today. However, video data is large in volume and complex in structure. The growth of video data and the demand for intelligent video processing have driven research on analyzing video semantics and identifying the semantic labels of videos, so that data can be managed and retrieved more efficiently. In the field of video management and retrieval, how to extract effective video features and use them for video semantic analysis and video concept detection is an active research topic.

Based on a survey of the domestic and international literature, this thesis first introduces the research background, significance and current state of video semantic concept analysis. Several deep learning models, such as the auto-encoder and the convolutional neural network, are then briefly introduced, followed by a brief description of how deep learning is applied to video semantic concept analysis. Combining the advantages of multi-graph regularization with the requirements of image and video feature extraction, this thesis proposes a multi-graph regularized auto-encoder network model and a video semantic concept analysis model based on a 3D convolutional neural network (3DCNN) and a multi-graph regularized auto-encoder (MGAE). Finally, a prototype system for video semantic concept detection based on the proposed network model is designed and implemented. The main research contents of this thesis are as follows:

(1) A multi-graph regularized auto-encoder network is proposed. Because multimedia data such as images, video and audio are diverse, traditional methods are not well suited to multi-view modeling of such data sets. An important question is how to combine graph construction methods from manifold learning with the auto-encoder to model multi-view multimedia data representations. Therefore, based on several graph constructions and the Laplacian auto-encoder, the multi-graph regularized auto-encoder network is proposed. The method embeds multi-graph regularization constraints into the auto-encoder, so that the learned features take into account the neighborhood, association and class relationships among samples and generalize better. This network is used to optimize and learn image features. The experimental results show that the proposed method improves the multi-view features and thus the accuracy of image classification.

(2) A video semantic concept analysis model based on a 3D convolutional neural network and a multi-graph regularized auto-encoder is proposed. The model first constructs a 3D convolutional neural network to extract video features. It then constructs a multi-graph regularized auto-encoder to further optimize the extracted features. In this way the model not only captures the temporal and spatial information of the video, but also learns video features that reflect sample relevance and multiple views. The resulting features are more reasonable and discriminative, which improves the effectiveness and accuracy of video semantic concept analysis. The experimental results on typical video datasets show that the proposed network model optimizes the video features more reasonably and effectively improves the accuracy of video semantic concept analysis.

(3) Using object-oriented programming, Python and related libraries, a prototype system for video semantic concept analysis based on the 3DCNN and the multi-graph regularized auto-encoder is designed and implemented. The system comprises three subsystems: video data preprocessing, model training and semantic detection. The interface is simple and easy to operate. The prototype verifies the usability of the video semantic concept analysis methods proposed in this thesis.
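
The abstract does not give the objective function of the multi-graph regularized auto-encoder from item (1), so the following is only a minimal NumPy sketch of the general idea: a reconstruction error plus one Laplacian smoothness penalty per graph. All names (`knn_graph`, `class_graph`, `mgae_loss`), the choice of two graphs (neighborhood and class), the single tied-weight `tanh` layer, and the weights `0.1` are assumptions made for illustration, not the thesis's formulation.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric binary k-nearest-neighbour adjacency (assumed neighborhood graph)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                             # exclude self-matches
    idx = np.argsort(d2, axis=1)[:, :k]                      # k nearest per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                                # symmetrize

def class_graph(labels):
    """Connect samples that share a label (assumed class graph)."""
    labels = np.asarray(labels)
    W = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

def mgae_loss(X, H, X_rec, graphs, weights):
    """Reconstruction error plus a weighted sum of tr(H^T L_g H) over graphs."""
    n = X.shape[0]
    rec = ((X - X_rec) ** 2).sum() / n
    reg = sum(w * np.trace(H.T @ laplacian(W) @ H)
              for W, w in zip(graphs, weights)) / n
    return rec + reg, rec, reg

# Toy data and an untrained tied-weight encoder, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))
labels = rng.integers(0, 3, size=20)
W_enc = rng.normal(scale=0.1, size=(8, 4))
H = np.tanh(X @ W_enc)      # hidden features
X_rec = H @ W_enc.T         # reconstruction with tied weights

graphs = [knn_graph(X, 3), class_graph(labels)]
total, rec, reg = mgae_loss(X, H, X_rec, graphs, weights=[0.1, 0.1])
print(f"total={total:.4f} reconstruction={rec:.4f} graph_penalty={reg:.4f}")
```

Each penalty term equals half the sum of `W[i, j] * ||H[i] - H[j]||^2` over all pairs, so minimizing it pulls the hidden features of connected samples together. In the model from item (2), the input `X` would presumably be the features produced by the 3DCNN rather than raw data.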
Keywords/Search Tags:Deep Learning, Multi-Graph Regularization, Auto-Encoder, 3DCNN, Video Semantic Concept Analysis