Font Size: a A A

Research On Video Classification Method Based On 3D Spatiotemporal Features And Context Information

Posted on:2021-08-27Degree:MasterType:Thesis
Country:ChinaCandidate:K WangFull Text:PDF
GTID:2518306050471684Subject:Circuits and Systems
Abstract/Summary:PDF Full Text Request
Video classification studies the content contained in videos and identifies several key topics related to videos.The main purpose of action recognition is to determine the category of human behavior in a video,so action recognition can be regarded as a sub-field of video classification.Video classification has broad application prospects in video recommendation,intelligent monitoring,and human-computer interaction.With the development of the Internet and multimedia technology,the scale of video data is increasing,and the content is more complex.Traditional methods have encountered great difficulties.With the continuous development of deep learning technology,deep networks provide new ideas and methods for solving large-scale video classification problems.The application of deep learning technology has made the video classification task enter a new stage,the accuracy of video classification has been continuously improved,and the video classification has gradually developed from action recognition in simple scenarios to a large-scale multi-label general video classification task.This paper mainly studies visual-based video classification and action recognition methods.It mainly processes and classifies the sequence of data frames extracted from video data,without considering other video data information.A video classification method based on 3D spatiotemporal features and context information is proposed,and label smoothing and the mixture of experts model are further introduced for video classification.The proposed method is analyzed and experimentally verified on the action recognition data set,and finally the development of the video classification task and the research direction are prospected.The main contents of this article include:1.A video classification method based on 3D spatiotemporal features and context gating is proposed.The three-dimensional convolution operation can directly extract the time and space information of the video,and the identity mapping introduced by the residual structure can make the network easier to optimize.The network can recalibrate the different visual activations of the input using the context-gating unit based on the attention mechanism to capture the interdependence between depth features.Obtained the representation information that is significantly associated with the output,which improves the model's ability to discriminate.Finally,the method proposed in this paper is tested on the action recognition data sets UCF101 and HMDB51,which proves the effectiveness of the proposed method.2.A video classification method based on 3D spatiotemporal features and label smoothing is proposed.The label smoothing function is used to modify the label of the video data.During the training process,the network learns the loss of wrong labels and enhances the learning ability of the model.Label smoothing has a regularized effect on the loss function of the network,which can alleviate the problem of network overfitting.Experiments prove that the introduction of label smoothing technology improves the classification accuracy of videos.3.A video classification method based on 3D spatiotemporal features and the mixture of experts model is proposed.The hybrid model uses multiple expert models to learn specific patterns of data,and uses a learnable gating network to integrate the prediction results of all expert models.Different expert models can compensate each other for errors caused by other expert models,and finally get a mixed model with stronger generalization performance.Finally,experiments are conducted on the action recognition data set,and the effectiveness of the expert mixed model is further discussed and analyzed.
Keywords/Search Tags:Video Classification, 3D CNN, Gating Mechanism, Label Smoothing, Mixture of Experts Model
PDF Full Text Request
Related items