Research on Action Recognition Based on 3D-Convolution Network Gradient and Activity Knowledge

Posted on: 2022-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wang
Full Text: PDF
GTID: 2518306602993999
Subject: Computer application technology
Abstract/Summary:
With the advent of the information age, the volume of video data has exploded. In fields such as surveillance, prevention and control, and intelligent interaction, action recognition in video plays a vital role. With the continuous development of external conditions such as computing power, the action recognition task has entered a new stage of development. This work uses the deep residual network ResNet3D as a baseline to conduct the following research.

First, the inter-layer features in the ResNet3D network are all global descriptions of the behavior of the entire video in the time dimension, and they lack specific descriptions of the local regions of interest and of the behaviors within a video frame. To address this problem, an action recognition method based on ResNet3D gradient feature enhancement is proposed. The method differentiates a pre-trained ResNet3D network with respect to the input frames to obtain gradients, and uses this gradient information to represent the local regions of interest in the input video where motion occurs. In addition, a convolution-deconvolution feature enhancement module is designed and inserted into the low-level structure of the network. A gradient feature enhancement loss, built from the gradient information, constrains this module so that it enhances the description of the local regions of interest where motion occurs in the video frame. Experiments show that this method improves the action recognition results by 1.5% and 1.7% on the UCF101 and HMDB51 datasets, respectively.

Second, inspired by the action knowledge base constructed in the PaStaNet work and by the Activity2Vec tool, which can extract language PaSta features and visual PaSta features, an action recognition method based on a human body part action knowledge base and a multi-head graph convolution feature fusion network is proposed. The method uses the ResNet3D network to extract global features and constructs a graph structure corresponding to human body parts. A multi-head graph convolution feature fusion network then performs information fusion and classification on the constructed graph. Experiments show that this method improves the results by 2.1% and 1.6% on the UCF101 and HMDB51 datasets, respectively.

Third, the features extracted in the previous chapter each have their own independent characteristics, and fusing the three features into a single feature cannot fully reflect this independence. To address this problem, an action recognition method based on ensemble learning is proposed. The method constructs five weak classifiers with similar structures. The inputs of the first three classifiers are the three features above, and the inputs of the last two classifiers are the concatenated features. The method also proposes a dynamic cross-entropy loss function to integrate and classify the results of the different weak classifiers. Experiments show that this method improves the results by 2.5% and 1.9% on the UCF101 and HMDB51 datasets, respectively.
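
The gradient step of the first method can be illustrated with a toy example. The sketch below is not the thesis implementation: a small hand-written scoring function (`class_score`, hypothetical) stands in for the pre-trained ResNet3D, a 3x3 grid stands in for a video frame, and central finite differences stand in for backpropagation. It only shows the underlying idea that the magnitude of the gradient of a class score with respect to the input marks the input positions that most influence the prediction.

```python
# Toy illustration of input-gradient saliency.
# ASSUMPTION: class_score is a hypothetical stand-in for a trained network;
# it is NOT ResNet3D, and all names here are illustrative only.


def class_score(frame, weights):
    """Hypothetical scalar class score: weighted sum of squared pixel values."""
    return sum(
        w * p * p
        for row_w, row_p in zip(weights, frame)
        for w, p in zip(row_w, row_p)
    )


def input_gradient(score_fn, frame, eps=1e-4):
    """Approximate d(score)/d(pixel) for every pixel by central differences."""
    grad = [[0.0] * len(row) for row in frame]
    for i, row in enumerate(frame):
        for j, value in enumerate(row):
            plus = [r[:] for r in frame]    # copy of the frame, pixel nudged up
            minus = [r[:] for r in frame]   # copy of the frame, pixel nudged down
            plus[i][j] = value + eps
            minus[i][j] = value - eps
            grad[i][j] = (score_fn(plus) - score_fn(minus)) / (2 * eps)
    return grad


def saliency_mask(grad, keep_ratio=0.25):
    """Mark the pixels with the largest gradient magnitude as the region of interest."""
    magnitudes = sorted((abs(g) for row in grad for g in row), reverse=True)
    k = max(1, int(len(magnitudes) * keep_ratio))
    threshold = magnitudes[k - 1]
    return [[1 if abs(g) >= threshold else 0 for g in row] for row in grad]


# A 3x3 "frame" and fixed weights; the weights emphasise the centre pixel.
FRAME = [
    [0.1, 0.2, 0.1],
    [0.2, 0.9, 0.2],
    [0.1, 0.2, 0.1],
]
WEIGHTS = [
    [0.1, 0.1, 0.1],
    [0.1, 2.0, 0.1],
    [0.1, 0.1, 0.1],
]

GRAD = input_gradient(lambda f: class_score(f, WEIGHTS), FRAME)
MASK = saliency_mask(GRAD, keep_ratio=0.2)
```

In this toy setup the centre pixel has by far the largest gradient magnitude, so it is the only position kept in `MASK`. The abstract does not specify how the gradient feature enhancement loss is built from such gradient information, so that part is not sketched here.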
Keywords/Search Tags: Action recognition, deep learning, feature enhancement, ensemble learning