
Research On Behavior Recognition Technology Based On Deep Learning

Posted on: 2020-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: X X Lai
Full Text: PDF
GTID: 2438330602450193
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet and multimedia technology, video-based media has become widely used in people's daily life and work. Deep learning has great advantages in computer vision: its results in tasks such as video description and fine-grained image classification are difficult to achieve with traditional methods. The application of deep learning to action recognition has therefore become a hot research direction for researchers at home and abroad. In essence, action recognition is the classification of video frames. The models commonly used for action recognition today are based on convolutional neural networks (CNNs), which have been applied successfully to image classification, offer the advantages of fewer parameters and translation invariance, and achieve good recognition rates on various action datasets. Based on the 3D convolutional neural network, this paper builds a new behavior recognition model that extracts the content and motion characteristics of a video more effectively and, drawing on the human visual attention mechanism, learns from massive information. An attention mechanism is introduced so that the model pays more attention to the important information in the video and ignores redundant information. Finally, the model is verified and analyzed on action recognition datasets. The main contents of this paper are as follows:

(1) Design of a space-time dual-flow CNN-GRU neural network architecture. In the original dual-flow architecture, using convolutional neural networks to extract the spatial and temporal features of a video leads to insufficient use of the video information and an inability to truly learn the video's temporal features. To address this problem, this paper proposes a deeper architecture that combines a dual-flow network based on 3D convolutional neural networks with a GRU (Gated Recurrent Unit) network. The proposed framework is verified experimentally on the action recognition datasets UCF101 and HMDB51. The results show that the space-time dual-flow CNN-GRU architecture improves the recognition rate to some degree compared with similar methods.

(2) Improvement of the loss function in the space-time dual-flow CNN-GRU neural network architecture. Existing models have difficulty dealing with the large amount of noise and the many outliers found in datasets. To address this problem, this paper studies the relationship between the step factor and the error signal in information theory and, exploiting the ability of correlation entropy to handle non-Gaussian noise and impulse noise, proposes an improved fine-tuning algorithm based on the correlation entropy loss function for the designed architecture, which improves its robustness on noisy datasets. After noise is added to the UCF101 dataset, the experimental results show that, compared with existing leading algorithms, the proposed algorithm differs little in recognition rate on the noise-free dataset but improves it by 0.19% on the noisy dataset. This indicates that the space-time dual-flow CNN-GRU architecture fine-tuned with the improved correlation entropy loss function is more robust to noisy data.

(3) Introduction of the attention mechanism into the space-time dual-flow CNN-GRU neural network. The model takes the spatial stream and time stream feature vectors and weights them adaptively by assigning a computed attention score to each feature vector in a supervised manner, replacing the traditional maximum or average fusion of the two feature vectors. This supervised, feature-adaptive weighted fusion makes the model pay more attention to important features and reduces redundant information during training. Finally, experiments on the action recognition datasets UCF101 and HMDB51 show that the recognition rate of the attention-based space-time dual-flow CNN-GRU architecture reaches the leading level.
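
To illustrate why a correlation entropy loss can be more robust to impulse noise than a squared-error loss, the following is a minimal sketch. It assumes that "correlation entropy" refers to what is usually called correntropy, and it uses the common Gaussian-kernel form 1 - exp(-e^2 / (2 sigma^2)). The function names, the kernel width `sigma`, and the sample error values are illustrative assumptions and are not taken from the thesis; the step-factor adjustment mentioned in the abstract is not modeled here.

```python
import math


def correntropy_loss(errors, sigma=1.0):
    """Mean Gaussian-kernel correntropy-induced loss over a list of errors.

    Each term is 1 - exp(-e^2 / (2 * sigma^2)), so it is bounded in [0, 1).
    `sigma` is an assumed kernel width, not a value from the thesis.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not errors:
        raise ValueError("errors must be non-empty")
    total = sum(1.0 - math.exp(-(e * e) / (2.0 * sigma * sigma)) for e in errors)
    return total / len(errors)


def mse_loss(errors):
    """Mean squared error, shown only for comparison."""
    if not errors:
        raise ValueError("errors must be non-empty")
    return sum(e * e for e in errors) / len(errors)


# Hypothetical prediction errors: four small ones plus one impulse-noise outlier.
clean = [0.1, -0.2, 0.05, 0.15]
noisy = clean + [50.0]

print("MSE clean/noisy:        ", mse_loss(clean), mse_loss(noisy))
print("Correntropy clean/noisy:", correntropy_loss(clean), correntropy_loss(noisy))
```

Because each sample's contribution is bounded, a single outlier can raise the mean correntropy loss by at most 1/n, whereas its contribution to the mean squared error grows with the square of the error.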
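
The attention-weighted fusion of the two streams described in item (3) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the scoring rule (a dot product with a hypothetical learned `query` vector followed by a softmax) and all names are assumptions, since the abstract does not specify how the attention scores are computed or supervised.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_fuse(spatial, temporal, query):
    """Fuse spatial and temporal feature vectors with attention weights.

    `query` stands in for a learned parameter (hypothetical); the scores are
    dot products with it, normalized by softmax so the weights sum to 1.
    """
    if not (len(spatial) == len(temporal) == len(query)):
        raise ValueError("all vectors must have the same length")
    weights = softmax([dot(query, spatial), dot(query, temporal)])
    fused = [weights[0] * s + weights[1] * t for s, t in zip(spatial, temporal)]
    return fused, weights


# Hypothetical two-dimensional stream features.
spatial_feat = [1.0, 0.0]
temporal_feat = [0.0, 1.0]

fused, weights = attention_fuse(spatial_feat, temporal_feat, query=[2.0, 0.0])
print("weights:", weights)
print("fused:  ", fused)
```

With an all-zero query both scores are equal, so the weights become 0.5 each and the result reduces to the average fusion that the adaptive scheme is meant to replace.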
Keywords/Search Tags: Action Recognition, 3D CNN, GRU Network, Correlation Entropy Loss Function, Attention Mechanism