Research On Visual Action Recognition Based On Deep Learning

Posted on: 2022-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Shi
GTID: 2518306494970749
Subject: Information and Communication Engineering
Abstract/Summary:
Action recognition has recently become a focus of computer vision research. It identifies the category of a motion by capturing the motion features of the people in a video. Its application prospects are broad and include intelligent driving, unmanned supermarkets, and intelligent traffic systems. Traditional action recognition methods rely on a large amount of hand-crafted design and prior experience specific to each application scene, so they do not generalize well and are limited in the scenes they can handle. With the advent of deep learning, great progress has been made in image classification, face recognition, machine translation, and other fields. Deep learning algorithms learn target features directly from large data sets, avoid the tedious and experience-dependent process of designing feature extractors by hand, and can be applied to many different scenes at once. As data sets grow and algorithms become more complex, however, deep learning models require high-performance hardware to run. To reduce the dependence of action recognition on computer hardware, this thesis designs an efficient method. It keeps recognition accuracy approximately unchanged while reducing the computational cost of the model by about 1/3 and the number of model parameters by about 2/5. The main contributions are as follows:

1. The action recognition network is improved with depthwise separable convolution, which greatly reduces computational cost and parameter count at the cost of a small loss in accuracy. A standard convolution uses one kernel per output channel and performs filtering and the combination of feature maps across channels in a single step. A depthwise separable convolution splits this into two steps: first, a depthwise convolution performs the filtering; then, a pointwise convolution extends the feature maps across the channels. Although the depthwise separable convolution consists of two convolutions, the kernels it uses are much smaller than those of a standard convolution, which greatly reduces the computation and the number of parameters.

2. The action recognition network is improved with an attention mechanism, which is integrated into the network to raise its performance at little additional computational cost. The CBAM module, designed for 2D tasks, is extended to a 3D structure and embedded into the action recognition network. The 3D CBAM structure reweights the extracted feature maps along both the spatial dimension and the channel dimension, strengthening useful information and suppressing useless information, and thereby improves the feature saliency of the network.
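The saving from the depthwise separable factorization described in point 1 can be illustrated with simple counting. The sketch below is not taken from the thesis: it assumes a 3D convolution with a cubic k x k x k kernel, stride 1, no bias terms, and the same number of output positions for both variants. The function names and the layer sizes in the example are hypothetical.

```python
def conv3d_cost(c_in, c_out, k, out_positions):
    """Parameters and multiply-accumulates of a standard 3D convolution.

    Every output channel has its own k*k*k kernel over all input channels.
    """
    params = k ** 3 * c_in * c_out
    return params, params * out_positions


def separable_conv3d_cost(c_in, c_out, k, out_positions):
    """Parameters and multiply-accumulates of a depthwise separable 3D convolution.

    Depthwise step: one k*k*k filter per input channel (filtering only).
    Pointwise step: a 1*1*1 convolution that mixes channels.
    """
    depthwise = k ** 3 * c_in
    pointwise = c_in * c_out
    params = depthwise + pointwise
    return params, params * out_positions


# Hypothetical layer: 64 -> 128 channels, 3x3x3 kernel, 8x56x56 output volume.
positions = 8 * 56 * 56
std_params, std_macs = conv3d_cost(64, 128, 3, positions)
sep_params, sep_macs = separable_conv3d_cost(64, 128, 3, positions)

print("standard params: ", std_params)   # 221184
print("separable params:", sep_params)   # 9920
print("ratio:", sep_params / std_params)  # equals 1/c_out + 1/k**3
```

Under these assumptions the ratio for a single layer is 1/c_out + 1/k^3. The whole-model savings quoted in the abstract (about 1/3 of the computation and 2/5 of the parameters) are smaller than this single-layer ratio; the abstract does not state which layers were replaced.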
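The reweighting performed by a 3D CBAM block, as described in point 2, can be sketched in plain Python. This is an illustrative sketch, not the thesis implementation. It assumes the channel-then-spatial ordering of the original 2D CBAM, a feature map stored as nested lists indexed x[c][t][h][w], and a shared two-layer MLP without biases. As a loud simplification, the spatial branch mixes the two pooled maps with a 1x1x1 kernel (the hypothetical scalars w_avg, w_max, bias), whereas CBAM uses a larger convolution kernel there. All function and parameter names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def mlp(vec, w1, w2):
    """Shared MLP: ReLU(vec @ w1) @ w2, with w1 of shape C x R and w2 of shape R x C."""
    hidden = [max(0.0, sum(v * w1[i][j] for i, v in enumerate(vec)))
              for j in range(len(w1[0]))]
    return [sum(h * w2[j][k] for j, h in enumerate(hidden))
            for k in range(len(w2[0]))]


def channel_attention(x, w1, w2):
    """One weight per channel, from global average and max pooling over T, H, W."""
    flat = [[v for frame in ch for row in frame for v in row] for ch in x]
    avg = [sum(f) / len(f) for f in flat]
    mx = [max(f) for f in flat]
    return [sigmoid(a + m) for a, m in zip(mlp(avg, w1, w2), mlp(mx, w1, w2))]


def spatial_attention(x, w_avg, w_max, bias):
    """One weight per (t, h, w) position, from average and max over channels.

    Simplification: a 1x1x1 kernel combines the two pooled maps.
    """
    C, T, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    att = []
    for t in range(T):
        frame = []
        for h in range(H):
            row = []
            for w in range(W):
                vals = [x[c][t][h][w] for c in range(C)]
                row.append(sigmoid(w_avg * sum(vals) / C + w_max * max(vals) + bias))
            frame.append(row)
        att.append(frame)
    return att


def cbam3d(x, w1, w2, w_avg, w_max, bias):
    """Apply channel attention, then spatial attention; output has the input's shape."""
    ca = channel_attention(x, w1, w2)
    x = [[[[v * ca[c] for v in row] for row in frame] for frame in ch]
         for c, ch in enumerate(x)]
    sa = spatial_attention(x, w_avg, w_max, bias)
    return [[[[v * sa[t][h][w] for w, v in enumerate(row)]
              for h, row in enumerate(frame)]
             for t, frame in enumerate(ch)]
            for ch in x]
```

Because both attention maps pass through a sigmoid, every weight lies strictly between 0 and 1, so the block rescales features without changing the tensor shape. That is what allows such a block to be inserted between existing layers of a network.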
Keywords/Search Tags:action recognition, deep learning, depthwise separable convolution, attention mechanism, 3D CBAM