| In recent years, computer and network technologies have developed rapidly, and computer vision has been widely applied in many scientific fields. Human action recognition is an important branch of computer vision. Its essence is to correctly classify human motion information in video, and it is of great significance in intelligent surveillance and security, human-computer interaction, motion analysis, and other fields. At present, human action recognition still faces the following problems. First, how to extract and represent motion information in video is one of the main difficulties in this research field. Second, with the emergence of depth cameras such as Kinect, researchers now have access to depth information for human action videos, and how to use this depth information effectively to identify and classify human motion is also an important research issue. Finally, how to achieve high recognition accuracy with a deep learning network model when the amount of sample data is small remains a challenge. Based on the UTD-MHAD dataset, this article studies human action recognition using RGB video and depth video captured simultaneously by Kinect. The research carried out in response to the above issues is as follows: (1) For RGB video, in order to capture both the spatial and temporal information of motion, an improved motion history image method is proposed: redundant frames at the beginning and end of the video are removed, key frames are preserved, and gray-scale motion history images (MHI) are then extracted. Next, rainbow coding is used to pseudo-colorize the MHI to enhance perceived quality. Finally, vertical mirroring and added noise are used to augment the data. Experiments show that the improved motion history image method raises recognition accuracy by 14%, which verifies the effectiveness of the method. (2) For depth video, this article first rotates each three-dimensional pixel point
by a certain angle to simulate different viewing angles and increase the amount of data. Each frame of the depth video is then projected onto three orthogonal planes, and depth motion maps (DMMs) are obtained to characterize the motion information. The DMMs are then rainbow-coded to enhance the data. Experimental data show that this method effectively improves the recognition accuracy of depth motion maps. (3) To make full use of the motion information extracted from RGB video and depth video, we construct a four-layer parallel network that takes the color MHI and the front, side, and top views of the depth motion maps as its respective inputs, and we select an appropriate convolutional neural network model for fine-tuning through experiments. (4) To compare the performance of two information fusion methods, feature fusion and decision fusion, separate experiments were conducted. First, the results of different feature fusion methods are compared. Then, different fusion rules (weighted rule, average rule, multiplication rule, etc.) are used for decision fusion to obtain the final classification results. Finally, two verification methods (cross-target verification and same-target verification) were tested to show the intra-class differences of the samples. (5) Human action recognition software is developed on the MATLAB programming platform in an Ubuntu environment, and the effectiveness of the method is demonstrated by experiments. |
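
Item (1) builds on the gray-scale motion history image. As a rough illustration only, the Python sketch below uses the classic frame-differencing MHI update: a pixel that moves in the current step is set to the maximum value `tau`, and every other pixel decays by one per step. The function name, the `diff_threshold` value, and the default choice of `tau` are assumptions made for this example. The article's key-frame selection, rainbow coding, and augmentation steps are not reproduced here.

```python
import numpy as np


def motion_history_image(frames, tau=None, diff_threshold=30):
    """Build a gray-scale MHI from a (T, H, W) stack of uint8 frames.

    tau and diff_threshold are illustrative parameters, not values
    taken from the article.
    """
    # Use a signed type so that frame differences do not wrap around.
    frames = np.asarray(frames, dtype=np.int16)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames shaped (T, H, W)")

    n_steps = frames.shape[0] - 1
    tau = n_steps if tau is None else tau

    mhi = np.zeros(frames.shape[1:], dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        # A pixel counts as "moving" if its intensity changed enough.
        moving = np.abs(curr - prev) >= diff_threshold
        # Moving pixels are reset to tau; all others decay toward zero.
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))

    # Rescale to 0..255 so that recent motion appears brightest.
    return np.round(255.0 * mhi / tau).astype(np.uint8)
```

Calling `motion_history_image(video)` on a clip stored as a `(T, H, W)` array returns a single `(H, W)` image in which brighter pixels mark more recent motion, so one image carries both where and when the motion occurred.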
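
The decision fusion rules named in item (4) can be sketched as follows. This is a minimal Python example that assumes each of the four parallel inputs (color MHI and the front, side, and top DMM views) yields a vector of class probabilities. The function names, the rule labels, and the normalization of the weights are assumptions for illustration, not the article's implementation.

```python
from math import prod


def fuse_scores(stream_scores, rule="average", weights=None):
    """Fuse per-stream class scores into a single score vector.

    stream_scores: list of equal-length score lists, one per stream.
    rule: "average", "product", or "weighted" (illustrative labels).
    """
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("all streams must score the same set of classes")

    # zip(*stream_scores) yields, per class, the scores from every stream.
    if rule == "average":
        return [sum(col) / len(stream_scores) for col in zip(*stream_scores)]
    if rule == "product":
        return [prod(col) for col in zip(*stream_scores)]
    if rule == "weighted":
        if weights is None or len(weights) != len(stream_scores):
            raise ValueError("weighted rule needs one weight per stream")
        total = sum(weights)
        return [sum(w * p for w, p in zip(weights, col)) / total
                for col in zip(*stream_scores)]
    raise ValueError(f"unknown rule: {rule}")


def predict(stream_scores, rule="average", weights=None):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(stream_scores, rule, weights)
    return max(range(len(fused)), key=fused.__getitem__)
```

With four streams scoring three classes, `predict(scores, "weighted", weights=[3, 1, 1, 1])` lets the first stream dominate the vote, while `"average"` and `"product"` treat all streams equally. How the weights should be chosen is left open here.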