
Research On Human Action Recognition Based On Depth Information

Posted on: 2019-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhao
GTID: 2428330548976066
Subject: Signal and Information Processing
Abstract/Summary:
The main task of human action recognition is to use computer vision to analyze video containing pedestrians, identify the different human actions in it, and respond intelligently. The images used for human action recognition are mainly RGB images and depth images. RGB images lack three-dimensional spatial information about the human body and are sensitive to light intensity and background, which greatly limits their effectiveness and range of application. Depth images depend only on the spatial position of objects, so they are affected very little by environmental changes and are better suited to identifying human actions. This thesis studies several problems in human action recognition based on depth images. The main research work is as follows.

(1) To address unstable action details, loss of information in the time dimension, and differences in speed between performances of the same action, this thesis proposes a human action recognition method based on a new projection strategy and energy-homogeneous video segmentation. The method achieves a higher recognition rate and also takes less time. Because the depth motion map (DMM) represents action information unstably, a new projection strategy is proposed in which the position information and the quantity statistics of the original depth image are both reflected in the side projection and the top projection, improving the recognition rate. To reduce the loss of temporal information that occurs when a single DMM is computed for the entire video, and to account for the effect of action speed and amplitude on recognition, a three-level temporal pyramid is constructed and the video is segmented by energy homogenization. Finally, the texture details of the DMMs of the sub-video sequences are described with the local binary pattern (LBP), and actions are classified with a support vector machine (SVM). The recognition rates of this method on the MSRAction3D and MSRGesture3D databases were 94.55% and 95.67%, respectively.

(2) Because the DMM of a long video sequence still loses a large amount of detail, and a more suitable feature representation was needed, a human action recognition method combining multi-scale directed DMMs and Log-Gabor features is proposed. To reduce the loss of detail in the DMM of a long video sequence, a new strategy is used to build a three-level temporal pyramid that can represent more detailed information, yielding a multi-scale DMM. To make the DMM express the direction of the action, a forward DMM and a backward DMM are used to reflect forward and backward motion information, respectively. Finally, the Log-Gabor filter, which represents texture better and is more consistent with the characteristics of human vision, is used to describe the texture details of the DMM, and actions are classified with a collaborative representation classifier (CRC). Compared with Method 1, the recognition rates of this method on the MSRAction3D and MSRGesture3D databases increased by 1.50% and 1.25%, respectively.

(3) The two traditional methods above make feature selection difficult and rely heavily on the application scenario. Drawing on deep learning, a human action recognition method based on an improved DMM and a convolutional neural network is therefore proposed. First, to keep the network from overfitting on the small number of dataset samples, data augmentation is used: camera angle changes are simulated within a certain range of angles, and different speeds of the same action are simulated by changing the time interval between the DMM difference images. Second, because the information of the entire video is compressed into a single DMM image, weight variables are introduced into the series of difference images from which the DMM is computed, so that the time-dimension information of the video is retained. Then, the modified DMMs of the three projection planes are pseudo-colored and fed into three independent VGG-16 networks, whose parameters are fine-tuned. Finally, the high-level features extracted by the VGG-16 networks are merged and fed into an SVM for action recognition. Compared with Method 1, the recognition rates of this method on the MSRAction3D and MSRGesture3D databases increased by 2.30% and 1.83%, respectively.
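As a rough illustration of the ideas in Method 1, the sketch below computes a front-view DMM as the sum of absolute differences between consecutive depth frames, and splits a video so that each segment carries roughly the same motion energy. It is not the thesis's implementation: the function names, the definition of energy as the summed absolute frame difference, and the pyramid levels of 1, 2, and 4 segments are all assumptions made for this example, and the new side and top projection strategy is not reproduced.

```python
import numpy as np


def depth_motion_map(frames):
    """Front-view DMM: sum of absolute differences of consecutive frames.

    frames: array-like of shape (T, H, W) holding depth values.
    """
    frames = np.asarray(frames, dtype=np.float64)
    diffs = np.abs(np.diff(frames, axis=0))  # shape (T-1, H, W)
    return diffs.sum(axis=0)


def energy_homogeneous_bounds(frames, n_segments):
    """Frame indices that split the video into segments of similar energy.

    Energy per step is ASSUMED to be the summed absolute frame difference.
    Returns n_segments + 1 boundary indices, from 0 to T - 1.
    """
    frames = np.asarray(frames, dtype=np.float64)
    energy = np.abs(np.diff(frames, axis=0)).sum(axis=(1, 2))  # (T-1,)
    cumulative = np.concatenate([[0.0], np.cumsum(energy)])    # (T,)
    # Target cumulative energy at each interior cut: 1/n, 2/n, ...
    targets = cumulative[-1] * np.arange(1, n_segments) / n_segments
    cuts = np.searchsorted(cumulative, targets, side="left")
    return [0, *cuts.tolist(), len(frames) - 1]


def pyramid_dmms(frames, levels=(1, 2, 4)):
    """DMMs for every segment of a temporal pyramid.

    levels gives the number of segments per pyramid level; (1, 2, 4) is a
    hypothetical choice for a three-level pyramid.
    """
    frames = np.asarray(frames, dtype=np.float64)
    dmms = []
    for n_segments in levels:
        bounds = energy_homogeneous_bounds(frames, n_segments)
        for start, end in zip(bounds[:-1], bounds[1:]):
            # Adjacent segments share their boundary frame.
            dmms.append(depth_motion_map(frames[start:end + 1]))
    return dmms
```

In a pipeline of the kind described above, each returned DMM would then be passed to a texture descriptor such as LBP and the concatenated features to a classifier. Splitting on cumulative energy rather than on frame count means that a slow and a fast performance of the same action tend to be cut at similar phases of the motion.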
Keywords/Search Tags: human action recognition, depth motion map, Log-Gabor, collaborative representation classifier, VGG-16