Human Action Recognition And Detection Based On Images Mapped From Skeleton Sequence

Posted on: 2020-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: B X Hou
Full Text: PDF
GTID: 2428330572971530
Subject: Control Science and Engineering
Abstract/Summary:
Human action recognition and detection is an important research direction in computer vision, with broad application prospects in intelligent surveillance, video retrieval, human-computer interaction, and robotics. With the development of low-cost depth-sensing cameras (such as Kinect), skeleton-based human action recognition and detection has attracted growing attention from researchers. Compared with color images and depth images, the human skeleton better reflects the movement trajectory of the human body and is less affected by illumination changes and background noise. However, how to represent such high-dimensional time-series information in a form suitable for action recognition and detection algorithms remains an open problem. In addition, most current work on human action targets recognition in pre-segmented video clips, whereas real video sequences tend to be continuous, and some real-time applications also require online detection of human actions.

To address these problems, the temporal information and spatial information of the skeleton sequence are represented as the rows and columns of an image, respectively, generating a skeleton map. For action recognition on segmented video clips, a convolutional neural network learns discriminative features of human motion from the skeleton map to obtain an action classification model. For continuous video sequences, the action classification model is used as a feature extractor, and target actions are detected on the resulting feature map using a temporal proposal method. For real-time action detection, actions are identified by applying the action classification model over a sliding window. The specific research contents and innovations include the following parts.

Firstly, two methods are proposed to encode the spatiotemporal information of the skeleton sequence into a skeleton map. The skeleton map reflects the evolution of human body posture over time, and instances of the same action have similar texture, which lays the foundation for action recognition and detection. The constructed skeleton maps include a skeleton coordinate map and a posture change map based on a pose dictionary. The skeleton coordinate map is robust to differences in body shape and to translation. The posture change map reflects how the similarity between the human posture and the pose dictionary changes over time.

Secondly, an action classification model based on the skeleton map and a convolutional neural network is designed to improve the accuracy of action recognition. To avoid over-fitting of the convolutional neural network, several data augmentation methods for skeleton data are also proposed. Given the strength of convolutional neural networks in image recognition, they are used to extract the co-occurrence features of joints and the time-varying features of poses for action recognition. Tests are carried out on the public action recognition datasets NTU RGB+D and UTKinect-Action. The results show that the proposed method has advantages in recognition accuracy and model size.

Thirdly, to reduce the number of candidate windows and avoid redundant feature extraction, an end-to-end human action detection method based on temporal proposals is proposed, which addresses the low computational efficiency of action detection in continuous video sequences. Building on the object detection algorithm Faster R-CNN, the detection method modifies the sizes of the multi-scale proposal windows according to action duration. The sliding window is adapted to accept continuous action sequences of different durations as input, and the classification and regression networks are trained simultaneously in a multi-task learning manner. Experiments on the PKU-MMD dataset show that the proposed method can effectively handle continuous action detection.

Finally, using the action classifier trained on segmented video clips, online action detection is realized by sliding a window over the input and combining it with discriminating conditions for action occurrence, which satisfies the real-time requirement of an interactive system. A human-computer interaction experiment is performed on a service robot: the system captures and identifies 9 types of actions related to user health in real time and synchronously generates service instructions for the robot according to the action detection result, thereby realizing intelligent robot service.
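
As an illustration of the skeleton coordinate map described in the abstract, the following is a minimal sketch rather than the thesis implementation. It places time along the rows and joints along the columns of an image, as the abstract states. Everything else is an assumption made for the example: the function name `skeleton_to_map`, the use of the three coordinate axes as color channels, subtracting a root joint per frame to handle translation, and min-max scaling over the whole sequence to handle body size.

```python
import numpy as np


def skeleton_to_map(seq, root_joint=0):
    """Encode a skeleton sequence as an image (hypothetical helper).

    seq: array of shape (T, J, 3) holding x, y, z for J joints over T frames.
    Returns a uint8 array of shape (T, J, 3): rows are frames (time),
    columns are joints (space), channels are the coordinate axes (assumed).
    """
    seq = np.asarray(seq, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected an array of shape (T, J, 3)")

    # Assumed translation handling: express every joint relative to a
    # root joint of the same frame.
    centered = seq - seq[:, root_joint:root_joint + 1, :]

    # Assumed body-size handling: min-max scale over the whole sequence
    # so that a uniformly larger skeleton maps to the same pixel values.
    lo, hi = centered.min(), centered.max()
    span = hi - lo
    if span == 0:
        # Degenerate case: all joints coincide with the root in every frame.
        return np.zeros(centered.shape, dtype=np.uint8)

    normalized = (centered - lo) / span
    return np.round(normalized * 255).astype(np.uint8)


# Example input: 20 frames of a 25-joint skeleton with integer-valued coordinates.
rng = np.random.default_rng(0)
example_seq = rng.integers(-50, 50, size=(20, 25, 3)).astype(np.float64)
example_map = skeleton_to_map(example_seq)
```

The resulting array can be resized and passed to any image classification network. With these assumed normalization steps, shifting or uniformly scaling the input skeleton leaves the map unchanged, which is one simple way to obtain the robustness to translation and body shape that the abstract mentions.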
Keywords/Search Tags:human action recognition, skeleton map, human action detection, human-computer interaction, pose dictionary