
Research On Human Action Recognition Based On Graph Convolution Network And Target Detection

Posted on: 2023-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: S G Sun
Full Text: PDF
GTID: 2568306617461494
Subject: Integrated circuit engineering
Abstract/Summary:
With the continuing growth of social informatization, human action recognition has been widely used in intelligent security, human-computer interaction, and sports analysis, where it plays an important role in preventing safety accidents and maintaining social stability. Skeleton-based action recognition using graph convolutional networks has become the mainstream approach in this field, owing to its strong spatio-temporal feature extraction ability and the expressiveness of human skeleton data for describing actions. In practical applications, however, the complexity and variability of human motion mean that action recognition often falls short of the requirements of specific scenes, and accurately distinguishing human action categories remains a great challenge. This thesis addresses several problems in existing human action recognition technology. Building on Spatial Temporal Graph Convolutional Networks (ST-GCN), it improves the network structure and introduces a target detection mechanism. The research is divided into the following three parts:

(1) ST-GCN ignores the correlation between joints that are not physically connected and fails to make full use of high-order skeleton information. To address this, this thesis improves the computation model of the spatial graph convolution layer, expanding the neighborhood of each joint with a polynomial so as to enlarge the receptive field of the graph convolution and aggregate more feature information. In addition, bone information and joint temporal-difference information are derived from the joint coordinates, and bone temporal-difference information is derived in turn. These high-order skeleton features are fed into the network, and action recognition is performed by fusing the multiple streams. Experiments on the KTH and Kinetics datasets show that, compared with ST-GCN, the proposed multi-stream information-enhanced graph convolutional network significantly improves action recognition accuracy.

(2) Action recognition methods based on graph convolutional networks use only human skeleton data and lose the semantic information of objects, so they cannot effectively distinguish actions with similar skeleton postures (such as using a mobile phone and touching one's head). To address this, this thesis introduces a target detection mechanism into the human action recognition task. The YOLOv5 loss function is improved, and the network is used to extract semantic information about objects in images. The results show that the improved YOLOv5 network detects mobile phones well as small targets and meets the real-time and accuracy requirements for acquiring object semantic information in action recognition tasks.

(3) On the basis of the above research, this thesis further proposes a method for recognizing mobile phone use actions based on a graph convolutional network and target detection. First, the OpenPose algorithm extracts the coordinates of the skeleton joints and missing values are filled in, and the multi-stream graph convolutional network identifies the person's initial action category. Second, the improved YOLOv5 network detects mobile phones. Then, the interaction between the person and the mobile phone is judged. Finally, the action recognition result is output by decision fusion. The proposed method is verified on a purpose-built dataset of mobile phone actions. The results show that, compared with an action recognition method based only on a graph convolutional network, the proposed strategy effectively distinguishes actions with similar skeleton postures and greatly improves recognition accuracy for human-object interaction actions such as mobile phone use.
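
The high-order skeleton inputs described in part (1) can be illustrated with a short sketch. This is not the thesis's implementation: the function names, the (frames, joints, coordinates) array layout, the use of powers of (A + I) as the "polynomial" neighborhood expansion, and the softmax-weighted score fusion are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def bone_stream(joints, parents):
    """Bone vectors: each joint minus its parent joint.

    joints  : array of shape (T, V, C) = (frames, joints, coordinates)
    parents : length-V sequence, parents[v] is the parent of joint v
              (the root lists itself, so its bone vector is zero)
    """
    return joints - joints[:, parents, :]


def temporal_difference(x):
    """Frame-to-frame difference x[t+1] - x[t], zero-padded at the last frame."""
    diff = np.zeros_like(x)
    diff[:-1] = x[1:] - x[:-1]
    return diff


def k_hop_adjacency(adj, k):
    """Binary matrix linking joints at most k hops apart (self-loops included).

    Assumption: powers of (A + I) stand in for the polynomial neighborhood
    expansion, whose exact form the abstract does not give.
    """
    v = adj.shape[0]
    reach = np.linalg.matrix_power(adj + np.eye(v, dtype=adj.dtype), k)
    return (reach > 0).astype(np.float32)


def fuse_streams(stream_logits, weights=None):
    """Late fusion (assumed): weighted sum of per-stream softmax scores."""
    logits = np.asarray(stream_logits, dtype=np.float64)  # (streams, classes)
    if weights is None:
        weights = np.ones(len(logits))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    fused = (np.asarray(weights)[:, None] * probs).sum(axis=0)
    return int(fused.argmax()), fused
```

Under these assumptions, the four input streams would be the joint coordinates, `bone_stream(joints, parents)`, `temporal_difference(joints)`, and `temporal_difference(bone_stream(joints, parents))`. Each stream would be classified by its own network before the per-stream scores are combined.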
Keywords/Search Tags: Human action recognition, Pose estimation, Graph convolutional networks, Target detection, Decision fusion