| Action recognition based on human skeleton sequences is an active and practical research problem in applied artificial intelligence. It has applications in video surveillance, motion-sensing games, patient monitoring, unmanned security, human-computer interaction, machine operation, and other fields. With advances in skeleton data acquisition equipment and motion capture sensors, dynamic human skeleton sequences can now be captured effectively, so there is a pressing need for action recognition algorithms that make full use of them. A skeleton sequence captures human pose information comprehensively, but raw sequences often suffer from problems such as skeleton tilt, missing joints, and blank frames, and previous work has not preprocessed raw sequences adequately to address these problems. Both the temporal and the spatial dimensions of a skeleton sequence carry abundant information about the human body. How to design a model that fully exploits these spatio-temporal features to achieve higher recognition accuracy is the core problem studied in this paper. In addition, a given human action is usually performed by only a subset of the body's joints. Designing a joint attention network that generates different response maps for different action sequences is therefore another important way to improve human action recognition. The main work of this paper is as follows. 1. The original skeleton sequence data are corrected, fused, and preprocessed, which solves the problems of tilted camera shooting angles and missing skeleton points and makes the skeleton data better suited to network training, laying a solid foundation for improving the accuracy of skeleton-based action recognition.
2. A new method for encoding the temporal and spatial characteristics of human skeleton sequences is proposed, using an inter-frame vector feature representation and an intra-frame vector feature representation. The residual block of the temporal convolutional network (TCN) is redesigned, and a two-stream temporal convolutional neural network (TS-TCN) that can integrate multiple feature representations is proposed. 3. A joint attention graph convolutional neural network (JA-GCN) is proposed, which generates different graph topologies for different types of human actions in an end-to-end manner. |
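
The inter-frame and intra-frame vector features named in contribution 2 can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so the code below assumes one common reading: inter-frame vectors are the displacements of each joint between consecutive frames (temporal), and intra-frame vectors are the offsets between connected joints within a single frame (spatial). The function names, the `(T, J, 3)` array layout, and the toy `bones` list are all hypothetical and chosen only for illustration.

```python
import numpy as np

def inter_frame_vectors(seq):
    """Assumed temporal feature: displacement of every joint between
    consecutive frames. seq has shape (T, J, 3); result is (T-1, J, 3)."""
    return seq[1:] - seq[:-1]

def intra_frame_vectors(seq, bones):
    """Assumed spatial feature: vector from parent joint to child joint
    inside each frame. bones is a list of (parent, child) index pairs;
    result has shape (T, len(bones), 3)."""
    parents = [p for p, _ in bones]
    children = [c for _, c in bones]
    return seq[:, children] - seq[:, parents]

# Toy skeleton (hypothetical): 4 frames, 3 joints connected as a chain 0-1-2.
T, J = 4, 3
seq = np.arange(T * J * 3, dtype=float).reshape(T, J, 3)
bones = [(0, 1), (1, 2)]

temporal = inter_frame_vectors(seq)
spatial = intra_frame_vectors(seq, bones)
print(temporal.shape, spatial.shape)
```

Under this reading, the two arrays could serve as the inputs to the two streams of a two-stream network; how TS-TCN actually fuses them is not specified in the abstract.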