In recent years, visual technology has developed rapidly, new imaging devices have become widespread, and visual information, as a primary form of information exchange, now pervades daily life. As an important direction in vision research, human action recognition is significant for public safety, motion analysis, human-computer interaction, and future education. Visual data are affected by complex factors such as lighting, image resolution, and shooting angle, and processing massive amounts of data remains a difficult problem in human action recognition research. Advances in technology have made human skeleton data easy to extract, and skeleton data have advantages over RGB video when recognizing actions in complex scenes. Skeleton data are non-Euclidean in structure, and earlier approaches based on convolutional neural networks and recurrent neural networks do not recognize them well. Graph convolutional neural networks have clear advantages in processing non-Euclidean data and have gradually become a widely used network structure in action recognition in recent years.

This thesis studies action recognition based on skeleton data and graph convolution, optimizes and improves the spatial and temporal information extraction modules, and introduces an attention mechanism to select effective feature information. The main work of the thesis is as follows:

1. First, the advantages of skeleton data over other commonly used modalities are compared and analyzed, and the acquisition method and the network input format are explained.
2. Then, to address the problem that traditional algorithms cannot fully learn the latent information in skeleton graph data, an adaptive module is introduced so that the network can learn inter-joint features on its own for each sample, and a local residual module is added to extract further feature information from the skeleton.
3. Finally, to highlight the characteristics of each joint and eliminate the interference of redundant information, an attention mechanism is introduced to assign attention weights and select effective information. Dilated convolution is introduced into the temporal extraction module to fuse temporal information at more scales, which significantly improves recognition performance.
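The abstract gives no equations, so the following is only a minimal sketch of two of the ideas it names: an adaptive graph convolution and a dilated temporal convolution. It follows a common formulation and is not the thesis's actual implementation. All function and parameter names (`adaptive_graph_conv`, `learned_offset`, `multi_scale_temporal`, and so on) are hypothetical. The choice to add a trainable offset to the fixed adjacency and to fuse dilation branches by averaging is an assumption. The attention and local residual modules are omitted. Plain Python lists are used so the example runs without dependencies.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_rows(adj):
    """Row-normalize an adjacency matrix so each joint averages over its neighbours."""
    return [[v / (sum(row) or 1.0) for v in row] for row in adj]


def adaptive_graph_conv(x, adj, learned_offset, weight):
    """One spatial layer: (normalized skeleton graph + learned offset) @ x @ weight.

    x: V x C_in joint features; adj: V x V fixed skeleton adjacency with self-loops;
    learned_offset: V x V trainable matrix (assumed form of the adaptive module);
    weight: C_in x C_out projection.
    """
    norm = normalize_rows(adj)
    combined = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(norm, learned_offset)]
    return matmul(matmul(combined, x), weight)


def dilated_temporal_conv(seq, kernel, dilation):
    """1-D convolution over frames with zero padding; taps are `dilation` frames apart."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - half) * dilation
            if 0 <= idx < len(seq):  # frames outside the clip count as zero
                total += w * seq[idx]
        out.append(total)
    return out


def multi_scale_temporal(seq, kernel, dilations=(1, 2)):
    """Fuse several dilation rates by averaging the branch outputs (assumed fusion)."""
    branches = [dilated_temporal_conv(seq, kernel, d) for d in dilations]
    return [sum(vals) / len(vals) for vals in zip(*branches)]


# Toy example: a 3-joint chain with one feature channel per joint.
adjacency = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
features = [[1.0], [2.0], [3.0]]
zero_offset = [[0.0] * 3 for _ in range(3)]
spatial = adaptive_graph_conv(features, adjacency, zero_offset, [[1.0]])

# One joint's feature over five frames, fused across dilation rates 1 and 2.
temporal = multi_scale_temporal([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 1.0])
```

With a zero offset the spatial layer reduces to neighbourhood averaging on the fixed skeleton. A non-zero `learned_offset` lets the layer link joints that are not physically connected, which is the usual motivation for an adaptive module. In the temporal branch, a larger dilation widens the receptive field over frames without adding kernel weights.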