| Dynamic gesture recognition is one of the main research directions in human-computer interaction. Existing deep learning methods for dynamic gesture recognition suffer from low recognition accuracy and low efficiency, so a new method for continuous dynamic gesture recognition is urgently needed. This thesis therefore studies dynamic gesture recognition based on spatiotemporal convolutional neural networks. First, to address the recognition difficulty caused by continuous transitions between gestures, a gesture video segmentation algorithm based on relative displacement changes of key points (KPRD-VS) is proposed. The algorithm uses BlazePalm to compute the coordinates of the hand key points and segments the gesture video according to their relative displacement. It achieves an accuracy of 96% on the segmentation of continuous dynamic gesture videos. Second, because current dynamic gesture recognition models are difficult to deploy in real production environments, a lightweight spatiotemporal convolutional neural network model (3D CNNs-TCN) that combines a 3D CNN with a TCN is proposed. The 3D CNN extracts spatiotemporal features and the TCN processes temporal information, so that deep spatiotemporal features are extracted from dynamic gesture videos. The model reaches an accuracy of 98.89% on the self-built dynamic gesture dataset Mouse Hands. Finally, a human-computer interaction system is designed and implemented on the basis of the KPRD-VS segmentation algorithm and the 3D CNNs-TCN recognition model. The system drives mouse operations directly with dynamic gestures. Its gesture semantic recognition performance was tested under complex backgrounds and under bright and dark lighting. The average recognition accuracy reached 92% across the tested scenes and lighting conditions, which can meet practical work requirements. |
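
The abstract does not give the details of KPRD-VS, so the following is only a minimal sketch of how segmentation by relative key-point displacement might look. It assumes that each frame is a list of (x, y) key-point coordinates, that key point 0 is the wrist, and that displacement is measured relative to that point. The function names, the threshold, and the minimum segment length are hypothetical and are not taken from the thesis.

```python
from math import hypot


def relative_displacement(prev, curr):
    """Mean movement of the key points between two frames,
    measured relative to key point 0 (assumed to be the wrist)."""
    px0, py0 = prev[0]
    cx0, cy0 = curr[0]
    total = 0.0
    for (px, py), (cx, cy) in zip(prev[1:], curr[1:]):
        # Compare wrist-relative positions so that only changes in
        # hand shape count, not translation of the whole hand.
        total += hypot((cx - cx0) - (px - px0), (cy - cy0) - (py - py0))
    return total / (len(curr) - 1)


def segment_gestures(frames, threshold=0.05, min_len=3):
    """Return (start, end) frame index pairs, end exclusive, covering
    runs where the relative displacement stays above the threshold.
    Both threshold and min_len are illustrative values."""
    segments = []
    start = None
    for i in range(1, len(frames)):
        moving = relative_displacement(frames[i - 1], frames[i]) > threshold
        if moving and start is None:
            start = i - 1  # the motion began at the previous frame
        elif not moving and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a segment that is still open at the end of the video.
    if start is not None and len(frames) - start >= min_len:
        segments.append((start, len(frames)))
    return segments
```

In a full pipeline the per-frame coordinates would come from a hand key-point detector, and each returned segment would be passed to the recognition model as one gesture clip.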