| Gesture, as a visual language, is widely used in human-computer interaction and computer vision because of its simplicity, intuitiveness, and expressive richness. Neural network methods for gesture recognition are currently receiving increasing attention. However, existing gesture datasets have relatively simple image backgrounds and small variations in gesture angle, while existing networks are computationally expensive, have many parameters, and are slow at real-time recognition, so a lightweight network model is required. This thesis therefore develops a gesture dataset with complex scenes and, based on it, explores fast and accurate methods for gesture recognition. The main work is as follows: (1) A static gesture dataset for complex scenes was created using the OpenCV computer vision library. Existing gesture datasets suffer from uniform backgrounds, few lighting changes, and insufficient feature richness, so a gesture dataset for complex scenes is needed to bring the data closer to real-life scenarios. This thesis uses a series of image preprocessing and data augmentation functions from OpenCV to increase feature credibility and improve the quality of the dataset. The resulting Hand gesture dataset has diverse backgrounds, different lighting conditions, and multiple angles, and its rich features closely reflect the conditions of gesture recognition in real life. (2) A gesture recognition method based on an attention mechanism and a lightweight network is proposed. This work addresses the high time consumption, large number of model parameters, and slow real-time recognition speed of existing network algorithms. It improves the lightweight LeNet5 base network structure and adds Squeeze-and-Excitation (SENet) modules to improve feature selectivity and strengthen effective features. To reduce the redundancy and complexity of the network structure and the number of feature channels entering the fully connected layer, a 1×1 convolution replaces the SENet module on the last convolutional layer, which accelerates recognition. The experimental results show that the improved LeNet5 SENet (LeSN) network model achieves high recognition accuracy, but its real-time recognition speed still needs to be improved. (3) This thesis replaces the ordinary convolution of LeNet5 with the depthwise separable convolution of the lightweight network MobileNet as the feature extraction network, further improving the network structure. The improved LeNet5 MobileNet (LeMo) model improves both computational speed and recognition accuracy. Because the ReLU and Swish functions are not smooth, which may lead to vanishing and exploding gradients, the Mish activation function is used to speed up training and improve the generalization ability of the model. Experimental results show that the improved network can complete real-time gesture recognition tasks quickly while maintaining high recognition accuracy. In summary, this thesis creates a gesture dataset for complex scenes, explores fast and accurate gesture recognition methods based on it, and improves two network models. Compared with other network models, the MobileNet-based LeMo network model in this thesis shows only a modest improvement in recognition accuracy, but its recognition speed is faster, which makes it better suited to real-time gesture recognition tasks, and its lower parameter count makes it easier to port to mobile terminals and embedded devices. |
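
The parameter savings from depthwise separable convolution and the form of the Mish activation mentioned in the abstract can be illustrated with a minimal Python sketch. The layer size used here (a 5×5 kernel with 6 input and 16 output channels, biases ignored) is an illustrative assumption and not a configuration taken from the thesis, and the function names are hypothetical.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^-|x|)
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed, illustrative layer size: 5x5 kernel, 6 -> 16 channels.
    k, c_in, c_out = 5, 6, 16
    print("standard conv weights:      ", standard_conv_params(k, c_in, c_out))
    print("depthwise separable weights:", depthwise_separable_params(k, c_in, c_out))
    print("mish(1.0) =", round(mish(1.0), 4))
```

Under these assumed sizes the depthwise separable layer needs 246 weights against 2,400 for the standard layer, roughly a tenfold reduction, which is the kind of saving that motivates such a replacement in a lightweight network.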