Research On The Deep Learning Based Gesture Recognition, Hand Detection And Model Compression

Posted on: 2021-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: H F Lin
Full Text: PDF
GTID: 2428330611966523
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, human-computer interaction technology has developed rapidly. Among these technologies, gesture-based interaction is the most natural and intuitive. Gesture recognition and human hand detection are essential for this type of interaction and have long been research hotspots in computer vision. Although they have been studied extensively, they remain challenging under complex backgrounds and occlusion in real-world scenarios. Deep learning is an effective approach to tasks such as gesture recognition and human hand detection, offering better generalization and robustness than traditional algorithms. However, deep neural network models suffer from high computational complexity and large memory consumption, which hinders their deployment. To address these problems, this thesis focuses on deep learning based gesture recognition, human hand detection, and model compression and acceleration algorithms.

We present a two-stage method for static gesture recognition under complex backgrounds, consisting of a hand pose estimator and a hand pose classifier. A convolutional pose machine, a type of neural network, locates the hand keypoints. Its multi-stage sequential structure and intermediate supervision enable accurate predictions even in challenging situations. Fuzzy Gaussian mixture models then classify gestures into their corresponding categories based on the estimated hand keypoints; they also perform well in rejecting non-target gestures. In addition, thanks to the two-stage design of the gesture recognition system, the set of gesture categories can be extended easily.

Before gesture recognition, it is usually necessary to locate the hand region in the given image. To balance speed and accuracy, we employ the one-stage object detection model YOLOv3 for human hand detection. With better pre-trained weights and data augmentation methods, the performance of the hand detection model improves significantly, providing a strong baseline for model compression and acceleration.

To further reduce the computing resource consumption and memory footprint of the hand detection model, we apply a channel pruning method named network slimming to compress and accelerate the YOLOv3 model. In the training phase, sparsity regularization is applied to the γ (scaling) parameters of the batch normalization layers to identify unimportant channels automatically. We make some adaptations to handle the skip connections in YOLOv3, which improves the flexibility of channel pruning. In addition, by using an adaptive pruning threshold and absorbing the β (shift) parameters of the batch normalization layers into subsequent layers, the performance loss caused by pruning is greatly reduced. After the batch normalization layers are merged with their preceding convolutions, inference speed improves further. As a result, the memory footprint and computing resource consumption of the model are largely reduced without notable performance loss.
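The channel pruning steps described in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the thesis's implementation: all function names are hypothetical, the threshold shown is a plain global percentile over all scaling factors (the thesis's adaptive threshold is not specified in the abstract), and each convolution kernel is assumed to be flattened into one weight list per output channel.

```python
import math

def l1_sparsity_subgradient(gammas, lam):
    """Subgradient of lam * sum(|gamma|); added to each gamma's gradient in training."""
    return [lam * ((g > 0) - (g < 0)) for g in gammas]

def global_threshold(gammas_per_layer, prune_ratio):
    """Magnitude below which channels are pruned, computed over all layers together.
    Assumption: a simple global percentile, not the thesis's adaptive rule."""
    mags = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    k = int(len(mags) * prune_ratio)
    return mags[min(k, len(mags) - 1)]

def keep_mask(gammas, threshold):
    """True for channels to keep. The strongest channel is always kept so that
    a layer is never pruned away entirely (an assumption of this sketch)."""
    mask = [abs(g) >= threshold for g in gammas]
    if not any(mask):
        best = max(range(len(gammas)), key=lambda i: abs(gammas[i]))
        mask[best] = True
    return mask

def fold_bn(weights, biases, gammas, betas, means, variances, eps=1e-5):
    """Merge batch normalization into the preceding convolution.
    BN(w.x + b) = gamma * (w.x + b - mean) / sqrt(var + eps) + beta,
    which equals w'.x + b' with the per-channel values computed below."""
    new_w, new_b = [], []
    for w, b, g, be, m, v in zip(weights, biases, gammas, betas, means, variances):
        scale = g / math.sqrt(v + eps)
        new_w.append([wi * scale for wi in w])
        new_b.append((b - m) * scale + be)
    return new_w, new_b

# Hypothetical scaling factors for two batch normalization layers.
layers = [[0.9, 0.01, 0.5], [0.02, 0.7, 0.003]]
threshold = global_threshold(layers, prune_ratio=0.5)
masks = [keep_mask(gammas, threshold) for gammas in layers]
```

Folding changes only the stored weights and biases, so the merged layer produces the same outputs as the convolution followed by batch normalization at inference time while skipping the separate normalization step.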
Keywords/Search Tags: Deep learning, Gesture recognition, Fuzzy Gaussian mixture model, Hand detection, Model compression and acceleration