In recent years, thanks to advances in computing hardware, deep learning networks have developed rapidly and been widely applied in many fields. In human-vehicle interaction, gesture recognition can effectively reduce driver distraction while offering efficient expressive power; it is a new interactive language that balances driving safety and convenience. Image processing based on deep learning has the advantages of low preprocessing requirements, strong versatility, and good real-time performance and accuracy, so this work adopts deep learning methods to detect and recognize in-vehicle gestures. At the same time, because current on-board computers have limited computing power, efficient data processing at low cost is an important consideration in the design of a human-vehicle gesture interaction system.

This work takes images captured by a monocular RGB camera as the input to the recognition system. Pictures of nine gestures were collected in different scenes and labelled with the Labelme tool, yielding a gesture data set of 16,990 images. A complete set of evaluation metrics was developed to measure the accuracy and real-time performance of the gesture detection algorithm.

The YOLOv3 algorithm was selected as the base algorithm for human-vehicle gesture recognition. After building the YOLOv3 network framework and establishing its loss function, preliminary detection on the self-collected data set reached an average precision of 83.59% at a detection speed of about 42 frames per second. This work then improves YOLOv3 in three respects:
1. The original YOLOv3 anchor boxes are obtained by clustering on the ImageNet data set, whose width and height statistics differ considerably from those of gesture images. The K-means algorithm is therefore used to regenerate nine anchor boxes as candidate boxes for gesture detection.
2. Depthwise separable convolutions are introduced into YOLOv3, and the IBL-YOLOv3 algorithm is proposed to lighten the model structure and enrich feature extraction.
3. To test the stability of the detector, it is run on a mixed image collection gathered by a web crawler; positive samples misclassified as negative are added back to the training set as hard examples.
The improved algorithm achieves an average precision of 87.70% and a detection speed of 45 frames per second.

Building on the completed gesture detection algorithm, this work uses the ROS platform to construct a human-vehicle gesture interaction system framework. The framework establishes a strict synchronization mechanism among the camera node, the recognition node, and the execution node in the node manager, which solves the problem of inconsistent execution speed between image capture and image recognition and ensures stable system operation. The modular design decouples specific executive functions from the recognition system: to add a new function, users only need to analyze and process the RSW sequence, which improves the system's capacity for secondary development.
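The anchor-box regeneration step described above can be sketched with a small K-means routine that, following common YOLO practice, uses 1 − IoU between (width, height) pairs as the distance metric. This is a minimal illustration: the box data below is synthetic, and the function names are illustrative rather than taken from the thesis.

```python
import numpy as np

def iou_wh(boxes, centroids):
    """IoU between boxes and centroids given only (width, height),
    treating all boxes as if they shared the same top-left corner."""
    w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = w * h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
          + (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (w, h) pairs with 1 - IoU as the distance metric."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # Minimum (1 - IoU) distance is the same as maximum IoU.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else centroids[i] for i in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Return anchors sorted by area, as YOLO config files expect.
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]

# Synthetic (w, h) pairs standing in for the labelled gesture boxes.
rng = np.random.default_rng(1)
boxes = rng.uniform(20, 300, size=(500, 2))
anchors = kmeans_anchors(boxes, k=9)
print(anchors.shape)  # (9, 2): nine anchor (width, height) pairs
```

In practice the input would be the width/height pairs of all labelled gesture bounding boxes, scaled to the network input resolution.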
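The parameter saving that motivates replacing standard convolutions with depthwise separable ones can be shown with a quick count: a standard k×k convolution needs k·k·C_in parameters per output channel, while a depthwise separable convolution splits this into a k×k depthwise pass plus a 1×1 pointwise pass. The layer sizes below are hypothetical, not taken from IBL-YOLOv3.

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def sep_conv_params(k, c_in, c_out):
    # Depthwise (one k x k filter per input channel) + pointwise (1 x 1).
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 256, 512          # hypothetical mid-network layer
std = conv_params(k, c_in, c_out)     # 1,179,648 parameters
sep = sep_conv_params(k, c_in, c_out) # 133,376 parameters
print(std, sep, round(std / sep, 1))  # roughly an 8.8x reduction
```

For 3×3 kernels the reduction approaches a factor of 9 as the channel counts grow, which is why the substitution noticeably lightens a detector like YOLOv3.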