In recent years, gesture recognition-based human-computer interaction (HCI) technology has gained popularity for its simplicity and convenience, and has found applications in fields such as smart cockpits and smart homes. Gesture recognition systems must deliver both high accuracy and real-time performance. To meet these requirements in the car-cockpit setting, this paper focuses on network lightweighting and adopts deep learning algorithms for gesture recognition. The specific research work is as follows:

(1) The NUS hand posture dataset is inadequate for the vehicle environment because of its small size and cluttered backgrounds. To overcome this limitation, the original dataset is expanded with augmentation techniques such as mirror flipping and rotation. In addition, a gesture dataset suited to in-vehicle scenes is established by capturing gesture images with a mobile phone camera and merging them with the original dataset.

(2) YOLOv3, a widely used object detection framework, suffers from slow detection speed and a large feature extraction network. To address these issues, this paper proposes the YOLOv3-MobileNetV3 algorithm, which replaces the feature extraction network of YOLOv3 with the lightweight MobileNetV3 backbone and thereby significantly reduces the number of network parameters. The backbone incorporates depthwise separable convolutions and SE (squeeze-and-excitation) modules. The paper also analyzes the anchor box mechanism, applies k-means clustering to the static gesture dataset, and resets the prior (anchor) box sizes accordingly. Furthermore, the CIoU loss is introduced to optimize the model parameters. The improved algorithm achieves a precision of 94.3% and a recall of 94.4%.

(3) For dynamic gestures, which are sequential in nature, a Mediapipe-3DCNN-LSTM network model is designed. The model uses Mediapipe for key frame extraction, applies 3D convolutions to capture spatial information from gesture image sequences, and incorporates a
double-layer LSTM structure to strengthen the network's ability to process temporal information. This design effectively exploits the relationships within gesture image sequences and achieves accurate, efficient dynamic gesture recognition, with an average recognition rate of 96.4%.

(4) Finally, the paper presents the design of the gesture recognition system's interface, which enables centralized control of the entire system. Owing to resource limitations, interaction between gestures and PowerPoint (PPT) documents is used for verification. The results demonstrate that the system accurately recognizes commands and effectively facilitates human-computer interaction.
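The anchor re-clustering described in (2) can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it clusters (width, height) pairs from annotated gesture boxes using 1 − IoU as the distance, the metric commonly used for YOLO anchor selection; the box data and cluster count below are placeholders.

```python
import random

def iou_wh(box, cluster):
    """IoU between two (w, h) pairs, both anchored at the origin."""
    inter = min(box[0], cluster[0]) * min(box[1], cluster[1])
    union = box[0] * box[1] + cluster[0] * cluster[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) box sizes with 1 - IoU as the distance measure."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)  # random initial cluster centers
    for _ in range(iters):
        # assign every box to the center it overlaps most
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            groups[best].append(b)
        # recompute each center as the mean (w, h) of its group
        new_centers = []
        for i, g in enumerate(groups):
            if g:
                new_centers.append((sum(b[0] for b in g) / len(g),
                                    sum(b[1] for b in g) / len(g)))
            else:
                new_centers.append(centers[i])  # keep empty clusters in place
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return sorted(centers)

# placeholder box sizes; in practice these come from the gesture dataset labels
boxes = [(10, 10), (12, 11), (9, 10), (100, 100), (110, 95), (105, 102)]
anchors = kmeans_anchors(boxes, k=2)
```

The resulting cluster centers would then replace YOLOv3's default anchor sizes, so the priors match the scale of hands in the in-vehicle dataset.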
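The CIoU term mentioned in (2) augments plain IoU with a center-distance penalty and an aspect-ratio consistency term. The sketch below follows the standard CIoU definition (the abstract does not give the paper's exact formulation); boxes are assumed non-degenerate `(x1, y1, x2, y2)` with `x2 > x1`, `y2 > y1`.

```python
import math

def ciou(box1, box2):
    """Complete-IoU between two (x1, y1, x2, y2) boxes."""
    # plain IoU
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    a1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    a2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    iou = inter / (a1 + a2 - inter)
    # squared distance between box centers
    cx1, cy1 = (box1[0] + box1[2]) / 2, (box1[1] + box1[3]) / 2
    cx2, cy2 = (box2[0] + box2[2]) / 2, (box2[1] + box2[3]) / 2
    rho2 = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
    # squared diagonal of the smallest enclosing box
    ex1, ey1 = min(box1[0], box2[0]), min(box1[1], box2[1])
    ex2, ey2 = max(box1[2], box2[2]), max(box1[3], box2[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (
        math.atan((box1[2] - box1[0]) / (box1[3] - box1[1]))
        - math.atan((box2[2] - box2[0]) / (box2[3] - box2[1]))) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)  # trade-off weight
    return iou - rho2 / c2 - alpha * v
```

The bounding-box regression loss is then typically taken as 1 − CIoU, so identical boxes incur zero loss while distant or badly shaped predictions are penalized beyond plain IoU.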
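For the dynamic-gesture pipeline in (3), clips of varying length must be reduced to a fixed number of frames before entering the 3D-CNN. The sketch below uses simple uniform temporal sampling as a stand-in for that step; the paper's actual key-frame extraction relies on Mediapipe hand landmarks, which the abstract does not detail.

```python
def sample_key_frames(frames, n):
    """Pick n evenly spaced frames from a variable-length sequence.

    Uniform sampling is a simplification: a landmark-based selector
    would instead keep the frames where the hand pose changes most.
    """
    if len(frames) <= n:
        return list(frames)  # short clip: keep everything (pad elsewhere)
    if n == 1:
        return [frames[0]]
    step = (len(frames) - 1) / (n - 1)  # spacing between kept frames
    return [frames[round(i * step)] for i in range(n)]
```

For example, reducing a 10-frame clip to 4 frames keeps frames 0, 3, 6, and 9, preserving the start, end, and overall motion of the gesture.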