Due to auditory impairment and vocal cord damage, individuals who are deaf and mute are unable to communicate orally, making sign language the primary mode of communication within the hearing-impaired population. At present, sign language is not widely known in society, and people with hearing loss often have difficulty acquiring literacy skills, which results in significant communication barriers between the general populace and the deaf and mute community. Investigating sign language recognition technology therefore holds great social significance: it not only provides convenience for the hearing-impaired but also facilitates their integration into public service scenarios and daily life, and it promotes the development of a barrier-free information society and the welfare industry. Computer vision-based sign language recognition falls into two categories: recognition of static sign language images and dynamic recognition of sign language videos. This article proposes improvements that overcome the limitations of previous sign language recognition algorithms and builds sign language recognition applications on top of the more accurate models. The research outcomes can be divided into the following three parts.

(1) Static sign language recognition. In response to the accuracy demands of current sign language application scenarios, this paper proposes SA-YOLOv5, a static alphabet sign language recognition method based on the improved version 6.0 of the YOLOv5 object detection algorithm. First, the YOLOv5s6 baseline network is optimized by embedding the SimAM attention mechanism at the end of the Backbone, which improves feature extraction capability without introducing additional parameters, as confirmed by experiments. Second, to fully exploit target features at different scales, the Adaptive Spatial Feature Fusion (ASFF) module is used to strengthen feature fusion in the network. Compared with the YOLOv5s6 baseline, the improved SA-YOLOv5 sign language recognition model achieves an overall increase of 3.7 percentage points in mean average precision (mAP) and shows promising results.
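For reference, the following is a minimal PyTorch sketch of the parameter-free SimAM attention mechanism named above, following its published energy-based formulation; the `e_lambda` constant and the exact insertion point in the Backbone are assumptions rather than this paper's stated configuration.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention (Yang et al., 2021)."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda  # numerical-stability constant (assumed default)
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); n = number of other spatial positions per channel
        _, _, h, w = x.shape
        n = h * w - 1
        # squared deviation of each position from its channel mean
        d = (x - x.mean(dim=[2, 3], keepdim=True)).pow(2)
        # channel-wise variance estimate over spatial positions
        v = d.sum(dim=[2, 3], keepdim=True) / n
        # inverse energy: more distinctive neurons get larger attention weights
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5
        return x * self.act(e_inv)
```

Because the weighting is computed directly from the feature statistics, the module adds no learnable parameters, which matches the claim above.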
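Similarly, a simplified sketch of the adaptive spatial feature fusion idea: per-pixel softmax weights decide how much each pyramid level contributes at every location. The real ASFF module also rescales the three levels to a common resolution and channel count first; here the inputs are assumed to match already.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleASFF(nn.Module):
    """Per-pixel softmax fusion of three same-shape feature maps."""
    def __init__(self, channels: int):
        super().__init__()
        # one 1x1 conv per level produces a single-channel weight logit map
        self.weight_convs = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in range(3)
        )

    def forward(self, f0: torch.Tensor, f1: torch.Tensor, f2: torch.Tensor):
        feats = [f0, f1, f2]  # each (B, C, H, W), already rescaled to match
        logits = torch.cat(
            [conv(f) for conv, f in zip(self.weight_convs, feats)], dim=1
        )
        weights = F.softmax(logits, dim=1)  # (B, 3, H, W), sums to 1 per pixel
        return sum(weights[:, i:i + 1] * feats[i] for i in range(3))
```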
(2) Dynamic sign language isolated-word recognition. To address interference from redundant information in complex-background sign language video streams and the high model complexity of previous research, this paper proposes a network architecture called M-LRCN. First, a MediaPipe-based human pose preprocessing method is applied to the sign language videos, and the continuous preprocessed frames serve as inputs to the LRCN sign language recognition model. Second, to address the large parameter count and limited accuracy of the LRCN network, the AlexNet used for spatial feature extraction in LRCN is replaced with the lightweight MobileNetV3 network, further reducing the model parameter size (both steps are sketched after these summaries). On the top 100 isolated-word classes of the CSL dataset, the method reaches an accuracy of 91.75%, an improvement of 11.95% over the AlexNet-LSTM model, while shrinking the model size by approximately 88.5%.

(3) Based on the improved algorithms above, this paper uses frameworks such as PyQt, Android, and TensorFlow Lite to implement static and dynamic sign language recognition applications on desktop and mobile platforms, respectively, and verifies and adjusts the interactive interface of each client in actual use. Through these improvements, sign language recognition systems become more convenient and faster to use, better meeting the needs of users in different scenarios.
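A minimal sketch of the MediaPipe-based preprocessing step from part (2), assuming the MediaPipe Holistic solution is used to render body and hand landmarks onto a blank canvas so that the complex background is suppressed; the paper's exact preprocessing pipeline may differ.

```python
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils

def skeleton_frames(video_path: str):
    """Yield one background-free skeleton frame per video frame."""
    cap = cv2.VideoCapture(video_path)
    with mp_holistic.Holistic(min_detection_confidence=0.5,
                              min_tracking_confidence=0.5) as holistic:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            canvas = np.zeros_like(frame)  # black canvas suppresses the background
            for landmarks, connections in (
                (results.pose_landmarks, mp_holistic.POSE_CONNECTIONS),
                (results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS),
                (results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS),
            ):
                if landmarks is not None:
                    mp_drawing.draw_landmarks(canvas, landmarks, connections)
            yield canvas
    cap.release()
```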
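A sketch of the M-LRCN idea from part (2): MobileNetV3-Small replaces AlexNet as the per-frame spatial feature extractor, and an LSTM aggregates the frame features over time. The hidden size and classification head are illustrative assumptions, not the paper's exact configuration (requires torchvision >= 0.13 for the `weights` argument).

```python
import torch
import torch.nn as nn
from torchvision import models

class MLRCN(nn.Module):
    """LRCN-style model: per-frame MobileNetV3-Small features + LSTM."""
    def __init__(self, num_classes: int = 100, hidden_size: int = 256):
        super().__init__()
        backbone = models.mobilenet_v3_small(weights=None)
        feat_dim = backbone.classifier[0].in_features  # 576 for v3-small
        backbone.classifier = nn.Identity()  # keep only the pooled features
        self.cnn = backbone
        self.lstm = nn.LSTM(feat_dim, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (B, T, 3, H, W) -> fold time into the batch for the CNN
        b, t, c, h, w = clips.shape
        feats = self.cnn(clips.view(b * t, c, h, w)).view(b, t, -1)
        out, _ = self.lstm(feats)     # (B, T, hidden_size)
        return self.head(out[:, -1])  # classify from the last time step

# e.g. MLRCN()(torch.randn(2, 16, 3, 224, 224)) -> logits of shape (2, 100)
```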
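For the mobile deployment in part (3), on-device inference with a converted model follows the standard TensorFlow Lite interpreter API, as in this sketch; the model file name and the stand-in input are placeholders.

```python
import numpy as np
import tensorflow as tf

# "sign_model.tflite" is a placeholder path for the converted model
interpreter = tf.lite.Interpreter(model_path="sign_model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# stand-in input matching the model's expected shape and dtype
x = np.random.rand(*inp["shape"]).astype(inp["dtype"])
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])
print("predicted class:", int(np.argmax(probs)))
```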