
Research On Dynamic Chinese Sign Language Recognition Based On Deep Learning

Posted on: 2024-04-09  Degree: Master  Type: Thesis
Country: China  Candidate: Y L Huang  Full Text: PDF
GTID: 2568307064472044  Subject: Mechanics (Professional Degree)
Abstract/Summary:
Sign language is a unique and meaningful language created by human beings. It is one of the main means of communication for deaf people and plays a large role in their daily lives. With the advent of the era of intelligent systems, researchers have proposed many sign language recognition methods to promote smooth communication between deaf and hearing people. Sign language recognition is a difficult research problem that spans computer vision, pattern recognition, natural language processing and other disciplines. In general, researchers use various algorithms to process sign language data and convert it into text or speech, so as to enable barrier-free communication between deaf and hearing people and to help deaf people better integrate into society and enjoy life. With the rapid development of deep learning and the integration of multiple research fields, sign language recognition has made remarkable progress.

At present, the main research directions of sign language recognition are dynamic isolated sign language recognition and dynamic continuous sign language recognition, the former being the basis for the latter. Dynamic isolated sign language recognition often uses convolutional temporal networks, but these struggle to learn the deep features and correlation features of sign language images, which leads to low recognition accuracy. Dynamic continuous sign language recognition often uses a conventional encoder-decoder network, but this approach struggles to decode the sign language video and align it with isolated words, which leads to a high word error rate. To solve these problems, this thesis studies dynamic isolated and dynamic continuous sign language separately, proposes sign language recognition methods based on a fused attention mechanism network and a three-dimensional convolutional time-series network, and verifies them on a Chinese sign language data set. The main work and contributions are as follows:

(1) To address the problems in dynamic isolated sign language recognition, such as insufficient representation of sign language data, low weight given to the key regions of a sign language action, and weak correlation between the earlier and later parts of an action, a dynamic isolated sign language recognition network model based on a fused attention mechanism is proposed. The model first pre-trains an autoencoder network on the sign language data to strengthen the data representation. Then, drawing on the characteristics of human visual attention, spatial and channel attention networks are embedded to strengthen the learning of key sign language regions. Finally, a bidirectional long short-term memory (BiLSTM) network is used to strengthen the correlation between the earlier and later parts of the sign language action. In comparative experiments, the model reaches a test accuracy of 89.90%, which is better than the other dynamic isolated sign language recognition methods compared and verifies the effectiveness of the method.

(2) To address the problems in dynamic continuous sign language recognition, such as poor adaptive learning ability, difficulty in extracting spatio-temporal features, and poor alignment decoding, a dynamic continuous sign language recognition network model based on a three-dimensional convolutional time-series network is proposed. First, the model uses a three-dimensional residual convolutional network to extract local image features and short-term spatio-temporal features from sign language video clips, thereby reducing manual intervention. Then, a BiLSTM network is used for temporal memory and for feature encoding and decoding, thereby strengthening spatio-temporal feature extraction. Finally, the connectionist temporal classification (CTC) algorithm is used for temporal modeling to improve sequence alignment and decoding, yielding an end-to-end dynamic continuous sign language recognition method. In comparative experiments, the error rate of this model is 23.83%, which is better than the other dynamic continuous sign language recognition methods compared and verifies the effectiveness of the method.
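The alignment decoding and error-rate evaluation mentioned in contribution (2) can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes greedy (best-path) CTC decoding and assumes that the reported error rate is a word error rate computed as word-level edit distance divided by reference length. The names `ctc_greedy_decode`, `word_error_rate` and `BLANK`, and the choice of index 0 for the blank label, are hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: take the top label in each frame,
    merge consecutive repeats, then drop blank labels."""
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded: List[int] = []
    previous = None
    for label in best_path:
        # Emit a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def word_error_rate(reference: Sequence[str],
                    hypothesis: Sequence[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)
```

For example, per-frame scores whose best path is 1, 1, blank, 1, 2, 2 decode to `[1, 1, 2]`: the blank separates the two runs of label 1, so the repeated word survives. A four-word reference recognized with one word missing gives `word_error_rate` of 0.25.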
Keywords/Search Tags: sign language recognition, deep learning, convolutional neural network, attention mechanism, time series modeling