Research On The Method Of Sign Language Recognition Based On 3D CNN And Attention Mechanism

Posted on: 2021-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
GTID: 2428330611988316
Subject: Software engineering
Abstract/Summary:
Video-level sign language recognition has attracted wide attention as one of the active problems in computer vision and pattern recognition. It is a challenging subject that involves many research fields, including computer vision, pattern recognition, video acquisition and processing, and natural language processing. By processing sign language video and translating it into text or speech, sign language recognition promotes communication between deaf and hearing people, which is of great significance for the harmonious development of society. However, recognition accuracy still needs to be improved because sign language movements are flexible, rich in detail, and strongly time-dependent. Through in-depth analysis of sign language behavior, this paper proposes a sign language recognition method based on a 3D convolutional network and an attention mechanism, and verifies and evaluates it on a sign language dataset. The main research contents and contributions are as follows:

(1) To address the temporal requirements of sign language recognition and the difficulty of distinguishing features, a method for recognizing isolated sign language words based on a 3D residual convolutional neural network is proposed. The strong self-learning ability of the 3D convolutional network avoids hand-crafted features and enables adaptive learning. The original RGB video stream is used as input and segmented with a sliding window; the three-dimensional convolutional network then captures spatial and temporal features simultaneously, and the result is classified. The effectiveness of this method was verified.

(2) In view of the complexity of details and the uncertainty of movement changes in sign language recognition, this paper focuses on the design of hand features, following the characteristics of human visual attention, and proposes a local sign language recognition algorithm based on an RCNN target detection network. The algorithm uses the target detection network to detect and locate the local hand regions, and then models the resulting hand sequence over time with a three-dimensional convolutional network. Experiments showed that the model can compensate for the details missed by general sign language recognition and improve the recognition results, especially for complex and changeable gestures.

(3) Building on the two points above, this paper proposes AM-ResC3D, a weakly supervised, feature-encoding, global-local sign language recognition method that extracts and classifies spatio-temporal sign language features. It introduces an attention model into the three-dimensional residual network, models the whole sign language video as a time series, and focuses on the key periods of the video sequence. An end-to-end attention algorithm aggregates the different temporal features to obtain better sign language features and, finally, an accurate prediction of sign language behavior. Experiments showed that this method can effectively combine temporal information from different levels and improves recognition accuracy and generalization performance.
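
As a rough illustration of two steps named in the abstract, sliding-window segmentation of the video stream and attention-based aggregation of temporal features, the following minimal NumPy sketch may help. It is not the thesis implementation: the function names, the clip length of 16 frames, the stride of 8, the scoring vector `w`, and the per-clip mean used as a placeholder for the 3D residual network are all assumptions made for this example.

```python
import numpy as np


def sliding_window_clips(video, clip_len=16, stride=8):
    """Split a (T, H, W, C) video into overlapping clips of clip_len frames.

    clip_len and stride are illustrative values, not taken from the thesis.
    """
    num_frames = video.shape[0]
    if num_frames < clip_len:
        raise ValueError("video is shorter than one clip")
    starts = range(0, num_frames - clip_len + 1, stride)
    return np.stack([video[s:s + clip_len] for s in starts])


def attention_pool(clip_features, w):
    """Aggregate per-clip features (N, D) into one (D,) vector.

    Each clip gets a scalar score (dot product with the hypothetical
    scoring vector w); a softmax turns the scores into weights, and the
    output is the weighted sum of the clip features.
    """
    scores = clip_features @ w
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ clip_features, weights


rng = np.random.default_rng(0)
video = rng.random((40, 8, 8, 3))  # toy RGB video: 40 frames of 8x8 pixels

clips = sliding_window_clips(video)  # shape (4, 16, 8, 8, 3)

# Placeholder for the 3D CNN: average each clip over time and space to get
# one small feature vector per clip. A real model would learn these features.
features = clips.mean(axis=(1, 2, 3))  # shape (4, 3)

w = rng.standard_normal(features.shape[1])  # would be learned in practice
pooled, weights = attention_pool(features, w)
```

In a trained model the scoring vector would be learned end to end together with the feature extractor, so that clips covering the key periods of a sign receive larger weights.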
Keywords/Search Tags:Sign language recognition, 3D CNN, Attention mechanism, Target detection, Time-series modeling