
Research On Continuous Sign Language Translation Based On Temporal Neural Networks

Posted on: 2020-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: A Y Li
Full Text: PDF
GTID: 2428330575996930
Subject: Computer technology
Abstract/Summary:
With the rapid development of big data technology and the continuous improvement of computer hardware, artificial intelligence and machine learning have advanced quickly. In computer vision, machine learning algorithms based on neural networks and deep learning can understand multimedia information such as images and videos by simulating the human brain, and in recent years neural network models, with their strong fitting and regression abilities, have made breakthrough progress in video translation. Continuous sign language translation is an important branch of computer vision with significant practical value: sign language allows deaf-mute people to communicate conveniently in daily life, so progress in continuous sign language video translation enables free communication between hearing-impaired and hearing people. As a form of human-machine interaction, sign language translation converts continuous sign language actions into corresponding text sequences via machine learning algorithms.

Continuous sign language translation is a generalized sequence-to-sequence problem. Its difficulty lies in recognizing the visual information in video, which requires considering not only the image frame at the current moment but also the complex dynamic relationships between successive frames. This thesis uses recurrent neural networks for temporal modeling, employing both an encoder-decoder structure and a CTC (Connectionist Temporal Classification) real-time translation structure. In the encoder-decoder model, a temporal pooling operation is proposed and embedded in the hierarchical encoder of the translation system; it effectively alleviates the information redundancy of continuous video data and significantly improves translation efficiency and quality.

To address the vanishing gradient problem that the encoder-decoder model suffers on long sign language videos, the thesis then proposes an end-to-end two-stream parallel learning model based on the CTC optimization method. In the parallel two-stream network, the CNN (Convolutional Neural Network) module focuses on local perception of two-dimensional images, while the RNN (Recurrent Neural Network) module focuses on temporal modeling of continuous actions, capturing the inherent relationships of the global sequence along the time dimension. Finally, the score matrices output by the two modules are fused, and the translated sentence sequences are produced end-to-end in combination with the CTC optimization method. This approach effectively accounts for both long-term and short-term spatiotemporal visual information in the video and achieves good results on long continuous sign language video translation tasks.
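As a minimal sketch of the temporal pooling idea described above (an illustrative reconstruction, not the thesis's actual implementation; the function name, window size, and feature dimensions are assumptions), adjacent per-frame feature vectors can be averaged along the time axis, halving the sequence length that the encoder must process and reducing redundancy between near-identical frames:

```python
import numpy as np

def temporal_pool(frames: np.ndarray, window: int = 2) -> np.ndarray:
    """Average-pool a (T, D) sequence of frame features along time.

    Hypothetical sketch: each group of `window` consecutive frames is
    merged into one feature vector by averaging, shortening the
    sequence from T to T // window steps.
    """
    T, D = frames.shape
    T_trim = (T // window) * window  # drop trailing frames that don't fill a window
    return frames[:T_trim].reshape(-1, window, D).mean(axis=1)

# 16 frames of 512-dim features pooled down to 8 encoder time steps
features = np.random.randn(16, 512)
pooled = temporal_pool(features)
print(pooled.shape)  # (8, 512)
```

In a hierarchical encoder, stacking such a pooling step between recurrent layers compounds the reduction, which is one plausible way to obtain the efficiency gains the abstract reports.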
Keywords/Search Tags:sign language translation, sequential learning, temporal convolution, connectionist temporal classification, two-stream neural network