
Application Of Encoder-Decoder Network Based On Tensor Decomposition In Sign Language Recognition

Posted on: 2022-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: B Xu
Full Text: PDF
GTID: 2518306323979729
Subject: Information and Communication Engineering

Abstract/Summary:
Sign language is a means by which the hearing impaired communicate using gestures instead of spoken language. It is the main way for the hearing impaired to communicate with one another and with people with normal hearing, and it conveys the meaning of a sign-language word mainly through information such as hand shape, orientation, position and movement trajectory. If sign language video can be converted into text or sound through sign language recognition technology, it offers the hearing impaired an effective channel for communicating with society, strengthens the exchange between deaf people and people with normal hearing, helps them out of their predicament and creates a normal working environment for them, which has important humanistic significance. With the rapid development of computer technology, interaction between computers and people has become increasingly important, and sign language recognition is one of the important forms of Human Computer Interaction (HCI); compared with traditional modes of human-computer interaction, it is more natural, rich and concise. In recent years, encoder-decoder networks have achieved good recognition accuracy on sign language recognition tasks, but their large number of parameters and slow training hinder deployment on embedded or mobile devices. Therefore, using Chinese sign language samples captured with Kinect, this thesis applies tensor train decomposition and block term decomposition to compress a sign language recognition model based on an encoder-decoder network while preserving accuracy. The main findings of this thesis are:

1. Tensor train decomposition is introduced to compress the encoder-decoder network, rewriting the fully connected layer and the two LSTM layers in tensor-train form (a minimal illustrative sketch of a tensor-train fully connected layer follows this abstract). A two-layer fully connected network is built on the MNIST data set to explore how the parameters of the tensor train decomposition affect performance: the best results are obtained when the input modes are arranged from large to small and the output modes from small to large, and compared with the original model the accuracy improves markedly while the number of parameters drops sharply. On the sign language model, performance is best when the fully connected layer and the first LSTM layer are replaced by their tensor-train forms: the number of parameters is reduced by 49.5% while the accuracy remains 95.6%.

2. Block term decomposition captures local connections better than tensor train decomposition and avoids the difficulty of choosing the tensor-train rank, so this study replaces the tensor train decomposition with block term decomposition. For block term decomposition, the best ordering of the modes is the opposite of that for tensor train decomposition. Again, performance is best when the fully connected layer and the first LSTM layer are replaced by their block-term forms: the number of parameters is reduced by 51.5% and the accuracy is 94.7%.
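To make the parameter-saving idea concrete, the following minimal NumPy sketch shows how the weight matrix of a fully connected layer can be stored as small tensor-train cores instead of a dense matrix. It is only an illustration under assumed settings: the mode sizes, TT-ranks and function names are chosen for the example and are not the configuration or code used in the thesis, and a practical implementation would contract the cores with the input directly rather than rebuilding the full matrix.

import numpy as np

# TT-factorized fully connected layer: the (in_dim x out_dim) weight matrix is
# never stored densely; it is represented by d small TT-cores.
# in_dim = prod(in_modes), out_dim = prod(out_modes).
# core[k] has shape (rank[k], in_modes[k], out_modes[k], rank[k+1]),
# with rank[0] = rank[d] = 1.  (All names and sizes here are illustrative.)

def init_tt_cores(in_modes, out_modes, ranks, rng):
    d = len(in_modes)
    assert len(out_modes) == d and len(ranks) == d + 1
    assert ranks[0] == 1 and ranks[-1] == 1
    return [rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1])) * 0.1
            for k in range(d)]

def tt_to_matrix(cores, in_modes, out_modes):
    """Contract the TT-cores back into the full (in_dim, out_dim) weight matrix."""
    d = len(cores)
    full = cores[0][0]                          # axes: (in_1, out_1, r_1)
    for k in range(1, d):
        full = np.tensordot(full, cores[k], axes=([-1], [0]))
        # axes now: (in_1, out_1, ..., in_k, out_k, r_k)
    full = full[..., 0]                         # drop the trailing rank-1 axis
    in_axes = list(range(0, 2 * d, 2))          # gather input modes first,
    out_axes = list(range(1, 2 * d, 2))         # then output modes
    full = np.transpose(full, in_axes + out_axes)
    return full.reshape(int(np.prod(in_modes)), int(np.prod(out_modes)))

def tt_linear(x, cores, in_modes, out_modes):
    """y = x @ W, with W materialised from the TT-cores (simple reference version)."""
    return x @ tt_to_matrix(cores, in_modes, out_modes)

rng = np.random.default_rng(0)
in_modes, out_modes = [4, 8, 8, 4], [4, 8, 8, 4]   # a 1024 -> 1024 layer
ranks = [1, 4, 4, 4, 1]                            # assumed TT-ranks
cores = init_tt_cores(in_modes, out_modes, ranks, rng)

full_params = int(np.prod(in_modes)) * int(np.prod(out_modes))  # 1,048,576
tt_params = sum(c.size for c in cores)                          # 2,176
print(full_params, tt_params)

x = rng.standard_normal((2, int(np.prod(in_modes))))
print(tt_linear(x, cores, in_modes, out_modes).shape)           # (2, 1024)

The same factorization idea is what the thesis applies to the input-to-hidden weight matrices of the LSTM layers; only the shapes being factorized change.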
Keywords/Search Tags: Sign Language Recognition, Long Short-Term Memory, Encoder-Decoder Network, Tensor Train Decomposition, Block Term Decomposition