
Research On Deep Learning Algorithm For Sequence Data

Posted on: 2020-04-06    Degree: Doctor    Type: Dissertation
Country: China    Candidate: D J Kong    Full Text: PDF
GTID: 1368330572996598    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, large volumes of electronic information have accumulated. Sequence data, which carries strong relations among its inner data points, is one important kind. How to mine the patterns and relations in sequence data, and further use them to predict the future, is one of the most important research directions for promoting the development of Artificial Intelligence and Social Intelligence. However, successful sequence modeling must address three major challenges: sequence sparsity, the difficulty of mining long-term dependencies, and information transition and inference over long sequences. To handle these challenges, we study the sequence modeling problem and present a series of deep sequence models that address the aforementioned challenges, with particular attention to the neglect of sequence intervals and the lack of information transition and inference. The motivation of our study is to use auxiliary information to improve and reform deep sequence models, so as to promote the models' prediction ability and apply them in real application scenarios. Specifically, the main contributions of this thesis are as follows:

To deal with sequence sparsity, we propose a composite long short-term memory network (LSTM). First, we use LSTM networks to effectively encode two kinds of sequences, the (user, query) sequence and the query word sequence, representing users' query intentions in a continuous vector space, and decode each as a distribution over ads. Then, we combine these two LSTM networks in an appropriate way to build a more robust model, referred to as the composite LSTM model (cLSTM), for ad recommendation.

We propose a spatial-temporal long short-term memory network (ST-LSTM) to deal with the neglect of sequence intervals. We emphasize the importance of sequence intervals (i.e., spatial intervals and temporal intervals) in sequence modeling, and propose to discretize the spatial and temporal intervals, embed them into vectors, and combine these vectors with the gate mechanism of the LSTM to
better mine the long-term dependencies in sequence data. Furthermore, we employ a hierarchical extension of the proposed ST-LSTM in an encoder-decoder manner, which models contextual historical visit information in order to boost prediction performance.

We propose a sequence generative adversarial network with spatial-temporal embedding. Following the idea of generative adversarial networks (GANs) for seq2seq learning, the proposed ST-GAN model takes the proposed ST-LSTM as the generator and a proposed spatial-temporal convolutional neural network (ST-CNN), which uses spatial-temporal information to boost its discriminating ability, as the discriminator. The minimax game of ST-GAN produces data realistic enough to train a better prediction model.

To deal with the long-term information retrieval and inference problem in sequence models, we propose a multi-turn attentional memory network. Many deep models benefit from attention mechanisms to find more and better supporting facts. In visual dialog, most methods rely on a one-turn attention network to obtain facts with respect to a question. However, the information transition phenomenon that exists among these facts prevents such methods from retrieving all relevant information. We therefore propose a multi-turn attentional memory network for visual dialog. First, we propose an attentional memory network that maintains image regions and historical dialog in two memory banks, and attends the question to be answered to both the visual and textual banks to obtain multimodal facts. Further, considering the information transition phenomenon, we design a multi-turn attention architecture that attends to the memory banks over multiple turns to retrieve more facts and thus produce a better answer.

Besides, we evaluate the proposed methods on real-world sequence data generated from different applications; the results demonstrate the effectiveness of the proposed methods in real-world application scenarios.
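The interval-aware gating at the heart of ST-LSTM can be illustrated with a toy sketch. Everything below is an assumption for illustration: the class name STLSTMCell, the scalar (one-dimensional) weights, and the fixed bucket edges are hypothetical, and the real model learns vector embeddings and full weight matrices through training. Only the idea follows the text: discretize each spatial and temporal interval into a bucket, look up a per-gate embedding for that bucket, and add it to the gate's pre-activation.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bucketize(interval, edges):
    """Discretize a continuous interval into a bucket index."""
    for i, edge in enumerate(edges):
        if interval < edge:
            return i
    return len(edges)

class STLSTMCell:
    """Toy 1-D ST-LSTM cell: each gate's pre-activation gets an extra
    learned scalar, chosen by the discretized spatial/temporal interval.
    Gates: i = input, f = forget, o = output, c = candidate."""
    def __init__(self, n_buckets):
        self.w = {g: random.uniform(-0.5, 0.5) for g in "ifoc"}  # input weights
        self.u = {g: random.uniform(-0.5, 0.5) for g in "ifoc"}  # recurrent weights
        # one scalar embedding per bucket per gate (spatial and temporal)
        self.es = {g: [random.uniform(-0.1, 0.1) for _ in range(n_buckets + 1)]
                   for g in "ifoc"}
        self.et = {g: [random.uniform(-0.1, 0.1) for _ in range(n_buckets + 1)]
                   for g in "ifoc"}

    def step(self, x, h, c, sb, tb):
        # pre-activations include the spatial (sb) and temporal (tb) embeddings
        pre = {g: self.w[g] * x + self.u[g] * h + self.es[g][sb] + self.et[g][tb]
               for g in "ifoc"}
        i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
        c_new = f * c + i * math.tanh(pre["c"])
        return o * math.tanh(c_new), c_new

edges = [1.0, 10.0, 100.0]  # toy bucket boundaries for both interval types
cell = STLSTMCell(n_buckets=len(edges))
h, c = 0.0, 0.0
# sequence of (feature, spatial interval, temporal interval)
for x, ds, dt in [(0.3, 0.5, 30.0), (0.8, 12.0, 5.0), (0.1, 150.0, 400.0)]:
    h, c = cell.step(x, h, c, bucketize(ds, edges), bucketize(dt, edges))
print(round(h, 4))
```

Because the embeddings enter every gate, a long interval can, after training, suppress the forget gate and so weaken the influence of a distant, stale observation — the mechanism the abstract credits for better long-term dependency mining.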
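The multi-turn attentional read over two memory banks can likewise be sketched in a few lines. This is a toy illustration, not the thesis's implementation: attend, multi_turn_read, and the tiny hand-made banks are hypothetical, and the real model uses learned projections over image regions and dialog history plus a neural answer decoder. The sketch only shows why multiple turns help: facts retrieved in one turn are folded back into the query, so facts linked to them become reachable in the next turn.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, bank):
    """Single attention read: weight memories by similarity to the query."""
    weights = softmax([dot(query, mem) for mem in bank])
    dim = len(query)
    return [sum(w * mem[d] for w, mem in zip(weights, bank)) for d in range(dim)]

def multi_turn_read(question, visual_bank, text_bank, turns=2):
    """Each turn attends the current query to both banks and folds the
    retrieved multimodal facts back into the query."""
    q = list(question)
    for _ in range(turns):
        v = attend(q, visual_bank)   # facts from image regions
        t = attend(q, text_bank)     # facts from dialog history
        q = [qi + vi + ti for qi, vi, ti in zip(q, v, t)]
    return q

visual = [[1.0, 0.0], [0.0, 1.0]]   # toy "image region" memories
history = [[0.5, 0.5], [1.0, 1.0]]  # toy "dialog history" memories
answer_query = multi_turn_read([1.0, 0.0], visual, history, turns=2)
print([round(x, 3) for x in answer_query])
```

With turns=1 the read stops at facts directly similar to the question; a second turn re-attends with the enriched query, modeling the information transition phenomenon described above.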
Keywords/Search Tags: deep learning, sequence modeling, prediction and recommendation, recurrent neural network, long short-term memory, convolutional neural network, memory network, generative adversarial networks, embedding, attention mechanism, cross-modal attention