Intention Identification Based On Microblog

Posted on: 2018-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: C X Li
Full Text: PDF
GTID: 2348330518966600
Subject: Computer application technology
Abstract/Summary:
Microblogging is a new kind of social platform on which hundreds of millions of users publish massive amounts of data every day. Some of these microblogs carry an intention, expressed either explicitly or implicitly, and identifying those intentions accurately has great commercial value. This thesis focuses on identifying intentions in microblogs and studies the following three aspects.

(1) Recognition of explicit intentions in microblogs. Microblogs with explicit intentions usually contain intention trigger words such as "want" and "hope". We propose a new explicit intention recognition model based on Wikipedia. For each intention, we first select a few concepts that represent it, the seed concepts, and submit them to Wikipedia as queries. By following the link relationships between concepts, we obtain the concepts associated with the seeds; these expanded concepts also carry the same intention to some extent. We then use the resulting concept set to construct a Wikipedia link graph for the intention, and a random walk algorithm assigns each concept an intention score. Finally, we map a microblog into the corresponding intention space to obtain its intention score, which we use to judge whether the microblog has that intention. If a word in a microblog is not covered by Wikipedia, we use explicit semantic analysis (ESA) to map it to the most relevant Wikipedia concept and then take that concept's intention score.

(2) Identification of implicit intentions in microblogs. Microblogs with implicit intentions usually contain no intention trigger words, but the intention can be inferred by reasoning. Most existing work targets explicit intentions; this thesis also addresses implicit ones. We use an encoder-decoder model to "translate" a microblog with an implicit intention into the corresponding explicit expression and then identify the explicit intention. The encoder-decoder model is mainly used for sequence-to-sequence (seq2seq) problems such as machine translation, speech recognition, and image description. Converting an implicit intention into an explicit expression is also a seq2seq problem, so the encoder-decoder model applies. The main idea of the traditional RNN encoder-decoder model is to encode the input sentence into a fixed-length semantic vector and then decode that vector to generate the output sentence. The attention model later proposed by Bahdanau et al. improves on this by encoding the input sentence into a semantic representation whose length is not fixed, which keeps the translation robust even when the sentence is long. In our experiments we compare the two models, and the results show that the attention model outperforms the RNN encoder-decoder model. To train the model, we construct a corpus that pairs microblogs with implicit intentions with the corresponding explicit-intention microblogs. Once the explicit expression of an intention is obtained, the Wikipedia-based explicit intention recognition model proposed in this thesis is used to identify it.

(3) Identification of microblog intentions regardless of how they are expressed. We propose an intention recognition model based on word embeddings and a convolutional neural network. The model is versatile: it can identify both explicit and implicit intentions. This versatility comes from two sources. On the one hand, word embeddings carry rich semantic features. On the other hand, the convolutional neural network can extract the semantic features of a sentence. We therefore treat intention identification as a multi-class classification problem: whether or not a microblog has a certain intention, and however that intention is expressed, the model can classify it. The word embeddings and the convolutional neural network together extract the semantic features needed for correct intention recognition.
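The concept-scoring step in aspect (1) can be illustrated with a minimal sketch. The abstract does not say which random walk variant is used, so this sketch assumes a random walk with restart at the seed concepts. The link graph, the concept names, and the helpers `intention_scores` and `microblog_score` are hypothetical and are not taken from the thesis. The ESA fallback is not implemented, so words outside the graph simply contribute nothing.

```python
def intention_scores(links, seeds, restart=0.15, iterations=50):
    """Random walk with restart over a directed concept link graph.

    links maps a concept to the concepts it links to; seeds are the
    concepts the walker jumps back to. Both are illustrative inputs.
    """
    nodes = set(links) | {t for ts in links.values() for t in ts} | set(seeds)
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(seed_mass)
    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * seed_mass[n] for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Otherwise it follows one outgoing link uniformly at random.
                share = (1.0 - restart) * scores[node] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # A concept without outgoing links returns its mass to the seeds.
                for seed in seeds:
                    nxt[seed] += (1.0 - restart) * scores[node] / len(seeds)
        scores = nxt
    return scores


def microblog_score(tokens, scores):
    """Average concept score over the tokens; unknown tokens count as zero."""
    if not tokens:
        return 0.0
    return sum(scores.get(token, 0.0) for token in tokens) / len(tokens)


# Toy link graph (made up for illustration, not real Wikipedia data).
links = {
    "Travel": ["Hotel", "Airline"],
    "Hotel": ["Travel"],
    "Airline": ["Travel", "Airport"],
    "Airport": ["Airline"],
    "Cooking": ["Recipe"],
    "Recipe": ["Cooking"],
}
scores = intention_scores(links, seeds=["Travel"])
travel_post = microblog_score(["Hotel", "Airport"], scores)
cooking_post = microblog_score(["Recipe", "Cooking"], scores)
```

In this toy graph the concepts reachable from the seed receive positive scores, while the unrelated "Cooking" and "Recipe" concepts stay at zero. A microblog would then be judged to carry the intention when its score exceeds some threshold, which would have to be tuned on labeled data.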
Keywords/Search Tags:Explicit Intention, Implicit Intention, Encoder-Decoder Model, Word Embedding, Convolution Neural Network
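The attention mechanism of Bahdanau et al. mentioned in the abstract can be sketched as follows. At each decoding step, every encoder state is scored against the current decoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of the encoder states. The weight matrices `W` and `U`, the vector `v`, and all numeric values below are hypothetical toy values; in a trained model they are learned, and the thesis's actual configuration is not given in the abstract.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def additive_attention(decoder_state, encoder_states, W, U, v):
    """Additive attention: energy_j = v . tanh(W s + U h_j)."""
    projected_state = matvec(W, decoder_state)
    energies = []
    for h in encoder_states:
        projected_h = matvec(U, h)
        energies.append(
            sum(vi * math.tanh(a + b)
                for vi, a, b in zip(v, projected_state, projected_h))
        )
    # Numerically stable softmax over the energies.
    peak = max(energies)
    exps = [math.exp(e - peak) for e in energies]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy values: identity projections and three 2-dimensional encoder states.
identity = [[1.0, 0.0], [0.0, 1.0]]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = additive_attention(
    decoder_state=[1.0, 0.0],
    encoder_states=encoder_states,
    W=identity,
    U=identity,
    v=[1.0, 1.0],
)
```

Because a fresh context vector is computed at every decoding step, the decoder is not limited to a single fixed-length summary of the input, which is the property the abstract credits for robustness on long sentences.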