
Research On Short-Text Aspect Extraction Based On Topic Model And Attention Mechanism

Posted on: 2020-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: K Ye
Full Text: PDF
GTID: 2428330575454956
Subject: Computer Science and Technology
Abstract/Summary:
Sentiment analysis has long been one of the most active and most difficult topics in text analysis. It is widely used in recommendation algorithms and public opinion monitoring. Because aspect extraction is a key subtask of sentiment analysis, improving its performance has a direct impact on the final performance of sentiment analysis. Traditional aspect extraction focuses mainly on long texts such as newspapers, magazines, and papers. With the growth of platforms such as WeChat, Twitter, and Taobao, the volume of short-text data has grown explosively. Because short-text data is sparse and imbalanced, traditional extraction models are no longer applicable, so aspect extraction methods designed for short text are urgently needed. This paper focuses on short-text aspect extraction based on unsupervised learning. Its main contributions are as follows:

1) To address the limitations of traditional aspect extraction algorithms on short-text data, this paper proposes BiDTM-AE, an improved aspect extraction algorithm based on BTM. The traditional BTM aspect extraction algorithm treats all words equally when generating word pairs, ignoring the influence of low-frequency and non-aspect words as well as the correlation between word pairs. Two improvements are proposed: first, a word-pair discriminant model is introduced to weaken the influence of low-frequency and non-aspect words on the model; second, a bidirectional recurrent neural network is trained on word pairs in advance, so that the correlations between them are modeled as prior knowledge. Experiments on two standard data sets verify that the word-pair discriminant model and the bidirectional recurrent neural network significantly improve model performance.

2) To address the sparsity of short-text data and the lack of context information, this paper proposes TEAM-AE, an improved aspect extraction algorithm. TEAM-AE introduces a word co-occurrence network to enrich the context information of words, and adds latent topic information to words through joint training of word embeddings and topic embeddings, which alleviates the problem of polysemy in word embeddings. This joint training allows the same word to fall into different aspects under different topics. Experiments show that TEAM-AE performs significantly better than traditional aspect extraction algorithms.

3) To improve the quality of the extracted aspect words, this paper introduces an attention mechanism into the aspect extraction model. By increasing the weight of aspect words, the attention mechanism reduces the influence of non-aspect words during word clustering, which has a significant effect on the final performance of the model.
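
The word-pair generation step that BTM relies on can be illustrated with a minimal Python sketch. BTM treats every unordered pair of words that co-occur in a short text as a "biterm". The function names, the `min_count` parameter, and the toy corpus below are assumptions made for illustration. In particular, the simple frequency threshold is only a hypothetical stand-in for the word-pair discriminant model, whose details the abstract does not give.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short text."""
    # Sort each pair so that (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def filtered_biterms(docs, min_count=2):
    """Count biterms over a corpus, skipping rare words.

    Words seen fewer than min_count times in the corpus are dropped before
    pairing. This threshold is a hypothetical stand-in for a learned
    word-pair discriminant model, not the method used in the thesis.
    """
    word_freq = Counter(tok for doc in docs for tok in doc)
    counts = Counter()
    for doc in docs:
        kept = [tok for tok in doc if word_freq[tok] >= min_count]
        counts.update(extract_biterms(kept))
    return counts


# Toy corpus of tokenized short reviews (illustrative data only).
docs = [
    ["battery", "life", "great"],
    ["battery", "drains", "fast"],
    ["screen", "great", "battery"],
]
biterm_counts = filtered_biterms(docs, min_count=2)
print(biterm_counts)
```

On this toy corpus only `battery` and `great` occur at least twice, so the single surviving biterm is (`battery`, `great`) with a count of 2. Without the filter, every text would contribute three biterms, most of them involving words that appear only once.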
Keywords/Search Tags: Aspect Extraction, Topic Model, Word Embedding, Attention Mechanism