
Sentiment Classification And Topic Sentiment Evolution Analysis Based On LDA

Posted on: 2018-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hu
Full Text: PDF
GTID: 2348330536973564
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and big data, Internet media produce ever more information, including comments, user information, and news. This mass of information holds high value: by mining and analyzing it, we can better understand consumers' needs and users' psychology, and analyze the development of hot events and trends in public opinion, providing a basis for decision-making by businesses and governments. Most of this data takes the form of text, which usually carries both objective facts and the users' subjective sentiment. Mining text sentiment information has therefore become a research hotspot in recent years. In particular, text sentiment classification, which automatically classifies the subjective sentiment expressed in a text, helps in analyzing users' psychology. Traditional classification methods have shortcomings in text feature extraction: for example, they do not consider the relationships between texts or the dimensionality of the text features. Topic-sentiment mining of text is another research hotspot. Topic models have proved to be an effective text mining method, and traditional topic models such as PLSA and LDA are commonly used. A traditional topic model can model the latent topics of a text, but it cannot be adapted to different kinds of textual information or research content, and this poses challenges in some aspects of text mining.

In view of these shortcomings of traditional text sentiment classification and topic-sentiment mining, much recent research has sought to improve on them. Some work improves the classifier used for text sentiment classification, and other work extends the LDA topic model to mine the relationship between topic and sentiment. Building on existing research, this thesis makes two main contributions: (1) we improve the text feature extraction method on Chinese and English comment data sets, mainly by combining the LDA topic model with an SVM classifier for text sentiment classification; (2) on a Sina news data set, we use attributes of the news items, such as time and sentiment annotation, to extend the LDA topic model so that it mines the relationship between topic and sentiment and analyzes the evolution trend of topics.

For text sentiment classification, this thesis presents a new text feature extraction model, ELDA (External Knowledge-based Latent Dirichlet Allocation), which is a weakly supervised topic model. To use this model, appropriate external knowledge must be found according to the contents of the experimental data set. We first use the LDA topic model to extract the topics of the external knowledge, and then use the ELDA model to extract the topics of the experimental data set together with the external knowledge, building on the topics of the external knowledge. The topic features serve as the features of the text, and the choice of external knowledge can increase the weight of sentiment features to a certain extent. By varying the number of topics and the SVM classifier, we analyze the appropriate feature dimension for sentiment classification and find the best text sentiment classification model. Experimental results show that this classification method achieves better results on both the Chinese and the English comment data sets, with accuracy 4% higher than the traditional text sentiment classification method.

For topic-sentiment mining and topic evolution analysis, this thesis proposes a new topic model, JTSoT (Joint Topic-Sentiment over Time). For topic-sentiment mining, the model introduces a sentiment layer between the topic layer and the word layer of the traditional LDA model, mainly to avoid the influence that sentiment factors have on topic division in the traditional JST (Joint Sentiment Topic) model, and introduces a Dirichlet distribution between topic and sentiment. For topic evolution analysis, the model treats the time information of a text, taken from its existing time tag, as an attribute of the topic and introduces a Beta distribution between topic and time to analyze the evolution of the topic. The final experimental results show that, compared with the existing TOT (Topic over Time) and eToT (emotion Topic over Time) models, the JTSoT model can directly reflect the relationship between topic and sentiment and the evolution trend of topics. The JTSoT model also achieves better results in model perplexity and text sentiment classification.
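
The general idea of using topic proportions as text features for an SVM can be sketched as follows. This is a minimal illustration using plain LDA from scikit-learn, not the ELDA model itself, whose details the abstract does not give. The toy reviews, the labels, the helper name `build_topic_svm`, and the choice of `LinearSVC` are all assumptions made for the example.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy reviews (not from the thesis); 1 = positive, 0 = negative.
reviews = [
    "great phone with a bright screen and long battery life",
    "terrible battery and the screen cracked within a week",
    "excellent camera, fast delivery, very happy",
    "awful service, slow delivery, very disappointed",
    "love the design and the sound quality is great",
    "poor sound quality and the design feels cheap",
    "happy with the price, works great",
    "disappointed with the price, stopped working",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]


def build_topic_svm(n_topics):
    """Bag-of-words counts -> LDA topic proportions -> linear SVM."""
    return make_pipeline(
        CountVectorizer(),
        LatentDirichletAllocation(n_components=n_topics, random_state=0),
        LinearSVC(),
    )


n_topics = 4  # the feature dimension; one of the settings one would vary
model = build_topic_svm(n_topics)
model.fit(reviews, labels)

# Each document is now represented by its n_topics topic proportions.
topic_features = model[:-1].transform(reviews)
predictions = model.predict(reviews)
```

Rebuilding the pipeline for several values of `n_topics` and comparing held-out accuracy is one way to look for a suitable feature dimension. The toy data above is far too small for its accuracy to mean anything.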
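
To illustrate how a Beta distribution can tie a topic to time, the sketch below fits a Beta distribution to the normalized timestamps of the documents assigned to a topic by matching moments. The abstract does not state which estimator JTSoT uses, so moment matching is an assumption here, and the timestamps and function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_beta_moments(timestamps):
    """Fit Beta(a, b) to timestamps in (0, 1) by matching mean and variance."""
    m = fmean(timestamps)
    v = pvariance(timestamps)
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


def beta_pdf(t, a, b):
    """Density of Beta(a, b) at t in (0, 1), computed in log space."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))


# Hypothetical timestamps, rescaled so the corpus spans the interval (0, 1).
early_topic = [0.05, 0.10, 0.12, 0.20, 0.25]
late_topic = [0.70, 0.80, 0.85, 0.90, 0.95]

a_early, b_early = fit_beta_moments(early_topic)
a_late, b_late = fit_beta_moments(late_topic)
```

Evaluating `beta_pdf` over a grid of time points for each topic gives a curve of topic intensity over time. In this toy data the first topic peaks early in the corpus and the second peaks late.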
Keywords/Search Tags:topic model, sentiment classification, SVM, topic-sentiment, topic evolution