Research on Sentiment Analysis Based on Topic Models

Posted on: 2014-03-23  Degree: Master  Type: Thesis
Country: China  Candidate: C R Wu  Full Text: PDF
GTID: 2308330461472588  Subject: Computer system architecture
Abstract/Summary:
Sentiment analysis is a research topic in natural language processing that focuses on analyzing subjective information in text. A large body of work, both in China and internationally, has enabled many kinds of applications. However, existing approaches mainly detect the overall sentiment or subjectivity of a text and pay little attention to topical information, which makes their results less informative to users. Topic models, on the other hand, can discover the hidden thematic structure in large document archives, but most topic-model research concentrates on detecting topics and rarely considers the mixture of topics and sentiment. This thesis therefore addresses the problem of modeling topics and sentiment jointly by applying probabilistic topic models to sentiment analysis. The main contributions are as follows:

(1) Previous work mainly splits the mining task into two separate steps: analyzing the topic-sentiment mixture, and then extracting topic life cycles and sentiment dynamics. To address this, the thesis proposes a probabilistic topic model called Topic-Sentiment Temporal Evolution (TSTE). The model uses a sentiment lexicon as prior knowledge and considers topic, sentiment, and temporal evolution simultaneously, so that multiple topics and the temporal evolution of their sentiment are extracted together. Experiments on Chinese weblog datasets show that the approach can effectively extract topic facets and their sentiment temporal evolution at the same time.

(2) The thesis proposes a probabilistic topic model, called the mixing topics and sentence subjectivity identification model, for detecting subjective sentences in a text collection. The model builds on an existing subjectivity detection model, which did not consider the effect of multiple topics in the collection. The proposed model treats subjective sentence identification as a weakly supervised generative modeling problem: it needs only a small, domain-independent subjectivity lexicon, which is used to modify the Dirichlet prior of the subjectivity-topic-word distribution. The model was evaluated on the Multi-Perspective Question Answering (MPQA) dataset and produced the expected results. Incorporating topics into sentence subjectivity identification proves effective: it markedly improves recall and F-value, and the extracted subjectivity topics are coherent and semantically informative.

(3) The models from (1) and (2) are combined into a system framework for sentiment analysis based on topic models. The framework assigns different datasets to different computers for processing and then summarizes the results. This reduces the load on any single computer and improves the reliability, availability, and scalability of the system.
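As an illustration of the idea in (2), the following is a minimal sketch of how a small subjectivity lexicon could be used to modify a Dirichlet prior over a subjectivity-topic-word distribution. It is not the thesis's actual formulation, which the abstract does not give. The function name `build_lexicon_prior`, the two labels, the symmetric base value `base_beta`, and the scaling factor `match_weight` are all assumptions made for the example; the scaling scheme follows a pattern commonly seen in lexicon-guided topic models.

```python
from typing import Dict, List, Set

# Hypothetical label set: the abstract only says sentences are judged
# subjective or not, so two labels are assumed here.
LABELS = ("subjective", "objective")


def build_lexicon_prior(
    vocab: List[str],
    subjective_lexicon: Set[str],
    num_topics: int,
    base_beta: float = 0.01,
    match_weight: float = 0.9,
) -> Dict[str, List[List[float]]]:
    """Return prior[label][topic][word_index] for a label-topic-word Dirichlet.

    Words outside the lexicon keep the symmetric value base_beta under
    both labels. Lexicon words are scaled by match_weight under the
    "subjective" label and by (1 - match_weight) under "objective",
    which biases them toward subjective topics without forbidding
    the alternative (an assumed scheme, not the thesis's).
    """
    if not 0.5 <= match_weight <= 1.0:
        raise ValueError("match_weight must lie in [0.5, 1.0]")
    if num_topics < 1:
        raise ValueError("num_topics must be at least 1")

    prior: Dict[str, List[List[float]]] = {}
    for label in LABELS:
        row = []
        for word in vocab:
            if word not in subjective_lexicon:
                scale = 1.0
            elif label == "subjective":
                scale = match_weight
            else:
                scale = 1.0 - match_weight
            row.append(base_beta * scale)
        # Every topic under a label starts from the same prior row;
        # each topic gets its own copy so it can be edited independently.
        prior[label] = [list(row) for _ in range(num_topics)]
    return prior


# Toy usage: two lexicon words and one neutral word.
vocab = ["terrible", "report", "wonderful"]
lexicon = {"terrible", "wonderful"}
prior = build_lexicon_prior(vocab, lexicon, num_topics=2)
for label in LABELS:
    print(label, prior[label][0])
```

The lexicon enters only through the prior, so no sentence-level labels are needed; this is the sense in which such a model is weakly supervised. A sampler would then use `prior[label][topic]` as the Dirichlet parameter when estimating each word distribution.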
Keywords/Search Tags: Sentiment Analysis, Topic Model, Topic Sentiment Evolution, Sentence Subjectivity Identification, Weakly-supervised Generative Model