Review Mining Based On Topic Model

Posted on: 2015-02-11; Degree: Master; Type: Thesis
Country: China; Candidate: G C Liu; Full Text: PDF
GTID: 2298330452953177; Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of Internet technology has given us many channels for communicating online and expressing our views, such as microblogs, online media, and online shopping platforms. These channels produce a large number of user comments, and these reviews contain users' viewpoints on particular things or on product performance. By reading this content, people can obtain useful information and make better-informed decisions. However, digging valuable information out of a massive body of user comments is not easy. Review mining provides a viable path for this problem.
Review mining is a popular research direction that draws on natural language processing, machine learning, and data mining. It is widely applied in public opinion analysis, online advertising, recommendation systems, and other fields. In this thesis, we give a detailed treatment of product review mining, which has relatively large commercial value, and in particular of review mining with topic-based models. We address two main tasks: first, identifying the product property that each review sentence discusses and then clustering the reviews by property; second, performing sentiment analysis on every review snippet, that is, identifying the polarity of the user's sentiment. Where no confusion arises, we refer to these two tasks as "clustering by property" and "sentiment analysis" for short.
The method used in this thesis is the probabilistic topic model, a generative model that models text using Bayesian theory. With a topic model, we can find the hidden topics behind the text. We introduce the general concept of probabilistic topic models and the methods used for model inference, such as Bayesian theory and variational inference. The model used in this thesis is CMA, which is studied in detail.
We then use CMA to model hand-collected review data in Chinese and give an experimental analysis of clustering the review corpus by property and of sentiment analysis. Two comparisons are given: one baseline performs sentiment analysis with the maximum entropy model, and another baseline clusters reviews by property using hierarchical clustering with cosine similarity. Our experiments show that the CMA model can also be applied successfully to Chinese review mining, which again demonstrates the advantages of topic models. However, because of other factors such as Chinese word segmentation and part-of-speech tagging, we do not achieve the same performance as experiments on English corpora. Finally, the author describes the experience gained in this research and discusses possible future work.
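The clustering baseline mentioned above (hierarchical clustering with cosine similarity) can be illustrated with a minimal sketch. This is not the thesis's implementation: the average-linkage criterion, the whitespace tokenizer, the English toy sentences, and the function names `cosine`, `average_link`, and `cluster` are all assumptions made for illustration. Chinese reviews would first need word segmentation, which is omitted here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_link(c1, c2, vecs):
    """Mean pairwise cosine similarity between two clusters (assumed linkage)."""
    total = sum(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster(sentences, k):
    """Agglomerative clustering: merge the most similar pair until k clusters remain."""
    # Assumption: whitespace tokenization stands in for real word segmentation.
    vecs = [Counter(s.lower().split()) for s in sentences]
    clusters = [[i] for i in range(len(vecs))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_link(clusters[x], clusters[y], vecs)
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


if __name__ == "__main__":
    reviews = [
        "battery life is long",
        "the battery life is short",
        "screen is bright",
        "the screen is very bright",
    ]
    for group in cluster(reviews, 2):
        print([reviews[i] for i in group])
```

Each returned cluster is a list of sentence indices, so the groups can be mapped back to the original review sentences. A bottom-up merge like this needs no topic model, which is why it serves as a simple point of comparison for the property-clustering task.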
Keywords/Search Tags: review mining, topic model, property identification, clustering, sentiment analysis