
Research On Sentiment Analysis With Product Review

Posted on: 2012-02-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F T Li
GTID: 1118330362967967
Subject: Computer Science and Technology
Abstract/Summary:
With the development of Web 2.0, people are increasingly likely to share their opinions or hands-on experiences of any object on the Internet. This opinion information is important for governments, business organizations, and individual customers. Product reviews, which are posted on e-commerce sites, are among the most widely used opinion sources. In this dissertation, we analyze the tasks of sentiment classification, opinion retrieval, opinion extraction, and opinion spam detection based on product reviews. We make the following contributions:

(1) For sentiment classification, we put forward a joint sentiment and topic model, Dependency-Sentiment-LDA, for two-way sentiment classification. Dependency-Sentiment-LDA not only analyzes the global topic and sentiment in a unified way, but also exploits the local dependency among sentiments. To handle multi-category sentiment classification, we further propose a novel learning framework that incorporates reviewer and product information into the text-based learner for rating prediction. The reviewer, product, and text features are modeled as a three-dimensional tensor, and tensor factorization is employed to reduce the sparsity and complexity problems.

(2) For opinion retrieval, we propose two graph-based opinion retrieval methods: an Opinion PageRank model and an Opinion HITS model. Both methods incorporate the topic relevance information and the opinion sentiment information into the document ranking procedure. In addition, the models naturally consider the relationships between different answer candidates and select the opinion information mentioned by many documents.

(3) For opinion extraction, we propose a new machine learning framework based on Conditional Random Fields (CRF) for topic and sentiment word extraction. It can employ rich features to jointly extract positive opinions, negative opinions, and object features from review sentences, and the linguistic structure can be naturally integrated into the model representation. Besides the linear-chain structure, we also investigate the conjunction structure and the syntactic tree structure in this framework. Since labeled data is sometimes hard to acquire, we further propose a domain adaptation framework for lexicon extraction when we have no labeled data in the domain of interest but do have labeled data in a related domain. The proposed framework consists of two steps. In the first step, we generate some sentiment and topic seeds in the target domain. In the second step, we propose a Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the source-domain labeled data and the relationships between topic and sentiment words.

(4) For opinion spam, we exploit semi-supervised methods to identify review spam. We observe that a review spammer consistently writes spam. This gives us two views for identifying review spam: the first identifies review spam based on review-related features; the second identifies whether the author of the review is a spammer. We apply a two-view semi-supervised method, co-training, to exploit the large amount of unlabeled data. The experiments show that the two-view co-training method achieves better results than the single-view methods for review spam identification.
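
The two-view co-training idea in contribution (4) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the nearest-centroid learner, the toy feature values, and the names `CentroidClassifier`, `co_train`, and `predict` are all assumptions chosen to keep the example self-contained. Each example carries a review view and a reviewer view; in every round, one classifier per view labels the unlabeled examples it is most confident about, and those examples join the shared labeled set.

```python
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidClassifier:
    """Tiny nearest-centroid learner (an assumed stand-in for any real classifier)."""

    def fit(self, X, y):
        # One centroid per label; assumes both labels occur in the training set.
        self.centroids = {
            label: [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == label])]
            for label in set(y)
        }
        return self

    def score(self, x):
        """Return (predicted_label, confidence), confidence = distance margin."""
        d = sorted((sq_dist(x, c), label) for label, c in self.centroids.items())
        return d[0][1], d[1][0] - d[0][0]


def fit_views(labeled):
    """Train one classifier per view on the current labeled set."""
    y = [label for _, _, label in labeled]
    clf_review = CentroidClassifier().fit([a for a, _, _ in labeled], y)
    clf_reviewer = CentroidClassifier().fit([b for _, b, _ in labeled], y)
    return clf_review, clf_reviewer


def co_train(labeled, unlabeled, rounds=5, per_round=2):
    """labeled: (review_view, reviewer_view, label); unlabeled: (review_view, reviewer_view)."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        picks = {}
        for view, clf in enumerate(fit_views(labeled)):
            # Rank the pool by this view's confidence, most confident first.
            scored = sorted(
                ((clf.score(ex[view]), i) for i, ex in enumerate(pool)),
                key=lambda t: -t[0][1],
            )
            for (label, _), i in scored[:per_round]:
                picks.setdefault(i, label)
        # Move the picked examples, with their predicted labels, into the labeled set.
        for i in sorted(picks, reverse=True):
            a, b = pool.pop(i)
            labeled.append((a, b, picks[i]))
    return (*fit_views(labeled), labeled)


def predict(clf_review, clf_reviewer, example):
    """Trust whichever view is more confident about this example."""
    (la, ca), (lb, cb) = clf_review.score(example[0]), clf_reviewer.score(example[1])
    return la if ca >= cb else lb


# Toy data with hypothetical features; label 1 = spam, 0 = genuine.
seed = [([0.9, 0.8], [0.9], 1), ([0.1, 0.2], [0.1], 0)]
unlabeled = [
    ([0.8, 0.9], [0.8]),
    ([0.2, 0.1], [0.2]),
    ([0.85, 0.7], [0.95]),
    ([0.15, 0.3], [0.05]),
]
clf_review, clf_reviewer, grown = co_train(seed, unlabeled)
```

Calling `predict(clf_review, clf_reviewer, ([0.7, 0.9], [0.85]))` then classifies a new review using both views. A real system would replace the centroid learner with a stronger classifier and would add only predictions above a confidence threshold.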
Keywords/Search Tags: Sentiment Analysis, Opinion Mining, Sentiment Classification, Opinion Retrieval, Opinion Extraction