Research On Cross-domain Sentiment Classification Of Reviews

Posted on: 2015-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: X H Wei
GTID: 2298330467485645
Subject: Computer application technology
Abstract/Summary:
With the rise of social network platforms, massive amounts of subjective information have emerged quickly. Research fields such as sentiment classification and opinion mining have attracted wide attention owing to their potential applications. However, sentiment classification is domain-specific because feature distributions diverge across domains. That is to say, a classifier usually performs poorly when the distribution of the data it is applied to differs substantially from that of its training data. New domains emerge constantly, and collecting annotated data for each of them is time-consuming and expensive, so cross-domain sentiment classification is a significant problem. It aims to train a classifier on source domain data and apply it effectively in a target domain.

In this paper, we focus on cross-domain sentiment classification. The main work covers the following three aspects.

Firstly, we propose a weighted SimRank model for cross-domain sentiment classification. In the first step, the model uses ADMI to pick out pivot features, that is, features that appear frequently in both domains and carry explicit sentiment polarity. A bipartite graph is then constructed between pivot and non-pivot features. On the basis of the weighted SimRank algorithm, we build the Latent Feature Space (LFS) according to feature similarity. Finally, each sample is re-weighted by the mapping function learned from the LFS. By reducing the mismatch in data distribution between domains, the algorithm performs well on cross-domain sentiment classification. Experiments verify the effectiveness of the proposed algorithm.

Secondly, considering how the distributions of domain-independent and domain-specific features vary with the subject of a sentence, we introduce a Subject-based ensemble model for cross-domain sentiment classification. The model splits each review into two views, a personal view and an object view. The personal view consists of sentences with an explicit or implicit personal subject. The object view consists of sentences with an object subject. According to statistics, the personal view is often domain-independent, while the object view is usually domain-specific. Based on this finding, we deduce that an efficient ensemble of the two views can enhance the performance of cross-domain sentiment classification, benefiting more from the domain-independent part and overcoming the drawbacks of the domain-specific part. Experimental results show that the proposed model is effective for cross-domain sentiment classification under both supervised and semi-supervised learning.

Finally, considering the dependence of sentiment on review topic, we build on the above work and propose a comprehensive model, named joint Sample Filtering with Subject-based Ensemble Model (SF-SE), which combines sample filtering with the Subject-based model. In the filtering stage, we introduce a sentence-level Latent Dirichlet Allocation (LDA) model that incorporates topic and sentiment together (SS-LDA). Under this model, a high-quality training dataset is constructed in an unsupervised way by filtering out sentences whose sentiment polarity is inconsistent with that of the review they belong to. Experimental results demonstrate the effectiveness of SF-SE.
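
To illustrate the first contribution, the following is a minimal sketch of SimRank on a weighted bipartite graph between non-pivot and pivot features. The abstract does not give the thesis's actual weighting scheme or the ADMI selection step, so everything here is an assumption for illustration: edge weights are taken to be co-occurrence counts, they are row-normalized into transition weights, and the function name, `decay`, `iterations`, and the toy feature words are hypothetical.

```python
import numpy as np


def weighted_bipartite_simrank(W, decay=0.8, iterations=10):
    """Illustrative weighted SimRank on a bipartite feature graph.

    W[i, j] is an assumed co-occurrence weight between non-pivot
    feature i and pivot feature j. Returns two similarity matrices:
    one over non-pivot features and one over pivot features.
    """
    W = np.asarray(W, dtype=float)
    n_non_pivot, n_pivot = W.shape

    def row_normalize(M):
        # Turn each row into weights that sum to 1 (all-zero rows stay zero).
        sums = M.sum(axis=1, keepdims=True)
        return np.divide(M, sums, out=np.zeros_like(M), where=sums > 0)

    P_np = row_normalize(W)      # non-pivot -> pivot weights
    P_p = row_normalize(W.T)     # pivot -> non-pivot weights

    S_np = np.eye(n_non_pivot)   # every feature is fully similar to itself
    S_p = np.eye(n_pivot)

    for _ in range(iterations):
        # Two features on one side are similar when the features they
        # link to on the other side are similar (weighted average).
        new_np = decay * P_np @ S_p @ P_np.T
        new_p = decay * P_p @ S_np @ P_p.T
        np.fill_diagonal(new_np, 1.0)
        np.fill_diagonal(new_p, 1.0)
        S_np, S_p = new_np, new_p

    return S_np, S_p


# Toy graph (hypothetical words): rows are non-pivot features
# "blurry", "boring", "sharp"; columns are pivot features "bad", "good".
W = np.array([
    [3, 0],   # "blurry" co-occurs only with "bad"
    [2, 0],   # "boring" co-occurs only with "bad"
    [0, 4],   # "sharp" co-occurs only with "good"
])
S_non_pivot, S_pivot = weighted_bipartite_simrank(W)
print(np.round(S_non_pivot, 3))
```

In this toy graph, "blurry" and "boring" come from different domains but share the pivot "bad", so they receive a high similarity, while "sharp" shares no pivot with either and stays dissimilar. A latent feature space in the spirit of the LFS could then group non-pivot features by such similarities, though how the thesis actually builds the space and the mapping function is not stated in the abstract.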
Keywords/Search Tags: Sentiment Classification, Cross-domain, Transfer Learning, Topic Model