Cross Domain Sentiment Classification Based On Feature Correlation

Posted on: 2015-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Ou
Full Text: PDF
GTID: 2308330473459323
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, large volumes of online data are being generated, including blogs, product reviews, and similar content. This information exists as text that carries a certain sentiment orientation. Automatically identifying the emotions users express in these texts (such as positive or negative) is therefore playing an increasingly important role in public opinion analysis and product evaluation.

However, the accuracy of sentiment classification tends to depend on the domain: a classifier trained in one domain is difficult to apply to another. At the same time, labeled data are hard to obtain in practice. Traditional sentiment classification therefore faces a major challenge, and research on cross-domain sentiment classification has received extensive attention. Cross-domain sentiment classification predicts the polarity of unlabeled data from a new domain using a classifier learned from existing information.

This dissertation studies existing problems in cross-domain sentiment classification. The main work is as follows:

(1) First, a general overview of cross-domain sentiment classification is given, covering its background and significance, its definition, the main research issues and categories, and the current state of research.

(2) Second, a cross-domain sentiment classification algorithm named APW is proposed to address the lack of labels in the target domain. The algorithm repeatedly adjusts the polarity of features that appear in both the source domain and the target domain, using these co-occurrence features as a bridge to propagate labels from the source domain to the target domain. Experimental results show that the APW algorithm has advantages in both classification performance and running time.

(3) Finally, the TPW algorithm is proposed to address feature divergence and polarity divergence. In this method, domain-independent words with high polarity are first selected. For polarity divergence, the polarities of the domain-independent words are reset based on the contribution distance between the source domain and the target domain. For feature divergence, the polarities of all words are transferred to the target independent words through the domain-independent words, which serve as a bridge. In this way, information related to the polarity-divergent words is taken into account during the transfer. Experimental results show that the TPW algorithm achieves better classification results.
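
The abstract does not give the details of APW, so the following is only an illustrative sketch of the general idea it describes: words shared by both domains carry polarity from labeled source documents to unlabeled target documents, and those polarities are adjusted over several rounds. The function names, the averaging rule, and the `rounds` and `mix` parameters are all assumptions made for this example, not the thesis's actual method.

```python
from collections import defaultdict


def word_polarity(docs, labels, vocab):
    """Mean label (+1 / -1) of the documents that contain each word in vocab."""
    total = defaultdict(float)
    count = defaultdict(int)
    for tokens, label in zip(docs, labels):
        for word in set(tokens) & vocab:
            total[word] += label
            count[word] += 1
    return {word: total[word] / count[word] for word in count}


def score(tokens, polarity):
    """Sum of known word polarities; words outside the shared set count as 0."""
    return sum(polarity.get(word, 0.0) for word in tokens)


def propagate_labels(source_docs, source_labels, target_docs, rounds=5, mix=0.5):
    """Hypothetical sketch: propagate labels through co-occurring features."""
    source_vocab = {w for doc in source_docs for w in doc}
    target_vocab = {w for doc in target_docs for w in doc}
    shared = source_vocab & target_vocab  # the co-occurrence "bridge"

    # Initial polarities come only from the labeled source domain.
    source_pol = word_polarity(source_docs, source_labels, shared)
    polarity = dict(source_pol)

    for _ in range(rounds):
        # Pseudo-label target documents with the current polarities
        # (a score of exactly 0 is treated as positive in this sketch).
        pseudo = [1 if score(doc, polarity) >= 0 else -1 for doc in target_docs]
        # Re-estimate polarities from the pseudo-labels and blend them in.
        target_pol = word_polarity(target_docs, pseudo, shared)
        polarity = {
            w: (1 - mix) * source_pol[w] + mix * target_pol.get(w, 0.0)
            for w in shared
        }

    labels = [1 if score(doc, polarity) >= 0 else -1 for doc in target_docs]
    return labels, polarity


# Toy data: book reviews (source, labeled) and blender reviews (target, unlabeled).
source_docs = [["great", "book", "love"], ["boring", "book", "bad"]]
source_labels = [1, -1]
target_docs = [["great", "blender", "love"], ["bad", "blender", "noisy"]]

labels, polarity = propagate_labels(source_docs, source_labels, target_docs)
print(labels)  # [1, -1]
```

In this toy run only "great", "love", and "bad" occur in both domains, so they alone decide the target labels. Target-only words such as "noisy" are ignored here; passing polarity on to such words is the kind of feature divergence the abstract says TPW is meant to handle.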
Keywords/Search Tags: cross-domain, sentiment classification, instance label, feature