Research On Chinese Text Sentiment Polarity Classification

Posted on: 2013-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: S T Deng
Full Text: PDF
GTID: 2248330362474874
Subject: Computer software and theory
Abstract/Summary:
With the continued growth of the Internet, the number of users and the volume of their comments have grown explosively. These comments carry a large amount of information: companies need feedback on their products or services in order to sell more, and governments need to know the majority opinion of the public in order to make better decisions. How to process this information to extract the desired knowledge has received wide attention and study in recent years. Sentiment classification is a new research field that can be applied to information filtering, product recommendation, user interest mining, and similar tasks.

Sentiment classification results are generally divided into two classes, positive and negative. The existing approaches are machine learning, semantic computation, and a combination of the two; this thesis takes the combined approach. It computes the polarity of sentiment words from a corpus of sentiment comments and then uses an improved PMI statistical method to expand an existing sentiment lexicon. Features are selected by document frequency, by the chi-square statistic, and by a combination of both to build an initial Naive Bayesian classifier; a weighted Naive Bayesian classifier is then built with the expanded sentiment lexicon to improve classification performance. The performance of a single classifier, however, is difficult to improve significantly and does not adapt to all situations. This thesis therefore replaces the single best classifier with multiple classifiers. Weighted Naive Bayesian, decision tree, and KNN classifiers are constructed on the training data set, and back substitution is used to obtain their classification accuracy. Each classifier's accuracy on the training set then serves as its weight in a weighted voting method that predicts the class of each test instance. This ultimately improves the classification results and the ability to adapt to a variety of corpora.
Keywords/Search Tags: Chinese text sentiment classification, sentiment lexicon, Naive Bayesian, attribute weight, multiple classifiers
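
The weighted voting step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: it assumes each base classifier has already produced a label for the test instance, and that each classifier's weight is its accuracy on the training set, as the abstract states. The function name `weighted_vote`, the classifier keys, the accuracy values, and the tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-classifier labels into one label by weighted voting.

    predictions: dict mapping classifier name -> predicted label
    weights:     dict mapping classifier name -> weight
                 (here: accuracy on the training set)
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        # Each classifier adds its weight to the label it voted for.
        scores[label] += weights[name]
    # The label with the largest total weight wins. Iterating over sorted
    # labels makes ties deterministic (an assumption of this sketch).
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical training-set accuracies for the three classifiers.
weights = {"weighted_nb": 0.90, "decision_tree": 0.80, "knn": 0.75}

# Hypothetical votes for a single test instance.
predictions = {
    "weighted_nb": "positive",
    "decision_tree": "negative",
    "knn": "negative",
}

print(weighted_vote(predictions, weights))
```

In this made-up example the two weaker classifiers together (0.80 + 0.75) outweigh the single strongest one (0.90), so the combined prediction is "negative". This is the sense in which the ensemble replaces the single best classifier.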