Sentiment Classification With Semi-supervised Ensemble Learning

Posted on: 2016-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: W Gao
GTID: 2308330464453278
Subject: Computer Science and Technology
Abstract/Summary:
Subjective text on the Internet is expanding rapidly with the development of the World Wide Web, and this text has great practical value. How to process such large amounts of information automatically has become an important research problem, which has given rise to the study of sentiment analysis. Sentiment classification is a basic task in sentiment analysis. Existing studies focus mainly on supervised sentiment classification. However, supervised sentiment classification requires a certain amount of labeled data, which can consume large amounts of resources to obtain. Semi-supervised sentiment classification is therefore receiving increasing attention, because it can use a large number of unlabeled samples together with a small number of labeled samples. Research on semi-supervised ensemble learning for sentiment classification remains rare. This thesis focuses on semi-supervised ensemble learning methods for sentiment classification. Specifically, the study covers the following three aspects.

First, this thesis proposes a novel self-training approach to semi-supervised sentiment classification based on feature subspaces. The main idea is to train multiple classifiers, each on a feature subspace, and to use a maximum-confidence ensemble method to make the classification decision. This avoids the harmful influence of noisy features. Experiments show that the proposed approach significantly outperforms both traditional self-training and the feature-subspace-based co-training approach for semi-supervised sentiment classification.

Second, this thesis proposes a novel ensemble learning approach to semi-supervised sentiment classification based on label consistency. The main idea is to automatically annotate the unlabeled samples that receive the same label from multiple classifiers and then add them to the labeled sample set. Unlabeled samples on which the classifiers disagree are filtered out, which protects the quality of the labeled sample set and avoids the harmful influence of incorrectly pseudo-labeled samples. Experiments show that the approach effectively reduces the error among pseudo-labeled samples and thus performs much better than the simple semi-supervised sentiment classification approach.

Third, this thesis proposes a novel ensemble learning approach to semi-supervised sentiment classification based on meta-learning and sample filtering. Compared with the ensemble learning approach based on label consistency, it is more widely applicable and achieves better performance when combining multiple semi-supervised learning methods. The main idea is to train a meta-classifier from multiple semi-supervised methods, use the meta-classifier to classify unlabeled samples, and then filter out low-confidence unlabeled samples. Experiments demonstrate the effectiveness of the approach, which achieves its best performance when more semi-supervised learning methods are combined.
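
The two ensemble decisions described in the abstract, maximum-confidence voting and label-consistency filtering of pseudo-labeled samples, can be illustrated with a minimal sketch. This is not the thesis's implementation: the keyword-based scorers, the word lists, and all function names below are hypothetical stand-ins for classifiers trained on different feature subspaces, chosen only so that the example runs with the Python standard library.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple

# Assumption: a classifier is any function mapping a text to (label, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def keyword_classifier(pos_words: Set[str], neg_words: Set[str]) -> Classifier:
    """Hypothetical toy classifier that only sees its own word lists,
    standing in for a model trained on one feature subspace."""
    def classify(text: str) -> Tuple[str, float]:
        tokens = text.lower().split()
        pos = sum(t in pos_words for t in tokens)
        neg = sum(t in neg_words for t in tokens)
        total = pos + neg
        if total == 0:
            return "pos", 0.5  # no evidence: arbitrary label, low confidence
        label = "pos" if pos >= neg else "neg"
        return label, max(pos, neg) / total
    return classify


def max_confidence_label(classifiers: Sequence[Classifier], sample: str) -> str:
    """Maximum-confidence ensemble: take the label of the most confident member."""
    label, _ = max((clf(sample) for clf in classifiers), key=lambda p: p[1])
    return label


def consistent_label(classifiers: Sequence[Classifier], sample: str) -> Optional[str]:
    """Label consistency: return the shared label if all members agree, else None."""
    labels = {clf(sample)[0] for clf in classifiers}
    return labels.pop() if len(labels) == 1 else None


def select_pseudo_labeled(
    classifiers: Sequence[Classifier], unlabeled: Sequence[str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split unlabeled samples into accepted (sample, label) pairs and rejected samples."""
    accepted: List[Tuple[str, str]] = []
    rejected: List[str] = []
    for sample in unlabeled:
        label = consistent_label(classifiers, sample)
        if label is None:
            rejected.append(sample)  # classifiers disagree: filter it out
        else:
            accepted.append((sample, label))
    return accepted, rejected


clf_a = keyword_classifier({"great", "love"}, {"bad"})
clf_b = keyword_classifier({"good"}, {"terrible", "boring"})
unlabeled = ["great good movie", "love it but boring", "terrible bad plot"]
accepted, rejected = select_pseudo_labeled([clf_a, clf_b], unlabeled)
```

In a full semi-supervised loop, the accepted pairs would be added to the labeled set and the classifiers retrained, while the rejected samples stay unlabeled for later iterations.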
Keywords/Search Tags: Sentiment Classification, Semi-supervised Learning, Ensemble Learning, Meta Learning, Sample Filtering