Text Sentiment Classification Based On Stacking Combination

Posted on: 2018-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: C S Yuan
GTID: 2348330518482355
Subject: Computer application technology
Abstract/Summary:
The rapid development of Web 2.0 has increased people's participation in social media and has given researchers a wealth of information carrying user opinions, both structured and unstructured. Sentiment classification of these resources is of great value and supports applications such as public sentiment risk analysis and commodity sales. Text sentiment classification generally comprises subjective/objective classification and polarity classification of subjective text; this thesis studies the latter. Current state-of-the-art methods in sentiment classification include the support vector machine (SVM), a traditional machine learning method, and methods based on deep learning, which have become a research hot spot in recent years. Combining these methods in an ensemble can exploit the strengths of each and further improve classification performance. This thesis therefore constructs a text sentiment classification model based on stacking combination. The specific work is as follows.

Firstly, open corpora for sentiment classification are relatively scarce, and users' expressions on the Internet have become more novel and unique in the Web 2.0 environment. To address this, traditional corpora that researchers accept to some degree are gathered, and review corpora are collected from third-party review sites. Relevant personnel are then organized to validate the annotation of the combined corpus, and its rationality is demonstrated. This is how the sample set for the experiments in this thesis is built.

Secondly, SVM outperforms other traditional machine learning methods in text sentiment classification owing to its classification mechanism. Some modifications are made to the original SVM model to make it more adaptive. On the one hand, because network buzzwords, emoticons, typos, and similar phenomena appear frequently on the Internet, emoticons are handled separately and then treated as ordinary features, while emotion words and network buzzwords are collected into a user dictionary that guides word segmentation and improves its accuracy. On the other hand, feature selection and weighting are adjusted to some extent to make feature processing more efficient.

Lastly, although SVM performs well thanks to its classification mechanism, it can hardly escape the inherent bottleneck of traditional machine learning methods. A convolutional neural network (CNN) can learn local features of the text but cannot capture relationships across the sequence. A recurrent neural network (RNN) can build a well-performing sequential model but cannot extract features in parallel. These three methods are therefore combined as base classifiers in a stacking ensemble to build a sentiment classification model, in which word embeddings are used to represent words as vectors and SVM is used again as the meta classifier. Finally, the proposed model is evaluated against a voting ensemble and against each of the base classifiers.
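
The difference between stacking and voting can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration, not the thesis's implementation: the toy reviews are invented, three count-based scorers over unigrams, bigrams, and position-tagged words stand in for the SVM, CNN, and RNN base classifiers, and a perceptron stands in for the SVM meta classifier. The out-of-fold scheme is one common way to build the meta classifier's training features; the abstract does not say which scheme the thesis used.

```python
from collections import Counter

# Invented toy reviews (1 = positive, 0 = negative); not the thesis corpus.
DATA = [
    ("great phone love it", 1), ("love the screen great value", 1),
    ("good battery happy", 1), ("happy with this good buy", 1),
    ("bad screen hate it", 0), ("hate the battery awful", 0),
    ("awful phone poor value", 0), ("poor build bad buy", 0),
]

def unigrams(text):
    return text.split()

def bigrams(text):
    words = text.split()
    return list(zip(words, words[1:]))

def positioned(text):
    # Words tagged with their position: a crude nod to word order.
    return list(enumerate(text.split()))

class CountModel:
    """Tiny count-based scorer; a stand-in for a real base classifier."""

    def __init__(self, featurize):
        self.featurize = featurize

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        for text, label in zip(texts, labels):
            self.counts[label].update(self.featurize(text))
        return self

    def score(self, text):
        """Share of positive evidence in [0, 1]; 0.5 when nothing matches."""
        feats = self.featurize(text)
        pos = sum(self.counts[1][f] for f in feats)
        neg = sum(self.counts[0][f] for f in feats)
        return 0.5 if pos + neg == 0 else pos / (pos + neg)

class Perceptron:
    """Linear meta classifier; a stand-in for the SVM meta classifier."""

    def fit(self, rows, labels, epochs=50):
        self.w, self.b = [0.0] * len(rows[0]), 0.0
        for _ in range(epochs):
            for x, y in zip(rows, labels):
                error = y - self.predict(x)  # -1, 0, or +1
                if error:
                    self.w = [w + error * xi for w, xi in zip(self.w, x)]
                    self.b += error
        return self

    def predict(self, x):
        return int(sum(w * xi for w, xi in zip(self.w, x)) + self.b > 0)

def make_base_models():
    return [CountModel(unigrams), CountModel(bigrams), CountModel(positioned)]

def out_of_fold_scores(texts, labels, k=4):
    """Score every sample with base models that did not train on it."""
    rows = [None] * len(texts)
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        models = [m.fit([texts[i] for i in train], [labels[i] for i in train])
                  for m in make_base_models()]
        for i in range(fold, len(texts), k):
            rows[i] = [m.score(texts[i]) for m in models]
    return rows

def fit_stack(texts, labels, k=4):
    # The meta classifier learns from out-of-fold base scores only.
    meta = Perceptron().fit(out_of_fold_scores(texts, labels, k), labels)
    # The base models are then refit on all data for use at prediction time.
    bases = [m.fit(texts, labels) for m in make_base_models()]
    return bases, meta

def predict_stack(bases, meta, text):
    return meta.predict([m.score(text) for m in bases])

def predict_vote(bases, text):
    votes = [int(m.score(text) > 0.5) for m in bases]
    return Counter(votes).most_common(1)[0][0]

texts, labels = zip(*DATA)
bases, meta = fit_stack(texts, labels)
sample = "love it great"
print("voting:", predict_vote(bases, sample))
print("stacking:", predict_stack(bases, meta, sample))
```

In this sketch, voting gives each base model an equal say, whereas stacking lets the meta classifier learn how much to trust each base model's score. Training the meta classifier on out-of-fold scores keeps it from learning on predictions the base models made for samples they had already seen.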
Keywords/Search Tags: sentiment classification, support vector machine, deep learning, stacking