
Research Of Text Sentiment Classification Based On Incremental SVM

Posted on: 2017-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Li
Full Text: PDF
GTID: 2428330590491610
Subject: Information and Communication Engineering
Abstract/Summary:
Text sentiment classification is an important branch of text classification and an active research topic in that field. It filters out the subjective content of a text and then performs sentiment analysis to identify the sentiment category to which the text belongs. In the current big data era, the amount of user-generated text on the Internet keeps growing and accumulates very quickly, so identifying the sentiment orientation of massive volumes of new text effectively and quickly has become an urgent need. Sentiment identification of such texts is of great practical significance: it can support product recommendation in e-commerce and help governments monitor public opinion.

Currently, the main methods for text sentiment classification are based on machine learning. The support vector machine (SVM) is a classical machine learning algorithm that performs relatively well and has been widely used in many classification tasks. However, the traditional SVM has no incremental learning ability, whereas samples accumulate continuously in the real world. To reduce retraining time and to classify large amounts of data accurately, this paper improves the traditional SVM algorithm from the perspective of incremental learning and applies the resulting incremental SVM learning algorithm to text sentiment classification.

After studying the basic characteristics of SVM and the associated incremental learning algorithms, this paper proposes a new incremental SVM learning algorithm based on a combined reserved set, CRS-ISVM (Combined Reserved Set ISVM). First, to build the reserved set, a new sample selection method, the Zoom Pan selecting method, is proposed to address deficiencies in sample selection. In addition, the idea of a combined reserved set is adopted: a reserved set is built not only for the original training set but also for the incremental samples. Some of the non-SV samples in the original training set and some of the samples in the incremental set that satisfy the KKT conditions are added to the reserved set and assigned sample weights. The original SV set, the incremental samples that violate the original KKT conditions, and a small number of samples selected from the reserved set according to their weights are used to build the new training set for the next round of incremental learning. In this way, knowledge from the original samples is retained while knowledge from the new samples is learned. Experimental results show that the new incremental SVM learning algorithm speeds up the classification process and improves classification accuracy.

Finally, the incremental SVM algorithm is applied to text sentiment classification. This paper builds an incremental SVM learning system for sentiment classification to meet the processing requirements of continuously accumulating texts. Different text features are extracted, and experiments show that the ISVM-based text sentiment classification system effectively reduces the storage of historical data and speeds up classification while maintaining classification accuracy.
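The way the new training set is assembled in each incremental round can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a linear SVM with a known weight vector `w` and bias `b`, labels in {-1, +1}, and the standard test that a sample outside the current model violates the KKT conditions when y * f(x) < 1. The function names, the example data, and the "take the k highest-weight reserved samples" rule are hypothetical stand-ins; the abstract does not specify the Zoom Pan selecting method or how weights are computed, so neither is modeled here.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]   # (feature vector, label in {-1, +1})
Weighted = Tuple[Sample, float]        # (sample, reserved-set weight)


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear SVM decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def violates_kkt(w: Sequence[float], b: float, sample: Sample) -> bool:
    """A sample not yet in the model violates the KKT conditions if y * f(x) < 1."""
    x, y = sample
    return y * decision_value(w, b, x) < 1.0


def build_next_training_set(
    w: Sequence[float],
    b: float,
    support_vectors: List[Sample],
    incremental: List[Sample],
    reserved: List[Weighted],
    k: int,
) -> Tuple[List[Sample], List[Sample]]:
    """Return (new training set, incremental samples that satisfy KKT).

    The second list holds candidates for the reserved set. Picking the k
    highest-weight reserved samples is an assumed rule for illustration only.
    """
    violators = [s for s in incremental if violates_kkt(w, b, s)]
    satisfied = [s for s in incremental if not violates_kkt(w, b, s)]
    top = sorted(reserved, key=lambda pair: pair[1], reverse=True)[:k]
    picked = [sample for sample, _weight in top]
    return support_vectors + violators + picked, satisfied


# Hypothetical toy data: current model is f(x) = x[0].
w, b = [1.0, 0.0], 0.0
svs = [([1.0, 0.0], 1), ([-1.0, 0.0], -1)]
incremental = [([2.0, 1.0], 1), ([0.5, 0.0], 1), ([-0.2, 3.0], -1)]
reserved = [
    (([3.0, 0.0], 1), 0.2),
    (([-3.0, 1.0], -1), 0.9),
    (([4.0, 4.0], 1), 0.5),
]

new_train, satisfied = build_next_training_set(w, b, svs, incremental, reserved, k=2)
print(len(new_train), len(satisfied))
```

In this toy run, the two incremental samples inside the margin join the support vectors and the two highest-weight reserved samples in the new training set, while the one incremental sample that already satisfies the KKT conditions is returned as a reserved-set candidate. A real system would retrain the SVM on `new_train` and update the reserved set and its weights afterward.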
Keywords/Search Tags: support vector machine, incremental learning, reserved set, text classification, sentiment analysis