
A Text Sentiment Classification Algorithm Based on Improved Support Vector Machines

Posted on: 2012-09-24    Degree: Master    Type: Thesis
Country: China    Candidate: D Y Tian    Full Text: PDF
GTID: 2218330338499462    Subject: Computer software and theory
Abstract/Summary:
Text sentiment classification is useful in applications such as public opinion analysis and opinion polling. Support vector machines (SVMs) have been widely used for this task, and the kernel function is the core of an SVM. With the traditional Gaussian kernel, the points around a test point are dense in low-dimensional space but quite sparse in high-dimensional space, which weakens the generalization ability of the classifier. The parameters of the traditional Gaussian kernel also offer little room for fine-tuning generalization ability. In addition, the learning and generalization ability of an SVM depend on the type of kernel function. The traditional Gaussian kernel is a local kernel: its learning ability is strong, but its generalization ability is weak. A common remedy is to combine the traditional Gaussian kernel with a polynomial kernel, but this approach depends heavily on the characteristics of the data set and is prone to data skew.

Classifier parameter selection is another important factor affecting the results of text sentiment classification. The parameter optimization methods in current use are cross validation, grid search, and genetic algorithms. Their shortcomings include the optimal partition problem, high computational complexity, slow speed, and a tendency to fall into local optima.

To address these problems, we make the following contributions.

First, we improve the traditional Gaussian kernel function. The improved kernel decays faster both in the vicinity of the test point and at a distance from it. This solves the No-Flat problem of data sets in high-dimensional space and improves the generalization ability of the SVM. Experiments show that, compared with an SVM based on the traditional Gaussian kernel, the SVM based on the improved Gaussian kernel increases macro-average precision, macro-average recall, micro-average precision, and micro-average recall by 1.76%, 1.19%, 0.72%, and 2.17%, respectively.

Second, we combine the improved Gaussian kernel, the sigmoid kernel, and the polynomial kernel into a new kernel function. The combined kernel has different peaks and troughs at a test point and in different regions. The peaks and troughs are narrow and decay slowly in the region away from the test point, which improves adaptability to the data set. Learning ability and generalization ability also improve. Experiments show that, compared with an SVM based on the traditional kernel combination, the SVM based on the weighted combination of improved kernels increases macro-average precision, macro-average recall, micro-average precision, and micro-average recall by 2.30%, 1.41%, 2.01%, and 2.54%, respectively.

Third, we optimize the parameters of the multi-kernel SVM based on the improved Gaussian kernel. The method finds the SVM parameters automatically, avoids the blindness of manual parameter tuning, saves classification time, and improves classification accuracy. Experimental results show that, relative to manual parameter setting, cross validation, grid search, and a genetic algorithm, the F1 value increases by 8.95%, 1.96%, 2.56%, and 0.57%, respectively.

Fourth, we apply the improved SVM to text sentiment classification and compare it with related text sentiment classification algorithms from the literature. Experimental results show that the F1 value of text sentiment classification based on the improved SVM increases by 9.73%, 8.81%, 10.89%, 5.01%, 2.92%, 7.72%, and 5.67%, respectively.

In summary, we propose a text sentiment classification algorithm based on a multi-kernel SVM that uses the improved Gaussian kernel together with particle swarm optimization. Experiments show that the algorithm improves performance to varying degrees and has good practical value.
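The weighted combination of Gaussian, polynomial, and sigmoid kernels described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula of the improved Gaussian kernel, so the standard Gaussian (RBF) kernel stands in for it here. The function names, the weights, and the values of `gamma`, `degree`, `alpha`, and `coef0` are all hypothetical choices for illustration, not the thesis's settings.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, gamma=0.5):
    """Standard Gaussian (RBF) kernel; a stand-in for the improved kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree."""
    return (dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, alpha=0.01, coef0=0.0):
    """Sigmoid kernel tanh(alpha * x . y + coef0)."""
    return math.tanh(alpha * dot(x, y) + coef0)


def combined_kernel(x, y, weights=(0.6, 0.3, 0.1)):
    """Weighted sum of the three kernels; the weights are assumed values."""
    w_gauss, w_poly, w_sig = weights
    return (
        w_gauss * gaussian_kernel(x, y)
        + w_poly * polynomial_kernel(x, y)
        + w_sig * sigmoid_kernel(x, y)
    )


def gram_matrix(samples, kernel=combined_kernel):
    """Kernel value for every pair of samples, as a list of rows."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in gram_matrix(samples):
        print(["%.4f" % value for value in row])
```

A Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels. The weights and kernel parameters would then be the quantities a search procedure such as particle swarm optimization tunes. The sigmoid kernel is not positive semi-definite for all parameter choices, so the combined matrix is not guaranteed to be a valid kernel matrix for every setting.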
Keywords/Search Tags: Improved SVM, Text sentiment classification, Multi-kernel functions, Automatic parameter optimization, Performance evaluation