A Short Text Classification Model Based On Combination Of AdaBoost And SVM

Posted on:2017-12-10

Degree:Master

Type:Thesis

Country:China

Candidate:J H Jia

GTID:2428330596457387

Subject:Engineering

Abstract/Summary:

With the emergence of social networks, people face a large volume of short-text information in their work and daily life. Obtaining the necessary content quickly and precisely has become an urgent problem. Conventional text categorization can address this to a certain degree, but it still suffers from disadvantages such as poor classification performance and poor generalization. Because short texts contain little content and yield high-dimensional, sparse features, traditional classification models handle them poorly. It is therefore necessary to study short text classification in depth. The short text classification model proposed in this paper addresses, to a certain extent, the unsuitability of traditional classification algorithms for short text. The main work of this paper is as follows.

First, because the classification performance and generalization ability of the support vector machine (SVM) on multi-class data sets are not high, this paper introduces the AdaBoost algorithm and the particle swarm optimization (PSO) algorithm and puts forward a classification algorithm referred to as AdaBoost-PSOSVM. The SVM is first optimized by PSO (referred to as PSOSVM); that is, PSO selects the parameters of the SVM automatically. A strong classifier is then obtained by integrating PSOSVM classifiers through AdaBoost. Experimental results on UCI data sets show that the strong classifier improves classification performance to a certain extent and performs significantly better than both the PSOSVM algorithm and the SVM algorithm.

Second, we propose a feature extraction algorithm (referred to as FEGAX) based on the genetic algorithm and the χ² statistic, because these have the advantages of easy parallel processing and of avoiding local optima. Its basic idea is to use the χ² statistic to pre-select features from the original text sets and then to perform a second round of feature selection with the genetic algorithm. Experiments verify that this feature extraction method improves the classification quality of SVM to a certain extent.

Last but not least, to address the problem that the feature sparsity of short texts degrades classification performance, this paper uses the LDA topic model to extend the features after short text preprocessing and FEGAX feature selection. The basic idea is to first train the LDA topic model on the short texts to obtain the corresponding topic distributions. After short text preprocessing and FEGAX feature selection, the topic words with the highest probabilities are added to the feature sets. Finally, the AdaBoost-PSOSVM algorithm is used to classify the short texts and is compared with the PSOSVM algorithm and the SVM algorithm. The experimental results show that AdaBoost-PSOSVM outperforms PSOSVM and SVM on the different short-text datasets tested. We therefore conclude that the AdaBoost-PSOSVM algorithm has good classification performance and high generalization ability.
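
The χ² pre-selection stage described for FEGAX can be illustrated with a minimal sketch. The abstract does not give the exact formula or how scores are combined across categories, so this sketch assumes the standard 2x2 contingency form of the χ² statistic and takes the maximum score over categories. The function names `chi_square` and `select_terms` are hypothetical, and the second-stage genetic algorithm search is not shown.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for one (term, category) pair.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_terms(docs, labels, k):
    """Return the k terms with the highest chi-square score.

    docs is a list of token lists and labels the matching category list.
    Assumption: a term's score is its maximum over all categories.
    """
    n_docs = len(docs)
    doc_sets = [set(d) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    class_sizes = Counter(labels)
    scores = {}
    for term in vocab:
        # Per-category document frequency of this term.
        in_class = Counter(
            lab for ds, lab in zip(doc_sets, labels) if term in ds
        )
        df = sum(in_class.values())
        best = 0.0
        for cat, size in class_sizes.items():
            n11 = in_class[cat]
            n10 = df - n11
            n01 = size - n11
            n00 = n_docs - df - n01
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[term] = best
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]
```

In a pipeline like the one the abstract outlines, the terms returned by such a pre-selection would form the candidate pool from which the genetic algorithm then chooses a final feature subset.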

Keywords/Search Tags:

Support Vector Machine, Adaptive boosting, Particle Swarm Optimization, Classification effect, LDA topic model

Related items

1	Parameters Optimization Of Support Vector Machine Based On Improved Quantum Particle Swarm Optimization Algorithm
2	The Parameter Optimization Of Support Vector Machine Based On Improved Particle Swarm Optimization And Its Application
3	Research On Support Vector Machine Classification Method Based On Intelligent Optimization
4	Research On Interval Adaptive Particle Swarm Optimization And Its Application
5	Research On Unsupervised Clustering Algorithm And Support Vector Machine And Their Application
6	The Research Of Improved Particle Swarm Optimization Algorithm And Its Applications
7	The Particle Swarm Optimization And Research And Application Of The Support Vector Machine
8	Intelligent Group Optimization Algorithm PSO And Its Application In Several Types Models Optimization
9	Credit Scoring Model Based On Support Vector Machine And Particle Swarm Optimization
10	A Study On The Support Vector Machine Ensemble Learning Method Based On Particle Swarm Optimization