
Research On The Classification Algorithm Based On Density Cropping SVM Combined With AdaBoost-KNN

Posted on: 2020-02-05  Degree: Master  Type: Thesis
Country: China  Candidate: Z J Fan  Full Text: PDF
GTID: 2438330590462466  Subject: Computer technology
Abstract/Summary:
With the advent of big data, people face the problem of extracting the knowledge they need from massive data sets. This demand has made data mining a hot research topic, and data classification is an important part of it. An efficient classification algorithm analyzes the data, finds the relationships within it, and predicts the categories of the samples to be tested as accurately as possible. SVM has become a popular classification method because it handles non-linear and high-dimensional problems and resists overfitting.

However, the traditional SVM algorithm has three weaknesses. In the training stage, its computational cost grows with the number of samples, so training on massive data is expensive and slow. Noise in the training samples can shift the classification hyperplane and reduce classification accuracy. In the classification stage, test samples that lie near the hyperplane are prone to misclassification.

An improved algorithm is proposed to address these problems. First, the original training set is pre-processed according to density, and the processed set is used as the new training set. SVM parameter optimization is carried out on the new training set, and the optimal classification decision function is obtained by training. Then, in the test stage, each sample to be tested is passed to the decision function. If the value of the function is greater than a set threshold, the category predicted by the SVM is used as the category of the sample; otherwise, the AdaBoost.M1-KNN algorithm is used to predict it again. The pre-processing stage cuts away a large number of redundant training samples, including most of the noise samples, so the computational load of training is reduced and the classification hyperplane is placed more accurately, which improves the classification accuracy of the SVM. For samples near the SVM hyperplane that cannot be classified correctly, the AdaBoost.M1-KNN classifier is introduced to improve accuracy further.

Finally, the algorithm is applied to ten data sets from UCI. The experimental results show that, compared with the traditional SVM, the proposed algorithm improves classification accuracy and reduces SVM training time. The scheme is therefore feasible and effective.
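
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and several parts are assumptions: the abstract does not state the density measure or the cropping rule, so the sketch uses the inverse mean distance to the k nearest neighbours and drops a fixed fraction of the lowest-density samples; "greater than the set threshold" is read as applying to the absolute value of the decision function; a fixed linear function (the hypothetical `linear_decision`) stands in for a trained SVM; and plain KNN voting stands in for AdaBoost.M1-KNN. All function and parameter names are illustrative.

```python
import math
from collections import Counter


def knn_density(points, k):
    """Density proxy (assumption): inverse mean distance to the k nearest neighbours."""
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        densities.append(1.0 / (sum(nearest) / len(nearest) + 1e-12))
    return densities


def crop_by_density(X, y, k, drop_fraction):
    """Drop the lowest-density fraction of samples (assumed cropping rule)."""
    dens = knn_density(X, k)
    n_keep = len(X) - int(len(X) * drop_fraction)
    keep = sorted(range(len(X)), key=lambda i: dens[i], reverse=True)[:n_keep]
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]


def linear_decision(w, b):
    """Hypothetical stand-in for a trained SVM decision function f(x) = w.x + b."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


def knn_predict(train_X, train_y, x, k):
    """Plain majority-vote KNN, used here in place of AdaBoost.M1-KNN."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def hybrid_predict(decision_fn, x, train_X, train_y, threshold, k):
    """Trust the SVM far from the hyperplane; otherwise fall back to KNN."""
    value = decision_fn(x)
    if abs(value) > threshold:
        return 1 if value > 0 else -1
    return knn_predict(train_X, train_y, x, k)


# Toy data: two tight clusters plus one isolated, noisy sample at (9, 9).
X = [(2.0, 2.0), (2.2, 1.9), (1.9, 2.1), (2.1, 2.2),
     (-2.0, -2.0), (-2.1, -1.9), (-1.9, -2.2), (-2.2, -2.1),
     (9.0, 9.0)]
y = [1, 1, 1, 1, -1, -1, -1, -1, -1]

X_crop, y_crop = crop_by_density(X, y, k=2, drop_fraction=0.2)
f = linear_decision(w=(0.5, 0.5), b=0.0)
far_label = hybrid_predict(f, (3.0, 3.0), X_crop, y_crop, threshold=1.0, k=3)
near_label = hybrid_predict(f, (-0.3, -0.1), X_crop, y_crop, threshold=1.0, k=3)
```

In this toy setup the isolated sample is the one removed by cropping, the point far from the hyperplane is labelled by the sign of the decision value, and the point near the hyperplane is labelled by its neighbours. In practice the threshold, `k`, and the cropping fraction would need tuning per data set.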
Keywords/Search Tags: SVM, Density cutting, AdaBoost, KNN