Research on Support Vector Machine Based Classification Algorithms for Imbalanced Data

Posted on: 2018-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Liu
GTID: 2348330512477310
Subject: Circuits and Systems
Abstract/Summary:
Support Vector Machine (SVM) is a machine learning algorithm grounded in statistical learning theory; it has a solid theoretical basis and good classification performance. On imbalanced data, however, the separating hyperplane learned by SVM is skewed toward the minority class, which results in poor classification performance on that class. Many real-world classification problems are highly imbalanced. This thesis therefore proposes an improved SVM algorithm to address the classifier's poor performance on the minority class. The main contents of the thesis are:

1. It introduces the background of statistical learning theory and explains the basic principles and implementation process of the SVM algorithm.
2. Based on theoretical and experimental analysis, it identifies five factors of imbalanced data that affect the classification performance of SVM. It then analyzes two typical improved SVM methods drawn from domestic and international research, compares the performance of each algorithm experimentally, and summarizes the individual shortcomings of the two methods.
3. Building on existing SVM algorithms, it proposes a hybrid SVM algorithm, HSVM. By combining the adaptive synthetic sampling (ADASYN) algorithm with the different error cost (DEC) algorithm, HSVM overcomes the limitations of using either improved algorithm alone and reduces the hyperplane bias caused by imbalanced datasets.
4. To counter the negative effect of within-class imbalance, it proposes a new correction algorithm for the prediction model. Misclassified minority class subconcepts are picked out manually and added to the prediction model. When an input sample falls within one of these subconcepts, the predicted value is corrected so that the sample is assigned to the minority class wherever possible, which improves the prediction model's adaptability to different data characteristics.
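The two ingredients named in item 3 can be illustrated with a minimal pure-Python sketch. It is not the thesis's HSVM implementation, whose details the abstract does not give. The function names (`nearest_indices`, `adasyn`, `dec_costs`), the toy data, the `beta` oversampling level, and the rule of setting the minority penalty in inverse proportion to class size are all assumptions made for illustration. The sketch oversamples the minority class only part of the way in the ADASYN style, then computes DEC-style misclassification penalties for the imbalance that remains.

```python
import math
import random


def nearest_indices(i, points, k):
    """Indices of the (up to) k points closest to points[i], excluding i itself."""
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))
    return order[:k]


def adasyn(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples, ADASYN style (illustrative sketch)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min, n_maj = len(min_idx), len(y) - len(min_idx)
    total = int((n_maj - n_min) * beta)  # how many samples to synthesize
    # r_i: share of majority points among the nearest neighbours of each
    # minority point; a high r_i marks a point that is harder to learn.
    ratios = []
    for i in min_idx:
        nbrs = nearest_indices(i, X, k)
        ratios.append(sum(y[j] != minority for j in nbrs) / max(len(nbrs), 1))
    norm = sum(ratios)
    if total <= 0 or norm == 0:
        return []
    X_min = [X[i] for i in min_idx]
    synthetic = []
    for pos, r in enumerate(ratios):
        g = round(r / norm * total)  # harder points receive more samples
        nbrs = nearest_indices(pos, X_min, min(k, n_min - 1))
        if not nbrs:
            continue
        for _ in range(g):
            other = X_min[rng.choice(nbrs)]
            lam = rng.random()
            # Interpolate between the minority point and a minority neighbour.
            synthetic.append([a + lam * (b - a)
                              for a, b in zip(X_min[pos], other)])
    return synthetic


def dec_costs(y, minority=1, base_c=1.0):
    """DEC-style penalties: minority errors cost more than majority errors.

    The inverse-class-size ratio used here is an assumed heuristic.
    """
    n_min = sum(label == minority for label in y)
    n_maj = len(y) - n_min
    return {"C_minority": base_c * n_maj / n_min, "C_majority": base_c}


# Toy data (hypothetical): 3 minority points (label 1), 9 majority points (label 0).
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
     [1.0, 1.0], [1.1, 0.9], [0.9, 1.2], [1.2, 1.1], [0.4, 0.4], [0.5, 0.2],
     [1.3, 0.8], [0.8, 0.9], [1.0, 1.3]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

costs_before = dec_costs(y)
synthetic = adasyn(X, y, minority=1, k=3, beta=0.5)
y_resampled = y + [1] * len(synthetic)
costs_after = dec_costs(y_resampled)
print(len(synthetic), costs_before, costs_after)
```

Oversampling only part of the way (`beta=0.5`) leaves some imbalance for the cost weighting to absorb, which is one plausible reading of why a hybrid could avoid the drawbacks of relying on either technique alone. In practice the two penalties would be passed to an SVM solver that supports per-class costs.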
Keywords/Search Tags: Imbalanced dataset, Support Vector Machine, Adaptive Synthetic Sampling, Different Error Cost, Correction algorithm