Research On Fuzzy Support Vector Machine Algorithm For Class Imbalance Learning

Posted on: 2015-09-20  Degree: Master  Type: Thesis
Country: China  Candidate: C X Gao  Full Text: PDF
GTID: 2298330422987418  Subject: Computer application technology
Abstract/Summary:
The support vector machine (SVM) is a popular machine learning technique. It handles practical problems involving small samples, nonlinearity, and local minima, and it classifies balanced datasets efficiently. On imbalanced datasets, however, SVM produces suboptimal classification models, and the algorithm is sensitive to isolated samples and random noise. Existing class imbalance learning methods can weaken the class imbalance problem, but they still suffer from random noise and isolated samples. In addition, SVM training requires tuning a number of parameters, so learning an SVM model takes a long time.

To address these issues, this thesis first proposes fuzzy clustering methods for imbalanced datasets. When the degree of imbalance of the sample set is not too large, the kernel-based possibilistic fuzzy c-means clustering algorithm (KPFCM) is effective for clustering imbalanced datasets. By coordinating typicality values with fuzzy membership values, the algorithm improves its robustness to imbalance and noise. A Gaussian kernel parameter optimization method is also presented for choosing the parameters of kernel clustering. When the imbalance ratio is larger and the fuzzy cluster centers shift seriously, this thesis combines an oversampling technique with the KFCM fuzzy clustering algorithm to cluster imbalanced datasets.

Second, building on this research into fuzzy clustering for imbalanced datasets, an imbalanced fuzzy support vector machine classification algorithm based on kernel clustering, FPSVM-CIL, is put forward to classify imbalanced data in the presence of random noise and isolated samples. First, by setting thresholds on the fuzzy membership and typicality values, the algorithm reduces the imbalance ratio of the sample set. Then the fuzzy values obtained by kernel clustering are combined linearly with different penalty coefficients for class imbalance learning. Finally, the combination is introduced into the fuzzy support vector machine model as the fuzzy membership values. Experimental results on artificial and real datasets show that FPSVM-CIL achieves good classification performance on imbalanced datasets and is strongly robust to random noise.

Finally, based on an analysis of the support vector machine and extreme learning machine (ELM) models, this thesis proposes ELM-CIL, an approximate method for the imbalanced fuzzy support vector machine. ELM-CIL takes advantage of the faster learning speed of the extreme learning machine. It improves the traditional extreme learning machine model by introducing different fuzzy values and penalty coefficients into the model according to the sample distribution. Experimental results show that ELM-CIL achieves minority class classification accuracy equivalent to that of the SVM algorithm while learning faster. The algorithm is particularly suitable for classifying large-scale imbalanced sample sets.
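The general idea behind ELM-CIL described above, per-sample fuzzy values and class-dependent penalty coefficients entering an extreme learning machine, can be illustrated with a minimal NumPy sketch. This is not the thesis's algorithm: the abstract does not give its formulas, so the membership function (decay with distance from the class mean, standing in for kernel clustering), the penalty rule (inverse class frequency), and all names such as `fuzzy_weights` and `train_weighted_elm` are assumptions made for illustration. Labels are assumed to be -1 (majority) and +1 (minority).

```python
import numpy as np

def fuzzy_weights(X, y, eps=1e-6):
    """Hypothetical per-sample weight = class penalty * fuzzy membership.

    Membership decays linearly with distance from the class mean (an assumed
    stand-in for clustering-based fuzzy values); the penalty is inversely
    proportional to class size, so the minority class is weighted more.
    """
    w = np.empty(len(y), dtype=float)
    for label in np.unique(y):
        mask = y == label
        dist = np.linalg.norm(X[mask] - X[mask].mean(axis=0), axis=1)
        membership = 1.0 - dist / (dist.max() + eps)  # far-out samples get low values
        penalty = len(y) / (2.0 * mask.sum())         # larger for the smaller class
        w[mask] = penalty * membership
    return w

def train_weighted_elm(X, y, weights, n_hidden=40, C=10.0, seed=0):
    """Weighted regularized ELM: random hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))    # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W_in + b)                         # hidden-layer outputs
    HtW = H.T * weights                               # H^T diag(weights)
    # Solve (H^T W H + I/C) beta = H^T W y
    beta = np.linalg.solve(HtW @ H + np.eye(n_hidden) / C, HtW @ y)
    return W_in, b, beta

def predict(model, X):
    W_in, b, beta = model
    return np.where(np.tanh(X @ W_in + b) @ beta >= 0.0, 1, -1)

# Toy imbalanced data: 200 majority samples, 20 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(5.0, 1.0, (20, 2))])
y = np.concatenate([-np.ones(200), np.ones(20)])

weights = fuzzy_weights(X, y)
model = train_weighted_elm(X, y, weights)
pred = predict(model, X)
```

Because the hidden layer is random and only the output weights are solved for in closed form, training reduces to one linear solve, which is the source of the speed advantage the abstract attributes to the extreme learning machine. The weights only change how much each sample's error counts in that solve.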
Keywords/Search Tags: support vector machine, kernel clustering, class imbalance learning, possibilistic fuzzy, extreme learning machine