
Research On Method And Application Of Fuzzy Support Vector Machine With Feature Selection

Posted on: 2022-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2518306542472054
Subject: Computer application technology
Abstract/Summary:
Support Vector Machine (SVM), developed from Statistical Learning Theory (SLT), is a method that uses optimization to solve machine learning (ML) problems. It performs very well in classification and prediction on small-sample, high-dimensional datasets, overcomes difficulties faced by traditional classification and prediction algorithms, and has excellent generalization performance. As a robust classification and regression algorithm, SVM currently plays an important role in pattern recognition, text classification, image classification, bioinformatics, handwritten character recognition, face detection, generalized predictive control, and other fields. This thesis studies the basic theory of SVM and, on that basis, combines kernel functions, fuzzy membership functions, cost-sensitive learning, and sparse learning to improve the SVM model and apply it to practical problems. The main research work is as follows:

(1) To address the sensitivity of SVM to class imbalance, a Fuzzy Linear Programming Support Vector Classifier using Kernel, Penalty factors and Feature Selection (KP-FLPSVC-FS) is proposed for classifying imbalanced data sets. The model uses cost-sensitive learning and introduces a class imbalance penalty factor into the SVM model, which reduces the impact of class imbalance on classification and improves classification accuracy. The model also proposes a reconstructed fuzzy kernel matrix, which combines the mean fuzzy membership function with the reconstructed kernel function. The reconstructed fuzzy kernel matrix reduces the influence of anomalous points such as noise and outliers on classification and makes the model more robust. The effectiveness of the KP-FLPSVC-FS model was verified through experiments on a drug discovery bioassay data set.

(2) To further improve the model's ability to reduce redundant features, this thesis proposes an Improved Trapezoidal Fuzzy Nonlinear Optimization Support Vector Classifier with Feature Selection (ITF-NOSVC-FS). The model extends the SVM algorithm: it improves overall classification performance on noisy data sets and also makes the model more interpretable. First, the model improves the standard trapezoidal fuzzy membership function by changing each sloped side of the trapezoid from a straight line to a broken (piecewise-linear) line, to better fit the data distribution. Based on the fuzzy membership value computed for each sample point, the sample points are divided into noise points and normal points, so that noise points and abnormal points can be removed. At the same time, the contribution or importance of each feature to the classification is obtained through 1-norm regularization of the weight vector, which makes the solution sparser and improves interpretability. In addition, the model applies 2-norm regularization to the error vector, which adds a penalty term to the model and improves its feature-reduction performance. Experiments on real data sets from the field of systems engineering show that the ITF-NOSVC-FS model has higher classification accuracy and wider adaptability.

(3) Building on (1) and (2), and to improve the classification accuracy of the ITF-NOSVC-FS model from (2) on class-imbalanced data sets, the cost-sensitive penalty factor method from (1) is introduced into that model, yielding a Nonlinear Optimization Support Vector Classifier using Improved Trapezoidal Fuzzification, Penalty factors and Feature Selection (ITFP-NOSVC-FS). Likewise, to improve the noise resistance of the KP-FLPSVC-FS model from (1), the improved trapezoidal fuzzy membership function is introduced into that model, yielding an Improved Trapezoidal Fuzzy Linear Programming Support Vector Classifier using Kernel, Penalty factors and Feature Selection (KP-ITFLPSVC-FS). The ITFP-NOSVC-FS and KP-ITFLPSVC-FS models are applied to the drug discovery bioassay data set, and the experimental results show that the improved models achieve good classification results.

Finally, the four classifiers proposed in (1), (2) and (3), all based on improved SVM methods, are comprehensively compared and analyzed from several aspects on the drug discovery bioassay data set. The analysis shows that the ITFP-NOSVC-FS model, which combines the improved trapezoidal fuzzy membership function with the penalty factor, has good classification performance.
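
The abstract names two ingredients, a trapezoidal fuzzy membership function and class imbalance penalty factors, without giving their formulas. The following is a minimal Python sketch of the general idea only, under stated assumptions: it uses the standard straight-sided trapezoid (not the thesis's improved broken-line version), an inverse-class-frequency heuristic for the penalty factors, and the common fuzzy SVM convention of multiplying each sample's slack penalty by its membership value. The function names, the breakpoints `a, b, c, e`, and the input scores are all hypothetical and are not taken from the thesis.

```python
import numpy as np

def trapezoidal_membership(score, a, b, c, e):
    """Standard trapezoidal membership (assumes a < b <= c < e).

    Returns 0 outside [a, e], 1 on [b, c], and a linear ramp on the two
    sloped sides. This is the textbook trapezoid, not the improved
    broken-line variant described in the abstract.
    """
    score = np.asarray(score, dtype=float)
    rising = (score - a) / (b - a)    # left sloped side
    falling = (e - score) / (e - c)   # right sloped side
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)

def class_penalty_factors(y, C=1.0):
    """Per-sample penalty that grows as the sample's class gets rarer.

    Assumption: inverse class frequency, n / (n_classes * class_count).
    The thesis may define its penalty factors differently.
    """
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    lookup = dict(zip(classes, weights))
    return C * np.array([lookup[label] for label in y])

def fuzzy_sample_penalties(membership, class_penalty):
    """Effective slack penalty per sample: membership times class penalty."""
    return np.asarray(membership) * np.asarray(class_penalty)

# Hypothetical 1-D scores for six samples and an imbalanced label vector.
scores = [0, 1, 2, 3, 4, 5]
labels = [1, 1, 1, 1, -1, -1]
s = trapezoidal_membership(scores, a=1, b=2, c=3, e=5)
p = class_penalty_factors(labels, C=1.0)
print(fuzzy_sample_penalties(s, p))
```

Samples with membership 0 receive no penalty and so behave as if removed, which mirrors the noise-removal role the abstract assigns to the membership function. The resulting per-sample penalties could be handed to any solver that accepts sample weights, for example the `sample_weight` argument of scikit-learn's `SVC.fit`.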
Keywords/Search Tags: Support vector machine, imbalanced data set, fuzzy membership function, feature selection, kernel function