Research On SVM Instance Selection Algorithm Based On Evolutionary Multi-objective Optimization

Posted on: 2021-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: J B Chen
GTID: 2428330629980424
Subject: Computer Science and Technology
Abstract/Summary:
Support vector machine (SVM), a common and effective classifier in machine learning, has been successfully applied to a wide range of classification tasks, from pattern mining to computer vision and from medical diagnosis to information retrieval. Although SVM has a solid theoretical foundation and good generalization performance, it also has shortcomings. One of them is its relatively high training complexity, which is O(n^2) or even O(n^3), where n is the number of instances in the training set. As more and more data becomes available, this problem becomes more prominent. To address it, a data preprocessing technique called instance selection has been proposed. Instance selection is one of the important data preprocessing techniques in machine learning and data mining. Its main purpose is to select a subset of the original data set and to learn from that subset a model whose accuracy is similar to or higher than that of a model learned from the full data set. Because only part of the training data is used, training also becomes more efficient.

Many SVM instance selection methods using different optimization techniques have therefore been proposed. Among them, evolutionary computation, a global search approach that makes no assumptions about the objective function or the training data, has become a research hotspot for instance selection. Existing instance selection algorithms based on evolutionary multi-objective optimization can obtain good classification accuracy or a good reduction rate. However, these algorithms achieve either higher accuracy with a lower reduction rate (number of deleted instances) or a higher reduction rate with lower accuracy. Obtaining a subset of instances with both a high reduction rate and high accuracy is therefore very important. To this end, this thesis proposes an evolutionary multi-objective optimization SVM instance selection algorithm based on subregion division (SDMOEA-TSS), which obtains an SVM training subset with both high accuracy and a high reduction rate.

On the other hand, when evolutionary multi-objective optimization is used to solve the SVM instance selection problem, evaluating each individual requires training an SVM classifier, and this is repeated over many iterations, so the computational efficiency of the algorithm is relatively low. Designing an efficient evolutionary instance selection algorithm that does not reduce the classification accuracy of the selected SVM instance subset is therefore also an important challenge in current research. This thesis accordingly proposes an efficient evolutionary multi-objective SVM instance selection algorithm based on clustering surrogate evaluation (CSE-IS), which reduces the number of real evaluations and improves the efficiency of the algorithm.

Based on the two SVM instance selection algorithms proposed above, the main work of this thesis consists of the following two parts:

(1) This thesis proposes an evolutionary multi-objective optimization SVM instance selection algorithm based on subregion division (SDMOEA-TSS). The main idea of the algorithm is to adaptively divide the solutions into subregions of the objective space, design different crossover and mutation operators for each subregion, and finally obtain a set of Pareto solutions. The algorithm mainly includes two strategies:
1. Initialization strategy based on subregion division: this strategy generates the initial population using different selection probabilities, and then assigns the individuals to their corresponding subregions in the objective space.
2. Evolutionary strategy based on subregions: different evolutionary operators are designed for the different subregions, mainly three: a subregion-based crossover operator, a subregion-based mutation operator, and a subregion-based adaptive update operator.
Compared with existing SVM instance selection algorithms based on evolutionary computation, SDMOEA-TSS achieves a better balance between convergence and diversity, and obtains both better accuracy and a higher reduction rate.

(2) This thesis proposes an efficient evolutionary multi-objective SVM instance selection algorithm based on clustering surrogate evaluation (CSE-IS). The main idea of the algorithm is to initialize the population based on clustering of the instance space, and to encode and cluster the individuals of the population during evolution. With surrogate evaluation, only half of the offspring individuals are evaluated by SVM training, and a set of Pareto solutions is finally obtained. The algorithm mainly includes two strategies:
1. Population initialization strategy based on instance space clustering: this strategy clusters the original instance data, selects a small number of instances from each cluster with a certain probability, and repeats this multiple times to construct the initial population.
2. Surrogate evaluation strategy based on clustering of individual encodings: this strategy clusters the encodings of the individuals in the offspring population and records, for each cluster, the parent individuals (with their fitness rankings) and the offspring individuals it contains. The fitness ranking of each offspring individual is then estimated from the fitness rankings of the parent individuals, and the better-ranked half of the offspring is selected for real evaluation by SVM training.
This algorithm effectively reduces the number of real evaluations on the training set and speeds up the search. The experimental results also show that CSE-IS not only greatly reduces training time but also obtains a subset of instances with better performance.
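The trade-off between accuracy and reduction rate described above can be made concrete with a small sketch. It assumes each candidate subset is encoded as a 0/1 mask over the training instances and that both objectives are maximized; the names `reduction_rate`, `evaluate`, `dominates`, `pareto_front`, and the callback `accuracy_fn` are illustrative and do not come from the thesis. In a real run `accuracy_fn` would train an SVM on the selected instances, which is exactly the expensive step the abstract refers to.

```python
from typing import Callable, List, Sequence, Tuple

Mask = Sequence[int]              # 1 = keep the training instance, 0 = delete it
Objectives = Tuple[float, float]  # (accuracy, reduction rate), both maximized


def reduction_rate(mask: Mask) -> float:
    """Fraction of the original training instances that the mask deletes."""
    return 1.0 - sum(mask) / len(mask)


def evaluate(masks: Sequence[Mask],
             accuracy_fn: Callable[[Mask], float]) -> List[Objectives]:
    """Score every candidate subset on both objectives.

    accuracy_fn is a hypothetical callback: in practice it would train an
    SVM on the selected instances and return its validation accuracy.
    """
    return [(accuracy_fn(m), reduction_rate(m)) for m in masks]


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse on both objectives and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points: Sequence[Objectives]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]
```

Calling `pareto_front(evaluate(masks, accuracy_fn))` returns the indices of the candidate subsets for which no other candidate is at least as accurate and at least as small, which is the kind of solution set both proposed algorithms report.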
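The idea of sending only half of the offspring to real SVM evaluation can be illustrated roughly as follows. This is not the thesis's procedure: as a simplifying assumption it replaces the clustering of individual encodings with a nearest-parent lookup under Hamming distance, and the function names `hamming` and `prescreen` are hypothetical. Each offspring inherits the fitness rank of its closest parent, and the better-ranked half is returned for real evaluation.

```python
from typing import List, Sequence

Mask = Sequence[int]  # 0/1 encoding of an instance subset


def hamming(a: Mask, b: Mask) -> int:
    """Number of positions at which two encodings differ."""
    return sum(x != y for x, y in zip(a, b))


def prescreen(parents: Sequence[Mask],
              parent_ranks: Sequence[int],
              offspring: Sequence[Mask]) -> List[int]:
    """Return the indices of the offspring half predicted to be best.

    parent_ranks[i] is the fitness rank of parents[i] (0 = best).
    Assumption: the nearest parent stands in for the cluster-based
    ranking described in the abstract.
    """
    predicted = []
    for child in offspring:
        nearest = min(range(len(parents)),
                      key=lambda i: hamming(parents[i], child))
        predicted.append(parent_ranks[nearest])
    # Stable sort by predicted rank, then keep the better half (rounded up).
    order = sorted(range(len(offspring)), key=lambda k: predicted[k])
    keep = (len(offspring) + 1) // 2
    return sorted(order[:keep])
```

Only the offspring whose indices are returned would be passed to SVM training, so the number of classifier fits per generation is roughly halved.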
Keywords/Search Tags: Instance selection, SVM, Evolutionary multi-objective optimization, Subregion division, Surrogate evaluation