
Modeling And Optimization Of Gene Microarray Data Classification Based On Intelligent Optimization

Posted on: 2019-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: X T Gao
Full Text: PDF
GTID: 2310330545993354
Subject: Control Science and Engineering
Abstract/Summary:
With the development of gene microarray technology, how to exploit gene microarray data to identify disease-causing genes, support genetic testing, enable the early detection and treatment of diseases, and characterize individual differences in disease gene expression has become an important research topic. Gene microarray datasets are typically high-dimensional with few samples, and on such data traditional machine learning methods struggle with the curse of dimensionality, overfitting, and local extrema. The support vector machine (SVM), an important achievement of statistical learning theory, replaces the traditional empirical risk minimization criterion with structural risk minimization and thereby mitigates these shortcomings. Building on two extensions of SVM, the Least Squares Support Vector Machine (LSSVM) and the Relevance Vector Machine (RVM), this thesis studies the application of statistical learning theory to gene microarray data. The main contributions are as follows:

1) The difference between feature extraction and feature selection in feature engineering is analyzed. Because the practical goal is to find pathogenic genes, the optimal feature subset is then selected by combining the filter and wrapper methods of feature selection: the filter finds the features that are strongly correlated with the classification, and the wrapper ranks them. The results show that the method is effective, and the classifiers obtain higher classification accuracy with fewer features.

2) LSSVM transforms the quadratic programming problem of SVM into a system of linear equations, which improves computational efficiency. Because the LSSVM penalty parameter and the RBF kernel bandwidth need to be optimized, PSO and FOA are combined to optimize these parameters. The results show that the proposed method outperforms the methods in the comparative literature in terms of feature dimension and classification accuracy.

3) RVM is a Bayesian extension of SVM. Because its solutions are sparser, RVM is more suitable than SVM for on-line detection. Moreover, only the hyper-parameters of RVM need to be optimized, and a simple DE algorithm is used for this purpose. To increase population diversity, ACO is used to generate perturbations in the DE algorithm, which increases the probability of finding the global optimum. In the analysis of the results, this method also outperforms the reference results.

Further comparisons between the proposed methods and other methods on multiple datasets indicate that the proposed methods are suitable for wider use.
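
To illustrate why LSSVM training is cheaper than standard SVM training, the sketch below fits an LSSVM classifier by solving a single linear system. It follows the standard textbook LSSVM formulation and is not taken from the thesis. The function names, the toy data, and the hand-picked values of the penalty parameter `gamma` and RBF bandwidth `sigma` are all illustrative assumptions. In the thesis these two parameters are tuned by PSO and FOA, which is not reproduced here.

```python
import math


def rbf_kernel(a, b, sigma):
    """RBF kernel exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_train(X, y, gamma, sigma):
    """Return (b, alpha) from the LSSVM linear system.

    The system is [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    with Omega[i][j] = y_i * y_j * K(x_i, x_j). Labels must be +1 or -1.
    """
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = y[i]
        A[i + 1][0] = y[i]
        for j in range(n):
            A[i + 1][j + 1] = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
        A[i + 1][i + 1] += 1.0 / gamma  # regularization on the diagonal
    solution = solve_linear(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def lssvm_predict(x, X, y, b, alpha, sigma):
    """Classify x with sign(sum_i alpha_i * y_i * K(x, x_i) + b)."""
    score = b + sum(
        a * yi * rbf_kernel(x, xi, sigma) for a, yi, xi in zip(alpha, y, X)
    )
    return 1 if score >= 0 else -1


# Toy two-class data (hypothetical, not gene expression data).
X_train = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
y_train = [-1, -1, 1, 1]
b, alpha = lssvm_train(X_train, y_train, gamma=10.0, sigma=1.0)
predictions = [
    lssvm_predict(x, X_train, y_train, b, alpha, sigma=1.0) for x in X_train
]
```

Training needs no iterative quadratic programming solver. The trade-off is that every training sample generally receives a nonzero coefficient in `alpha`, so the solution is not sparse.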
Keywords/Search Tags: gene microarray, least squares support vector machine, relevance vector machine, feature selection, parameter optimization