Genetic Algorithm Based Composite Kernel Partial Least Square In Disease Prediction And Classification With Genomic Data

Posted on:2017-01-01

Degree:Doctor

Type:Dissertation

Country:China

Candidate:H T Yang

Full Text:PDF

GTID:1224330503963230

Subject:Epidemiology and Health Statistics

Abstract/Summary:

Objective: Advances in biotechnology now generate vast amounts of genomic data. Prediction and classification based on these data offer a cost-effective and time-efficient route to early disease screening. However, the relationship between genes and a trait can be very complex: the conversion from gene to phenotype is not a simple function of individual genes but involves complex interactions among many genes, and should therefore be treated as a nonlinear mapping problem. In this context, it is important to develop powerful and efficient statistical models that can capture potential nonlinear relationships. In this dissertation, we develop a kernel partial least squares (KPLS) based prediction method that combines multiple genomic data sources to provide more information and thereby improve prediction and classification performance. The proposed method is expected to have better learning capacity and generalization ability.

Methods: First, we construct a classical kernel partial least squares model. We then compute a new composite kernel function as a convex combination of multiple kernel functions. Finally, we replace the kernel function in the classical KPLS model with the new composite kernel function. All parameters of the composite kernel partial least squares model are optimized using a genetic algorithm. With an appropriately constructed composite kernel function, the method can handle prediction or classification problems with either a single genomic data source or multiple genomic data sources. Its performance is demonstrated by simulations and real data analysis.

Results: Extensive simulation studies and real data analysis show that the proposed genetic algorithm based composite kernel partial least squares model has the largest Q²_F1 and the smallest RMSEP among the compared methods when predicting a quantitative trait from a single genomic data source. It also has the largest Youden index values and the smallest classification error when distinguishing triple-negative from non-triple-negative breast cancer patients using three genomic data sources, i.e., microRNA, mRNA and CNVs, obtained from the TCGA website.

Conclusion: We proposed a composite kernel approach based on the KPLS prediction framework; the composite kernel has good learning capacity as well as generalization ability. We also proposed a composite kernel approach based on the KPLS classification framework; the composite kernel can efficiently fuse multiple genomic data sources and achieve improved performance. A genetic algorithm can be used to solve the optimization problem for the kernel parameters and kernel weights.
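
To make the Methods description concrete, the following is a minimal Python sketch of the general idea, not the dissertation's implementation. It builds a composite kernel as a convex combination of one Gaussian (RBF) kernel per data source, plugs it into a single-response KPLS regression, and tunes the kernel bandwidths and weights with a very small genetic algorithm. Everything specific here is an assumption made for illustration: the choice of RBF kernels, the chromosome encoding in `decode`, the GA operators (keep the best half, uniform crossover, Gaussian mutation), the validation-RMSE fitness, the synthetic two-source data, and all function names.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def composite_kernel(blocks_a, blocks_b, gammas, weights):
    """Convex combination of one RBF kernel per data source."""
    return sum(w * rbf_kernel(a, b, g)
               for w, a, b, g in zip(weights, blocks_a, blocks_b, gammas))

def decode(chrom, m):
    """Hypothetical encoding: m log10-bandwidths, then m raw weight scores."""
    gammas = 10.0 ** np.clip(chrom[:m], -3, 2)
    z = np.exp(chrom[m:] - chrom[m:].max())
    return gammas, z / z.sum()          # softmax -> non-negative, sums to 1

def kpls_fit_predict(K_train, y_train, K_test, n_components=2):
    """Kernel PLS regression for a single response (NIPALS-style deflation)."""
    n = K_train.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    one_t = np.full((K_test.shape[0], n), 1.0 / n)
    # Centre both kernel matrices in feature space.
    Kc = K_train - one_n @ K_train - K_train @ one_n + one_n @ K_train @ one_n
    Kt = K_test - one_t @ K_train - K_test @ one_n + one_t @ K_train @ one_n
    y_mean = y_train.mean()
    y0 = (y_train - y_mean).reshape(-1, 1)
    Kd, yd, T, U = Kc.copy(), y0.copy(), [], []
    for _ in range(n_components):
        u = yd / np.linalg.norm(yd)     # with one response, u is the residual
        t = Kd @ u
        t /= np.linalg.norm(t)          # unit-norm score vector
        T.append(t)
        U.append(u)
        P = np.eye(n) - t @ t.T         # deflate kernel and response
        Kd = P @ Kd @ P
        yd = yd - t @ (t.T @ yd)
    T, U = np.hstack(T), np.hstack(U)
    coef = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ y0)
    return (Kt @ coef).ravel() + y_mean

def ga_optimize(fitness, dim, rng, pop_size=20, generations=15, sigma=0.3):
    """Tiny GA: keep the best half, uniform crossover, Gaussian mutation."""
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        k = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=k)]
        pb = elite[rng.integers(len(elite), size=k)]
        mask = rng.random(pa.shape) < 0.5
        children = np.where(mask, pa, pb) + rng.normal(scale=sigma, size=pa.shape)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]

# Synthetic stand-in for two genomic data sources (illustrative only).
rng = np.random.default_rng(0)
n, m = 90, 2
X1, X2 = rng.normal(size=(n, 5)), rng.normal(size=(n, 5))
y = np.sin(X1[:, 0]) + X2[:, 0] ** 2 + 0.1 * rng.normal(size=n)
tr, va = np.arange(60), np.arange(60, n)

def fitness(chrom):
    """Negative validation RMSE of KPLS with the decoded composite kernel."""
    gammas, w = decode(chrom, m)
    K_tr = composite_kernel([X1[tr], X2[tr]], [X1[tr], X2[tr]], gammas, w)
    K_va = composite_kernel([X1[va], X2[va]], [X1[tr], X2[tr]], gammas, w)
    pred = kpls_fit_predict(K_tr, y[tr], K_va)
    rmse = np.sqrt(np.mean((pred - y[va]) ** 2))
    return -rmse if np.isfinite(rmse) else -np.inf

best = ga_optimize(fitness, 2 * m, rng)
best_gammas, best_weights = decode(best, m)
print("bandwidths:", best_gammas, "weights:", best_weights,
      "validation RMSE:", -fitness(best))
```

Because each RBF kernel is positive semidefinite and the weights are non-negative and sum to one, the combined matrix is itself a valid kernel, which is what allows it to replace the single kernel in KPLS without other changes. For a classification task, the same sketch could be adapted by coding the class labels numerically and thresholding the prediction, with a fitness such as the Youden index in place of RMSE.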

Keywords/Search Tags:

Kernel fusion, Kernel partial least squares, Genetic algorithm, Nonlinear prediction, Quantitative trait prediction

Related items

1	Prediction Of Potential Association Between LncRNA And Disease Based On Similarity Kernel Fusion
2	Research On Survival Prediction Of Breast Cancer Based On SVM
3	Cancer Grade Prediction And Pathway Analysis Based On Improved Multiple Kernel Learning Algorithm
4	Quantitative Prediction Of Drug Dissociation Rate Constants(K_off)
5	Cloning And Functional Analysis Of ZmRPC2 Regulating Kernel Development In Maize
6	Prediction Of MiRNA Disease Potential Association Based On Q Kernel Similarity And Matrix Completion
7	Research And Application Of FCM Based Multi-Kernel Support Vector Machine
8	Breast Cancer Subtypes Prediction Based On Improved Multiple Kernel Learning Algorithm
9	Model Fit Based On Genetic Algorithm And Its Medical Applications
10	Kernel Principal Component Regression Method On Feature Extraction And Prediction Also Its Application In Medicine