
Research On Image Genetics Data Mining Method For Alzheimer’s Disease And Its Application

Posted on: 2023-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Qiu
GTID: 2544306839968279
Subject: Software engineering
Abstract/Summary:
Alzheimer’s disease (AD), commonly known as senile dementia, is a common neurodegenerative disease that causes cognitive decline in the elderly. AD is also characterized by high disability and mortality rates, so early detection, early prevention and early treatment are key to managing it. The "curse of dimensionality" of genome-wide high-dimensional data and the diversity of imaging genetics data pose challenges for AD research. In recent years, significant improvements in computing performance and the rise of machine learning and related technologies have advanced this research, pointing the way toward further mining the structural information shared by AD genes and images, analyzing the correlation between them, and revealing the pathogenesis of the disease. To address the shortcomings of previous studies, such as weak correlations, insufficient biological significance and the low computational efficiency of overly complex models, this thesis uses data-driven methods such as machine learning, supplemented by regularization techniques, to study three aspects of imaging genetics: the impact of multi-locus variants on candidate brain regions; the impact of multi-locus variants on phenotypes; and disease classification and diagnosis. The main work and innovations of this thesis are as follows:

(1) The basic concepts of imaging genetics are introduced, the current state of imaging genetics research domestically and internationally is outlined, and the mainstream imaging genetics analysis techniques are summarized.

(2) An L0-based regularization model is proposed for genome-wide association analysis of genetic variants and phenotypes. The model consists of three parts. The first is the empirical risk of the regression model, which keeps the predicted values as close as possible to the true values. The second is the L0 regularization term, whose regularization parameter controls the number of feature SNPs with non-zero coefficients and thus performs feature selection. The third is the L2 regularization term, which helps prevent overfitting. Experimental results show that the proposed method significantly outperforms the group sparse multitask regression model in regression accuracy. In the ADNI genome-wide analysis, after the effects of all variants were annotated with the Ensembl Variant Effect Predictor (VEP), the method located 33 missense variants that explain 40.1% of the phenotypic variance. Each variant locus was then mapped to its nearest gene, and pathway enrichment analysis was performed. The Notch signaling pathway and the apoptosis pathway have been reported to be related to the development of Alzheimer’s disease.

(3) A diagnostic method for Alzheimer’s disease based on multi-objective optimization is proposed to explore the relationship between susceptibility genes and phenotypes and to support the diagnostic classification of AD. The model consists of three parts. First, VEP-based variant annotation identifies deleterious variant loci and maps them to the nearest protein-coding genes to enhance biological significance. Second, a multi-objective evaluation strategy based on entropy theory ranks all candidate genes. Finally, XGBoost classifies the imbalanced data set of 46 AD samples, 483 MCI samples and 279 CN samples. Experimental results show that the proposed method not only achieves satisfactory classification performance compared with traditional classification models, but also finds a significant correlation between AD and the known AD susceptibility gene RIN3. In addition, pathway enrichment analysis using the top 20 feature genes confirmed that three pathways were significantly associated with the development of AD.
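The three-part model in (2) can be illustrated with a small sketch. The abstract does not give the objective's exact form or the solver, so everything here is an assumption: the L0 term is written as a hard constraint of at most `k` non-zero coefficients, the solver is iterative hard thresholding (a standard technique swapped in for illustration), and the names `fit_l0_l2`, `hard_threshold`, `k` and `lam2` are hypothetical. The data are synthetic, not ADNI data.

```python
import random

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and set the rest to zero."""
    keep = set(sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)[:k])
    return [w[j] if j in keep else 0.0 for j in range(len(w))]

def fit_l0_l2(X, y, k, lam2=0.01, step=0.1, iters=300):
    """Iterative hard thresholding (an assumed solver, not the thesis's) for
    min (1/2n)||y - Xw||^2 + (lam2/2)||w||^2  subject to  ||w||_0 <= k."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iters):
        # Residuals of the current fit: the empirical-risk part.
        resid = [sum(X[i][j] * w[j] for j in range(p)) - y[i] for i in range(n)]
        # Gradient of the squared loss plus the L2 penalty.
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n + lam2 * w[j]
                for j in range(p)]
        # Gradient step, then projection onto vectors with at most k non-zeros.
        w = hard_threshold([w[j] - step * grad[j] for j in range(p)], k)
    return w

# Synthetic toy data: 10 hypothetical "SNP" features, only two of which matter.
random.seed(0)
n, p = 200, 10
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_w = [0.0] * p
true_w[2], true_w[7] = 1.5, -2.0
y = [sum(X[i][j] * true_w[j] for j in range(p)) + random.gauss(0, 0.1)
     for i in range(n)]

w_hat = fit_l0_l2(X, y, k=2)
selected = [j for j, v in enumerate(w_hat) if v != 0.0]
print("selected features:", selected)
```

In this sketch `k` plays the role the abstract gives to the L0 regularization parameter (how many features survive), while `lam2` shrinks the surviving coefficients slightly toward zero.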
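The gene-ranking step in (3) is described only as "a multi-objective evaluation strategy based on entropy theory". One common scheme of that kind is the entropy weight method, sketched below as an assumption about what such a strategy could look like, not as the thesis's actual procedure. The gene names, the criteria columns and the function names are hypothetical.

```python
import math

def entropy_weights(scores):
    """Entropy weight method: criteria whose values vary more across genes
    (lower normalized entropy) receive larger weights.
    scores: one row per gene, each a list of non-negative criterion values.
    Assumes every column has a positive sum and at least one is non-uniform."""
    m, c = len(scores), len(scores[0])
    raw = []
    for j in range(c):
        col = [row[j] for row in scores]
        total = sum(col)
        probs = [v / total for v in col]
        entropy = -sum(q * math.log(q) for q in probs if q > 0) / math.log(m)
        raw.append(max(0.0, 1.0 - entropy))  # clamp tiny negative rounding error
    norm = sum(raw)
    return [r / norm for r in raw]

def rank_genes(names, scores):
    """Combine max-scaled criteria with entropy weights and sort genes by score."""
    weights = entropy_weights(scores)
    col_max = [max(row[j] for row in scores) for j in range(len(weights))]
    combined = {name: sum(w * v / mx for w, v, mx in zip(weights, row, col_max))
                for name, row in zip(names, scores)}
    return sorted(combined, key=combined.get, reverse=True), weights

# Hypothetical candidates scored on two made-up criteria (higher is better).
names = ["GENE_A", "GENE_B", "GENE_C"]
scores = [
    [1.0, 0.1],
    [1.0, 0.7],
    [1.0, 0.2],
]
ranking, weights = rank_genes(names, scores)
print(ranking, weights)
```

The first criterion is identical for every gene, so it carries no information and its weight collapses to zero; the ranking is then driven entirely by the second criterion.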
Keywords/Search Tags: Alzheimer’s disease, imaging genetics, machine learning, feature selection, regularization, association analysis