Interpretable Identification Framework Construction And Prognostic Risk Assessment For Mild Cognitive Impairment Subtypes

Posted on: 2024-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: F L Yi
GTID: 2544307148981529
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: Mild cognitive impairment (MCI) is commonly considered a precursor stage to dementia. Depending on the cognitive domains affected, MCI can be divided into subtypes, including amnestic MCI (aMCI) and non-amnestic MCI (naMCI). Different subtypes may progress to different endpoints, so accurately identifying MCI subtypes and quantifying the risk of progression is important for developing personalized interventions and advancing MCI research. Given the difficulties of manual diagnosis, this study proposes two strategies for identifying and interpreting MCI subtypes, aiming to improve the efficiency and transparency of subtype identification models. Because MCI progression has multiple endpoints, the study uses a competing risk model to obtain an unbiased estimate of the risk of progression from MCI to Alzheimer’s disease (AD), and establishes a personalized multi-time-point prediction mechanism to reveal potential factors in progression to AD and support precise treatment for MCI patients.

Methods: Data were obtained from the National Alzheimer’s Coordinating Center (NACC) database. The study included MCI patients first diagnosed between November 2005 and April 2018, divided into two categories: amnestic MCI (aMCI) and non-amnestic MCI (naMCI). Patient demographics, neuropsychological tests, and structural magnetic resonance imaging (sMRI) were used as features. Three feature selection methods (maximum relevance minimum redundancy, elastic net, and Boruta) were applied to reduce the dimensionality of the neuropsychological test and sMRI features, and the intersection of the features selected by the three methods was used for modeling. To identify aMCI and naMCI, the study used multiblock sparse partial least squares discriminant analysis (Multiblock sPLS-DA) and six popular machine learning (ML) algorithms: Naive Bayes (NB), Support Vector Machine (SVM), Bootstrap Aggregating (Bagging), Random Forest (RF), Adaptive Boosting (AdaBoost), and Extreme Gradient Boosting (XGBoost). SHapley Additive exPlanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), and the model-agnostic language for exploration and explanation (DALEX) were used to explain the models’ decisions for individual patients (such as patient 28 and patient 34) and to analyze the direction and magnitude of the effects of important features on the outcome, providing a global interpretation. The endpoint event of interest was defined as MCI progression to AD within five years, and the competing event as MCI progression to other types of dementia. The study quantified the risk of MCI patients progressing to AD with a competing risk model and predicted the probability of AD progression at different time points for a given individual (such as patient 2) to evaluate the stability of the model’s predictions across time points.

Results: The feature selection procedure retained 85 features, of which 24 were neurocognitive test features and 61 were sMRI features. Across the two identification strategies, XGBoost showed the best overall performance for MCI subtype identification (AUC = 0.8837). The other ML models reached AUCs of 0.8347 (SVM), 0.8351 (AdaBoost), 0.8308 (RF), 0.8072 (Bagging), and 0.7912 (NB). Multiblock sPLS-DA had an overall performance (AUC = 0.8419) slightly lower than XGBoost but higher than the other ML models. For personalized interpretation, XGBoost-SHAP, XGBoost-LIME, and XGBoost-DALEX gave consistent predicted probabilities of aMCI for subjects 28 and 34, which were 0.1 and 0.7, respectively. For subject 28, the seven explanatory features common to the three interpretation algorithms were CEREALL = 1073.33, CDRSUM = 0.00, ORIENT = 0.00, RENT = 4.24, CRAFTVRS = 17.00, RLATOCCM = 2.01, and RINSULAM = 2.65, and the direction of each feature’s effect was consistent across the three algorithms. For subject 34, the five common explanatory features were CDRSUM = 0.00, TRAILB = 67.00, CRAFTVRS = 4.00, LPRECENM = 1.95, and RINSULAM = 2.49, again with consistent effect directions across the three algorithms. For global interpretation, the five features common to Multiblock sPLS-DA, XGBoost-SHAP, XGBoost-LIME, and XGBoost-DALEX were CEREALL, ORIENT, CRAFTVRS, CDRSUM, and RENT, and the direction of each feature’s effect was consistent across the four algorithms. In the competing risk model, higher ORIENT and TAXES scores were identified as risk factors for MCI progression to AD, whereas higher LOGIMEM and CRAFTURS scores and larger WMHVOL were identified as protective factors. For subject 2, the predicted probabilities of progression to AD at months 30, 40, and 50 were 0.513, 0.667, and 0.741, respectively. The AUCs of the model at months 30, 40, and 50 were 0.724 (0.654, 0.794), 0.724 (0.653, 0.795), and 0.719 (0.646, 0.792), respectively, and the calibration curves were also stable.

Conclusion: Using neurocognitive tests and sMRI features as the data basis, this study proposed two strategies for MCI subtype identification and interpretation and reached the following conclusions: (1) XGBoost performed best in MCI subtype identification, while Multiblock sPLS-DA achieved good identification results owing to its strong capabilities in feature sparsity, integration, classification, and interpretability; (2) XGBoost combined with the SHAP, LIME, and DALEX explanation techniques gave consistent local interpretations for given individuals; (3) Multiblock sPLS-DA, XGBoost-SHAP, XGBoost-LIME, and XGBoost-DALEX shared a number of common features in their global interpretation results. Furthermore, the study evaluated the prognostic risk of MCI patients with a competing risk model and concluded that its analytical efficiency was superior to that of traditional models: it provided more accurate personalized predictions of individual risk at multiple time points, with only small changes in the AUC and calibration curves and good model stability.
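The abstract does not state which competing risk model or software was used. As a minimal sketch of the underlying idea only, the following Python function computes the nonparametric (Aalen-Johansen) cumulative incidence of one event type when a competing event may occur first. The function name, the event coding (0 = censored, 1 = progression to AD, 2 = progression to another dementia), and the toy follow-up data are illustrative assumptions; they are not the thesis's implementation and not NACC data.

```python
def cumulative_incidence(times, events, cause):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    times  : follow-up time per subject (e.g. months)
    events : assumed coding, 0 = censored, 1 = event of interest,
             2 = competing event
    Returns a list of (event_time, cumulative_incidence) pairs.
    """
    # Distinct times at which any event (of either type) occurred.
    event_times = sorted({t for t, e in zip(times, events) if e != 0})

    surv = 1.0   # probability of being free of ALL events just before t
    cif = 0.0    # running cumulative incidence for the chosen cause
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        d_any = sum(1 for u, e in zip(times, events) if u == t and e != 0)
        # Only subjects still free of every event can experience `cause` at t.
        cif += surv * d_cause / at_risk
        # Any event type removes a subject from the event-free pool.
        surv *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve


# Synthetic toy data (hypothetical): five subjects, times in months.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
for t, p in cumulative_incidence(toy_times, toy_events, cause=1):
    print(f"month {t}: estimated probability of progression to AD = {p:.3f}")
```

In this estimator, a subject who has the competing event leaves the event-free pool instead of being treated as censored. Treating competing events as censoring and reporting one minus the Kaplan-Meier estimate would overstate the probability of the event of interest, which is the bias a competing-risk analysis is meant to avoid.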
Keywords/Search Tags: Alzheimer’s disease, Mild cognitive impairment, Subtype identification, Risk assessment