| Part 1: Application of an Autoencoder Based on the U-Net Network in the Diagnosis of Alzheimer’s Disease. Aim: Most current deep learning models used to diagnose Alzheimer’s disease (AD) rely on supervised learning. Supervised learning requires manually labeled training data, but for AD patients such data are difficult to collect, and the resulting shortage of samples may limit model performance. Building on previous work, this study aims to construct an AD diagnosis model based on a convolutional autoencoder (CAE) and the K-means clustering algorithm, so that the whole process is unsupervised. The model can also be used to explore the differences between original and generated images in patients with AD. Method: Patients admitted to the Memory Disorder Clinic of the First Affiliated Hospital of Zhejiang University School of Medicine from January 2017 to January 2020 with presumed Alzheimer’s disease were enrolled consecutively. Basic demographic data, neuropsychological scale assessments, and magnetic resonance data were recorded. According to clinical diagnosis, patients were divided into an Alzheimer’s disease (AD) group, a mild cognitive impairment (MCI) group, and a healthy control (HC) group. The baseline data of the three groups were statistically analyzed. A CAE based on the U-Net network was constructed and trained on structural magnetic resonance images of healthy people from the ADNI database. After training, the pre-processed original MRI images were used to generate reconstructed MRI images. The differences between original and reconstructed images were compared, and K-means clustering was performed on the error functions of the transverse and coronal planes to obtain the classification accuracy. Results: A total of 148 patients were included in this study: 25 in the AD group, 67 in the MCI group, and 56 in the HC group. By comparing the 
generated image with the original image, regions with large differences between the two images were highlighted. The highlighted regions were mostly located in the sulci, the ventricles, and the gaps left by hippocampal atrophy, and the highlighted area was larger in the AD group than in the HC group. Statistical analysis of the coronal and transverse error functions showed that the coronal error function decreased in the order AD, MCI, HC, and the differences between any two groups were statistically significant (P < 0.05). The transverse error function also decreased in the order AD, MCI, HC, with statistically significant differences between the AD and HC groups and between the AD and MCI groups (P < 0.01). After cluster analysis, the diagnostic accuracy for distinguishing the AD group from the HC group was 88.89%. Conclusion: A deep learning model trained on the normal population can be used to generate images for other patients, and the generation error between original and generated images enables whole-process unsupervised learning and achieves good accuracy in classification diagnosis. Part 2: Establishment and Feature Selection of an Alzheimer’s Disease Diagnosis Model Based on Machine Learning. Aim: Interpretability in machine learning (ML) modeling has long been a challenge, which can make it difficult for clinicians to understand the results of machine learning predictions. In addition, different machine learning methods suit different amounts of data and numbers of features, so an appropriate algorithm must be chosen for each problem. Building on previous work, this study aims to compare the binary and three-class classification performance of different linear and nonlinear classifiers on AD versus HC, MCI versus HC, and AD versus MCI versus HC; to identify the best-performing classifier; to carry out feature selection on the data; and to compare the significance of 
feature selection results with clinical practice. Method: Patients admitted to the Memory Disorder Clinic of the First Affiliated Hospital of Zhejiang University School of Medicine between January 2017 and December 2020 with presumed Alzheimer’s disease were enrolled consecutively. Basic demographic data, neuropsychological scale assessments, magnetic resonance data, and genotyping results were recorded. According to clinical diagnosis, patients were divided into an Alzheimer’s disease (AD) group, a mild cognitive impairment (MCI) group, and a healthy control (HC) group. The baseline data of the three groups were statistically analyzed, and 15 machine learning models were built. The recorded variables were used to classify AD versus HC, MCI versus HC, and AD versus MCI versus HC. The best model was selected by considering the evaluation metrics as a whole, and feature importance ranking and feature screening were then carried out. Results: A total of 251 patients were included in the study: 64 in the AD group, 109 in the MCI group, and 78 in the HC group. In the binary classification of the AD and HC groups, the random forest and extremely randomized trees (extra trees) classifiers achieved the highest accuracy, 98.00%, and 13 features were selected through feature screening. The most important features were the MoCA score, the MMSE score, and the CDR total score. For the MCI and HC groups, the random forest classifier achieved the highest accuracy, 90.77%, and 19 features were selected through feature screening. The most important features were the MoCA score, cerebrospinal fluid volume, and white matter volume. In the three-class classification of the AD, MCI, and HC groups, the random forest classifier achieved the highest accuracy, 90.33%, and the most important features were the MoCA score, the MMSE score, and cerebrospinal fluid volume. Conclusion: Compared with other machine learning methods, the random forest can 
adapt to prediction tasks with a small sample size and a small number of features. In addition, the results of feature importance ranking and feature selection agree well with clinical experience. Feature engineering can provide quantitative indicators when searching for new diagnostic markers and makes the model more interpretable. |
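
The clustering step in Part 1 can be illustrated with a minimal sketch. It assumes each subject is summarized by two numbers, a transverse and a coronal reconstruction error, and that two clusters are sought (AD versus HC). The abstract does not state the K-means settings, so the function name `kmeans_two_clusters`, the deterministic seeding rule, and all error values below are hypothetical and serve only to show the idea; the real study would work on errors produced by the trained CAE.

```python
import math


def kmeans_two_clusters(points, max_iter=100):
    """Plain Lloyd's algorithm with k=2 on (transverse_error, coronal_error) pairs."""
    # Assumed seeding rule: start from the subjects with the smallest and the
    # largest total error, so cluster 0 is "low error" and cluster 1 "high error".
    ordered = sorted(points, key=sum)
    centroids = [ordered[0], ordered[-1]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each subject joins the nearest centroid.
        labels = [
            min((0, 1), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return labels, centroids


# Made-up reconstruction errors, NOT data from the study.
hc_errors = [(0.010, 0.012), (0.011, 0.010), (0.009, 0.011), (0.012, 0.013)]
ad_errors = [(0.020, 0.024), (0.022, 0.021), (0.019, 0.023), (0.021, 0.022)]
errors = hc_errors + ad_errors
true_labels = [0] * len(hc_errors) + [1] * len(ad_errors)  # 0 = HC, 1 = AD

labels, centroids = kmeans_two_clusters(errors)
# The clinical labels are used only afterwards, to score the clusters.
accuracy = sum(p == t for p, t in zip(labels, true_labels)) / len(true_labels)
```

Because the model is trained only on healthy subjects, a higher reconstruction error is read here as the AD-like cluster; clinical labels play no part in the clustering itself, which is what keeps the pipeline unsupervised.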