| Objective: This study aims to compare the type I error and power of a Variational Dropout deep learning model and SKAT using simulated data of different sample sizes. We also simulated combinations of different sample sizes and numbers of variables, and built a deep learning model using cross-validation and Dropout. We then optimized this deep learning model with the Variational Dropout method and compared the computational efficiency and memory usage of the two models. Finally, we applied the Variational Dropout model to median T2* measurements of the caudate nucleus in UK Biobank to explore predictive gene regions associated with neurodegenerative diseases such as Parkinson's disease and Alzheimer's disease. Methods: Data were simulated by the Monte Carlo method with different sample sizes (N = 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000) and variables with different predictive effects (linear additive and non-linear). We constructed a Variational Dropout deep learning model and a SKAT model, then applied both to the simulated data to calculate and compare their type I error and power. In addition, we simulated data with different combinations of sample size (N = 500, 5000, 10000) and number of variables (M = 10, 1000, 10000) to compare the running time and memory usage of the general deep learning model and the Variational Dropout deep learning model. For the real data application, we used median T2* measurements of the caudate nucleus and genotype data from the UK Biobank database and applied the Variational Dropout deep learning model to select predictive regions associated with disease. Results: The simulations showed that both the Variational Dropout deep learning model and the SKAT model effectively controlled type I error at different significance levels across sample sizes. When the predictive effect was linear, the power of the two models was essentially equal, even with growing sample
sizes. By contrast, the Variational Dropout deep learning model significantly outperformed the SKAT model when the predictive effect was non-linear. In terms of computational efficiency and memory usage, the Variational Dropout deep learning model also performed better than the general deep learning model. With a larger sample size (N = 10000) and number of variables (M = 10000), the general deep learning model needed 260.55 (95% CI: 257.79, 263.31) minutes and 5.36 (95% CI: 5.28, 5.43) GB, whereas the Variational Dropout deep learning model needed only 6.33 (95% CI: 6.28, 5.38) minutes and 1.97 (95% CI: 1.96, 1.98) GB. For the real data application, a total of 7352 individuals and 21712 genes remained in our analyses. We applied the Variational Dropout deep learning model to screen the data, and three genes fell below the significance level (1 × 10⁻⁵): 3912 (P = 9.26 × 10⁻¹⁰), 1 (P = 9.42 × 10⁻⁷), and 3912-1 (P = 8.29 × 10⁻⁶). These three genes are related to our outcome and participate in the stability and repair of the brain nervous system, which is closely linked to neurodegenerative diseases. Conclusion: The Variational Dropout deep learning model is effective at capturing linear and non-linear effects and consistently controls type I error. Additionally, compared with the general deep learning model, it has lower time consumption and memory usage, which makes the prediction model practical for real-world applications. Finally, we verified some genes associated with neurodegenerative diseases in our real data application. |
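
The Monte Carlo evaluation described in the Methods can be illustrated with a minimal sketch. It is not the study's code. As a stand-in for the Variational Dropout model and SKAT, it uses a simple correlation test based on Fisher's z-transform. The function names, the single-predictor setup, and the values of `n`, `beta`, `alpha`, and `n_rep` are assumptions chosen for illustration. The rejection rate under no effect (`beta = 0`) estimates the type I error, and the rejection rate under a true effect estimates the power.

```python
import math
import random


def fisher_z_pvalue(x, y):
    """Two-sided p-value for zero correlation (Fisher z, normal approximation)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0  # no variation, so no evidence of association
    r = sxy / math.sqrt(sxx * syy)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def rejection_rate(n, beta, alpha=0.05, n_rep=1000, seed=0):
    """Fraction of simulated data sets in which the test rejects at level alpha.

    Hypothetical data model: y = beta * x + noise, with x and noise standard normal.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_rep):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        if fisher_z_pvalue(x, y) < alpha:
            rejections += 1
    return rejections / n_rep


# beta = 0 gives the empirical type I error; beta != 0 gives the empirical power.
type1 = rejection_rate(n=200, beta=0.0)
power = rejection_rate(n=200, beta=0.3)
print(f"empirical type I error: {type1:.3f}")
print(f"empirical power:        {power:.3f}")
```

The same loop applies to any test that returns a p-value: replacing `fisher_z_pvalue` with another procedure and repeating over a grid of sample sizes gives the kind of type I error and power comparison the abstract reports.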