| With the development of SNP detection technology, genome-wide association studies (GWAS) have become a powerful tool for studying complex human diseases. The basic principle of the traditional GWAS method is simple: based on selected case and control samples, statistical methods such as the chi-square test, the t-test, Fisher's exact test, and logistic regression analysis evaluate the statistical significance of the association between a single SNP and the disease. However, single SNPs cannot fully explain the pathogenesis of complex human diseases. Although GWAS have been used to screen for susceptibility genes, missing heritability has been observed. One possible source is that the risk conferred by certain SNPs may vary significantly in the presence of another risk factor, a phenomenon known as epistasis. Detecting high-order epistasis is therefore important for analyzing the occurrence of complex human diseases and explaining missing heritability, and it has significance for the detection, prevention, and treatment of diseases. However, high-order epistasis detection faces various practical challenges, such as the small-sample-size problem and the diversity of disease models. This paper proposes a multi-objective genetic algorithm (EpiMOGA) for single nucleotide polymorphism (SNP) epistasis detection. The new method has two main innovations. First, the single-objective search mode of the traditional genetic algorithm was changed to a multi-objective search mode: the K2 score, based on the Bayesian network criterion, and the Gini index, a measure of diversity in the binary classification problem, were used to guide the search process of the genetic algorithm. Second, to reduce the dependence of the genetic algorithm on the initial population, the search process was modified and multiple initial populations were used, which effectively reduces the risk of becoming trapped in local optima. The method was implemented in MATLAB. Experiments were performed on 31 simulated datasets of different models and a real Alzheimer's disease dataset. The results indicated that EpiMOGA clearly outperformed other related methods in both detection efficiency and accuracy, especially on small-sample-size datasets, and that its performance remained stable across datasets of different disease models. In addition, a number of SNP loci, 2-order epistasis, and 3-order epistasis associated with Alzheimer's disease were identified by EpiMOGA, indicating that the method is capable of identifying high-order epistasis from genome-wide data and can be applied in the study of complex diseases. |
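
The abstract names the two objectives that guide the search but does not state their formulas. The following is a minimal sketch, assuming the log-form K2 score and the genotype-weighted Gini index as they are commonly defined in the epistasis-detection literature, where lower values of both indicate a stronger association. It is written in Python rather than the MATLAB used in the paper, and the function names (`genotype_counts`, `k2_score`, `gini_index`, `dominates`) are hypothetical, not taken from the paper's implementation.

```python
from math import lgamma


def genotype_counts(genotypes, labels, snps):
    """Count controls (label 0) and cases (label 1) for every observed
    genotype combination of the selected SNP columns."""
    counts = {}
    for row, y in zip(genotypes, labels):
        key = tuple(row[s] for s in snps)
        cell = counts.setdefault(key, [0, 0])
        cell[y] += 1
    return counts


def k2_score(counts):
    """Assumed log-form K2 score for two classes:
    sum over combinations of log((r+1)!) - log(r0!) - log(r1!).
    lgamma(n + 1) equals log(n!). Lower is better."""
    score = 0.0
    for controls, cases in counts.values():
        r = controls + cases
        score += lgamma(r + 2) - lgamma(controls + 1) - lgamma(cases + 1)
    return score


def gini_index(counts):
    """Assumed Gini index: impurity of each genotype combination,
    weighted by the fraction of samples it holds. Lower is better."""
    n = sum(a + b for a, b in counts.values())
    total = 0.0
    for a, b in counts.values():
        r = a + b  # r > 0 because only observed combinations are stored
        total += (r / n) * (1.0 - (a / r) ** 2 - (b / r) ** 2)
    return total


def dominates(a, b):
    """Pareto dominance for minimisation: a is no worse than b on every
    objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def objectives(genotypes, labels, snps):
    """Return the (K2, Gini) objective pair for one candidate SNP set."""
    counts = genotype_counts(genotypes, labels, snps)
    return (k2_score(counts), gini_index(counts))
```

Under these assumptions, a multi-objective search would keep a candidate SNP combination only if no other candidate dominates it on the (K2, Gini) pair, instead of ranking candidates by a single score.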