With improvements in medical care and living standards, many simple diseases have been effectively controlled, while complex diseases have become the primary threat to human health. Complex diseases are often caused by multiple genes of small effect acting together with environmental factors, and they are characterized by genetic heterogeneity, ethnic differences and phenotypic complexity. In addition, the development of high-throughput sequencing technology, biochip technology and nanotechnology has generated a large amount of genetic data, such as Single Nucleotide Polymorphism (SNP) data. However, because SNP data are high-dimensional and sample sizes are small, research in this area faces challenges such as the "curse of dimensionality" and "combinatorial explosion". In this thesis, we improve the ant colony algorithm to identify epistatic interactions in genome-wide biological big data, which is of great significance for further understanding the mechanisms by which complex diseases arise and develop. Furthermore, the research lays a foundation for risk assessment, treatment and drug development for complex diseases. The main contributions of this thesis are outlined below:

(1) An improved ant colony optimization algorithm is proposed for identifying epistasis. Firstly, a threshold is introduced into the search strategy to divide the probability selection function into two parts, i.e., a stochastic path selection strategy and a probabilistic path selection strategy, which on the one hand increases the diversity of decision rules and on the other hand controls the convergence speed of the algorithm. Secondly, a multi-objective optimization function is proposed to evaluate the associations between SNP-SNP combinations and the phenotype, which improves the accuracy of the algorithm. The algorithm is applied to real data sets, and the results demonstrate that it is promising for identifying epistasis and can identify more true SNP interaction pairs.

(2) An adaptive ant colony optimization algorithm is proposed for identifying epistasis. Firstly, to reduce the number of parameters that must be set, an adaptive adjustment parameter is introduced, which effectively controls the convergence speed and improves the processing capability of the algorithm. Secondly, a memory-based strategy and a post-processing strategy are designed to process the optimal solutions identified by the algorithm, making the identification of epistasis more accurate and less time-consuming. The algorithm identifies many SNP-SNP interaction pairs on real data sets, which may provide more directions for further research on complex diseases.

(3) A heuristic ant colony optimization algorithm is proposed for identifying epistasis. Firstly, heuristic information is introduced to guide the search in linear time, which improves the algorithm's ability to recognize disease models. In addition, an adaptive pheromone update strategy is designed to balance the pheromone concentration across SNPs. Experiments verify the stability and superiority of the algorithm and indicate that the final results have important biological significance.
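
The threshold-based search strategy described in contribution (1) can be illustrated with a minimal Python sketch. The abstract does not give the selection rule itself, so everything here is an assumption made for illustration: the function name `select_snp`, the parameter `threshold`, the use of a uniform draw as the stochastic strategy, and a pheromone-proportional draw as the probabilistic strategy. It is not the thesis's actual formulation.

```python
import random


def select_snp(pheromone, threshold, rng=random):
    """Return the index of the next SNP chosen by one ant.

    Hypothetical rule (an assumption, not taken from the thesis):
    - with probability `threshold`, pick uniformly at random
      (stochastic path selection);
    - otherwise, pick with probability proportional to the pheromone
      level of each SNP (probabilistic path selection).
    """
    if not pheromone:
        raise ValueError("pheromone must contain at least one SNP")
    n = len(pheromone)
    if rng.random() < threshold:
        # Stochastic branch: every SNP is equally likely.
        return rng.randrange(n)
    # Probabilistic branch: roulette-wheel selection on pheromone.
    return rng.choices(range(n), weights=pheromone, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    levels = [0.1, 0.1, 5.0, 0.1]  # made-up pheromone levels
    picks = [select_snp(levels, threshold=0.2, rng=rng) for _ in range(10)]
    print(picks)
```

Under these assumptions, a larger `threshold` makes ants explore more uniformly and slows convergence, while a smaller one concentrates the search on SNPs with high pheromone, which is one way to read the abstract's trade-off between diversity of decision rules and convergence speed.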