
Research On Epistasis Detection Based On ACO And Random Forest

Posted on: 2019-02-24  Degree: Master  Type: Thesis
Country: China  Candidate: D Wu  Full Text: PDF
GTID: 2428330548458932  Subject: Computer application technology
Abstract/Summary:
With the rapid development of modern biotechnology, especially gene-sequencing technology, biology has entered a new stage of vigorous growth since the 1990s. One landmark event was the international Human Genome Project, known as the "moon shot" of the life sciences, which was launched in 1990; sequencing of the human genome was formally completed by the end of 2003. Driven by advances in both computer science and biology, bioinformatics, which uses computers to process data and study biological problems, has boomed over the past 10 years.

A genome-wide association study (GWAS) generally searches for single nucleotide polymorphisms (SNPs) associated with complex diseases. GWAS has already borne fruit; however, for most diseases it accounts for only a small fraction of heritability. The genetic factors behind many diseases and traits remain unidentified, a problem known as missing heritability. One possible cause is that the additive model used in standard GWAS cannot capture gene interactions. The additive model assumes that each genetic variant contributes to a complex disease independently. In practice, this assumption may not hold, because epistasis exists between genes.

Building on a review of previous algorithms, we propose ANTRF, a method for detecting epistasis based on ant colony optimization (ACO) and random forest. ANTRF takes the out-of-bag (OOB) score of a random forest as the evaluation criterion for an SNP set, because an SNP set that contains disease-causing SNPs can distinguish the diseased population from the normal population. Random forests have been widely and successfully used to detect epistasis, but some researchers have questioned their limitations. ANTRF integrates random forests into ACO to avoid these limitations while exploiting their advantages. We adopt ACO as the algorithmic framework, with path selection and path evaluation strategies suited to this problem.

Good heuristic information also improves ACO. We therefore put forward a method named SNPRANK for generating heuristic information. Its output can be used both for feature selection and for the ants' path selection. Experiments show that SNPRANK is effective. To improve time efficiency and detection precision, we first use SNPRANK to filter out some of the noisy SNPs so that they do not interfere with the search, and then merge the heuristic information generated by SNPRANK into the ACO framework. Experiments show that the resulting algorithm is effective. In future work, we will improve the robustness and performance of the algorithm by studying better heuristic information and local search algorithms.
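
The ACO loop summarized in the abstract (ants select SNP sets, each set is scored, and the scores guide later ants) might look roughly like the sketch below. This is a generic illustration, not the thesis's ANTRF implementation. The function names, the parameters `alpha`, `beta`, and `rho`, and the pheromone update rule are all assumptions. `toy_score` is a hypothetical stand-in for the random-forest score, and a uniform heuristic vector stands in for SNPRANK output.

```python
import random

def select_snp_set(pheromone, heuristic, k, alpha=1.0, beta=1.0, rng=random):
    """One ant picks k distinct SNP indices.

    Each remaining SNP i is drawn with probability proportional to
    pheromone[i]**alpha * heuristic[i]**beta (a common ACO rule, assumed here).
    """
    candidates = list(range(len(pheromone)))
    chosen = []
    for _ in range(k):
        weights = [(pheromone[i] ** alpha) * (heuristic[i] ** beta)
                   for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    return sorted(chosen)

def aco_search(n_snps, score_fn, heuristic, k=2, n_ants=20, n_iters=30,
               rho=0.1, seed=0):
    """Search for the best-scoring SNP set of size k.

    score_fn maps a list of SNP indices to a non-negative score; in the
    abstract's setting this role is played by a random-forest score.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_snps
    best_set, best_score = None, float("-inf")
    for _ in range(n_iters):
        results = []
        for _ in range(n_ants):
            snp_set = select_snp_set(pheromone, heuristic, k, rng=rng)
            results.append((score_fn(snp_set), snp_set))
        # Evaporate old pheromone, then deposit in proportion to each score.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for score, snp_set in results:
            for i in snp_set:
                pheromone[i] += score
            if score > best_score:
                best_score, best_set = score, snp_set
    return best_set, best_score

def toy_score(snp_set):
    """Hypothetical scorer: SNPs 3 and 7 are the planted interacting pair."""
    hits = len({3, 7} & set(snp_set))
    return {0: 0.1, 1: 0.3, 2: 1.0}[hits]

if __name__ == "__main__":
    uniform_heuristic = [1.0] * 10  # placeholder for SNPRANK-style weights
    print(aco_search(10, toy_score, uniform_heuristic, k=2))
```

In a real setting, `score_fn` would train a classifier on the genotype columns for the selected SNPs, and `heuristic` would carry per-SNP relevance weights so that ants favor promising SNPs from the first iteration.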
Keywords/Search Tags: Epistasis, ACO, Random Forest, ReliefF, complex disease