| In recent years, the rising incidence of cancer has meant that it is no longer a rare disease, and complex diseases such as cancer have become an important subject of biomedical research. Predicting disease genes with bioinformatics methods helps to uncover the pathogenesis of these complex diseases, which in turn supports drug development. The field of bioinformatics has therefore paid increasing attention to disease gene prediction. Researchers have integrated massive amounts of biological data into a variety of databases, and analysis of the interactions between genes is urgently needed. However, current database-based disease gene prediction algorithms cannot satisfy this need. This thesis therefore focuses on constructing a gene similarity network and proposes a network aggregation algorithm that builds the network from the relationship between gene expression and DNA methylation. A disease gene prediction algorithm is then given on top of this network. Together, the network construction method and the prediction algorithm are used to predict disease genes and to provide support for treatment of the disease. The thesis first describes the significance and development of disease gene prediction algorithms, summarizes the relevant network concepts and the existing disease gene prediction algorithms, and analyzes the topological characteristics of the gene similarity network and the biological information related to disease genes. Next, typical disease gene prediction algorithms are studied. Finally, a network construction algorithm that aggregates gene similarity networks is proposed, together with a disease gene prediction algorithm that improves prediction performance. (1) To predict disease genes, a gene similarity network must first be constructed. Through an analysis of gene similarity networks, this thesis proposes a similarity network 
aggregation algorithm. First, two similarity networks are constructed, one from gene expression and one from DNA methylation. These two networks are then aggregated. The nodes of the resulting network represent genes, and each edge represents the superposition of the interaction relationships derived from gene expression and DNA methylation. To date, this network construction algorithm has not been applied to building gene similarity networks. The experimental results show that the gene similarity network obtained by the aggregation algorithm has favorable topological characteristics, which indicates that the network is dense and can be used to predict disease genes effectively. (2) Next, building on a study of the MCL algorithm and multiple sources of biological information, a disease gene prediction algorithm based on MCL is proposed. The MCL algorithm is used to cluster the network into modules, and a gene scoring method then calculates a score for each candidate gene to complete the prediction. By scoring genes on top of the MCL clustering, the algorithm comprehensively measures the correlation between a candidate gene and the disease. The experimental results show that disease genes are more likely to be concentrated in high-scoring modules. Moreover, the algorithm still performs well when part of the prior knowledge is lost. A portion of the initial gene set is screened from the OMIM database and the PPI network as experimental data. Under leave-one-out cross-validation (LOO-CV), the AUC value remains above 0.6 when 50% of the prior knowledge is lost. Compared with the RWR algorithm, the algorithm proposed in this thesis predicts disease genes more accurately. |
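
The clustering step in contribution (2) relies on the Markov Cluster (MCL) algorithm. As a rough illustration of how MCL splits a network into modules, the following is a minimal pure-Python sketch of the standard expansion and inflation loop. It is not the thesis's implementation: the function names, the inflation value of 2.0, the added self-loops, the convergence tolerance, and the toy six-gene network are all assumptions made for illustration, and the thesis's gene scoring step is not shown because the abstract does not specify it.

```python
def normalize_columns(m):
    """Scale each column to sum to 1, giving a column-stochastic matrix."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph with the basic MCL iteration.

    adjacency: square list of lists of non-negative edge weights.
    Returns a sorted list of modules, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Assumption: add self-loops, a common MCL convention.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: take a two-step random walk
        # Inflation: raise entries to a power, then renormalize columns,
        # which strengthens strong flows and weakens weak ones.
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow is an attractor; the columns it
    # reaches form one module.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy network: two gene triangles joined by one edge (2-3).
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
modules = mcl(toy)
print(modules)
```

In this sketch a larger inflation value tends to produce more and smaller modules. A scoring step such as the one described in the abstract would then operate on the returned modules.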