| With the explosive growth of biomedical data, bioinformatics has developed rapidly to uncover the information hidden in these data, and related research has become an active field. Identifying pathogenic genes is a fundamental challenge in human health research, and understanding the association between genotype and disease phenotype requires the analysis of biological networks. Massive amounts of biological data are stored in a variety of databases that lack a unified standard. Biological networks are built on these data, and research on them is also an exploration of complex life activities. The association between disease phenotype and genotype has far-reaching implications for predicting pathogenic genes and for finding the diseases that a gene causes. According to the modularity of disease, functionally related proteins tend to cause similar diseases. Consequently, most disease-gene association methods focus on a computational network that integrates the protein interaction network, the disease phenotype similarity network, and the disease-gene mapping network. Online Mendelian Inheritance in Man (OMIM) is a database of human genetic diseases and related genes. Based on OMIM, we constructed the disease phenotype similarity network and the disease-gene mapping network, and combined them with the protein interaction network to form an integrated heterogeneous network. This paper introduces the random walk with restart algorithm and proposes a new method, YSearch, obtained by improving the web page ranking algorithm TrustRank. First, prior knowledge (the seed set) for the query disease or gene is selected on the constructed network. The TrustRank (TR) score is then computed iteratively by a random walk over the global network. Finally, candidate genes and diseases are ranked by this score to produce predictions. We evaluated the algorithm by cross-validation, using ROC curves and other methods to compare the experimental results and demonstrate its performance. On this basis, we designed and developed the genetic disease search engine platform YSearch. The whole system is built on Spark, an in-memory computing platform, with the data stored in HBase; the system and its optimization are also described. The algorithm and platform presented in this paper can provide new ideas for clinical research such as disease diagnosis and treatment. |
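
The seed selection, iterative scoring, and ranking steps summarized above can be illustrated with a minimal sketch of a generic random walk with restart. This is not the YSearch update rule, which the abstract does not specify. The function name, the restart probability of 0.3, and the toy network (one disease node `D1` and three gene nodes `G1` to `G3`) are assumptions made only for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic random walk with restart on a weighted graph.

    adj     : dict mapping node -> {neighbour: edge weight}; every neighbour
              must also appear as a key of adj
    seeds   : set of seed nodes (the prior knowledge)
    restart : probability of jumping back to the seed set at each step
              (0.3 is an illustrative value, not taken from the paper)
    """
    nodes = list(adj)
    out_weight = {u: sum(adj[u].values()) for u in nodes}
    # Initial distribution: uniform over the seed set.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fixed share of the mass returns to the seeds.
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # A node without edges sends its mass back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart) * p[u] / len(seeds)
                continue
            # Walk term: spread the remaining mass along weighted edges.
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        diff = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Hypothetical toy heterogeneous network: D1 - G1 - G2 - G3.
toy_network = {
    "D1": {"G1": 1.0},
    "G1": {"D1": 1.0, "G2": 1.0},
    "G2": {"G1": 1.0, "G3": 1.0},
    "G3": {"G2": 1.0},
}

scores = random_walk_with_restart(toy_network, seeds={"D1"})
# Rank candidate genes by their steady-state score, highest first.
ranked_genes = sorted(
    (n for n in scores if n.startswith("G")),
    key=lambda n: scores[n],
    reverse=True,
)
print(ranked_genes)
```

In this sketch, genes closer to the seed disease receive higher scores, which is the intuition behind ranking candidates by their proximity to known associations in the integrated network.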