Research On The Algorithm Of Mining Biological Network Disease Module Based On Topology And Semantic Similarity

Posted on: 2020-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: H L Zhu
GTID: 2370330575954498
Subject: Computer Science and Technology
Abstract/Summary:
Identifying disease modules in biological networks has attracted considerable attention, since accurate prediction helps to explain the pathogenesis of complex diseases and facilitates disease diagnosis and therapy. The study of interactions between human proteins has gradually become one of the most effective ways to reveal the mechanisms behind complex diseases, but existing protein interaction data still contain many missing and incorrect interactions. Many disease module mining algorithms therefore use other biological or topological data to adjust the protein network before mining, but these methods do not account for the missing and erroneous interactions in the protein network itself. This thesis aims to identify disease modules more accurately by effectively combining multiple biological data resources. The main research work of this thesis is as follows:

(1) This thesis proposes a method for identifying disease modules on a protein network based on connectivity and semantic similarity (termed IDMCSS). First, the topological similarity and semantic similarity between candidate proteins and disease proteins are used to add likely missing protein interactions and delete likely incorrect ones, thereby adjusting the existing protein network structure. Then, on the adjusted network, the candidate protein with the largest sum of topological similarity and semantic similarity is added to the module, and this expansion repeats until the expanded candidate protein set is no longer significantly enriched for biological information. The network adjustment strategy runs through the whole algorithm: before each expansion step, the local structure of the network is adjusted using the topological and semantic similarity between the candidate proteins and the disease proteins. In the experimental part, IDMCSS is compared with other algorithms on an asthma dataset, and the results demonstrate the validity of the IDMCSS algorithm.

(2) This thesis proposes an algorithm for identifying disease modules on a bilayer network based on topological, semantic and phenotypic similarity (termed IDMCSPS). Building on IDMCSS, this thesis replaces the protein interaction network with a constructed protein-phenotype network, and makes effective use of protein interaction data, phenotypic similarity data and protein-phenotype association data to mine disease modules. First, a protein-phenotype bilayer network is constructed. Then, collaborative filtering is used to add protein-phenotype relationships, and the existing protein network structure is adjusted using topological similarity and semantic similarity. Finally, on the adjusted bilayer network, the algorithm calculates the topological and semantic similarity between the candidate proteins and the disease proteins, together with the phenotypic similarity between the candidate proteins and diseases similar to the disease under study. Candidate proteins with the highest sum of topological, semantic and phenotypic similarity are added to the disease module until the expanded candidate protein set is no longer significantly enriched for biological information. The bilayer network adjustment strategy runs through the entire algorithm and adjusts the bilayer network before each expansion step. In the experimental part, IDMCSPS is compared with several disease module mining algorithms on the asthma dataset. The results show that the disease module extracted by IDMCSPS is significantly enriched for biological information related to asthma and that, compared with the module mined by IDMCSS, more scattered known disease proteins are expanded into the disease module. Because the existing protein network contains many erroneous and missing interactions, the topological similarity of some disease proteins is relatively low; integrating phenotypic similarity data improves the ranking of these disease proteins.
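
The adjust-then-expand loop described for IDMCSS can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and everything specific in it is an assumption: the Jaccard index of closed neighbourhoods stands in for the unspecified topological similarity, `sem` is a hypothetical lookup of precomputed semantic similarities, `add_at` and `drop_at` are made-up thresholds, and `still_enriched` is a caller-supplied placeholder for the enrichment significance test.

```python
def jaccard(adj, u, v):
    """Stand-in topological similarity: Jaccard index of closed neighbourhoods."""
    nu = adj.get(u, set()) | {u}
    nv = adj.get(v, set()) | {v}
    return len(nu & nv) / len(nu | nv)


def combined(adj, sem, u, v):
    """Sum of topological and semantic similarity (missing semantic pairs count as 0)."""
    return jaccard(adj, u, v) + sem.get(frozenset((u, v)), 0.0)


def adjust(adj, sem, module, add_at, drop_at):
    """Add likely-missing and drop likely-wrong edges between outside proteins and the module.

    Decisions are collected first and applied afterwards, so the result does
    not depend on iteration order. Thresholds are illustrative assumptions.
    """
    to_add, to_drop = [], []
    for p in adj:
        if p in module:
            continue
        for d in module:
            score = combined(adj, sem, p, d)
            if score >= add_at:
                to_add.append((p, d))
            elif score <= drop_at:
                to_drop.append((p, d))
    for p, d in to_add:
        adj[p].add(d)
        adj[d].add(p)
    for p, d in to_drop:
        adj[p].discard(d)
        adj[d].discard(p)


def expand_module(adj, sem, seeds, still_enriched, add_at=1.5, drop_at=0.05):
    """Greedy expansion: adjust the network, add the best-scoring neighbour, repeat."""
    adj = {n: set(nb) for n, nb in adj.items()}  # work on a copy
    for s in seeds:
        adj.setdefault(s, set())
    module = set(seeds)
    while True:
        adjust(adj, sem, module, add_at, drop_at)  # adjustment precedes every expansion
        candidates = {n for d in module for n in adj[d]} - module
        if not candidates:
            break
        # Candidate with the largest summed similarity to the current module;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda p: sum(combined(adj, sem, p, d) for d in module))
        if not still_enriched(module | {best}):
            break  # placeholder for the enrichment-based stopping rule
        module.add(best)
    return module
```

A caller would pass a symmetric adjacency dictionary (protein to set of neighbours), the seed disease proteins, and a predicate that reports whether a proposed module is still significantly enriched. Extending the score in `combined` with a phenotypic term would be one way to approximate the IDMCSPS variant, again only as a sketch.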
Keywords/Search Tags:Disease module, Topological similarity, Semantic similarity, Phenotypic similarity, Protein interaction network, Bilayer network