Research On The Alignment Algorithm Of Multiple Biological Networks Based On Topology And Sequence

Posted on: 2022-03-26  Degree: Master  Type: Thesis
Country: China  Candidate: J Huang
GTID: 2480306527977849  Subject: Computer technology
Abstract/Summary:
The development of high-throughput technology has led to exponential growth in biological data, and the demand for biological data analysis is increasing. Hidden information in a network can be discovered by searching, aligning and clustering according to the network structure, which is of great significance to life science research. Proteins, as the material basis of all life, participate in all aspects of the life process. Analyzing protein-protein interaction networks across species is important for understanding protein function, evolutionary relationships and disease mechanisms. Although protein-protein interaction networks are currently available for many species, the function of some proteins in many species is still unknown, and transferring biological knowledge through alignment is one approach to this problem. Analogously to sequence alignment, which determines similarity by comparing sequence fragments of different proteins, the alignment of protein-protein interaction networks obtains the similarity of proteins by comparing protein spatial structure across species. It is generally assumed that proteins with similar network structure or high sequence similarity are more likely to have the same function. Moreover, experiments reported in the literature show that alignments generated with added sequence similarity information are of higher quality and have better biological function consistency. This thesis therefore generates alignments by combining the topological structure information and the sequence similarity information of proteins. As the number of aligned networks increases, the computational difficulty and complexity of the algorithm also increase. In order to build an efficient alignment algorithm, the main work of this thesis is as follows:

(1) With the aim of building alignments efficiently, we propose a global many-to-many multiple network alignment algorithm called ACCMNA, which is based on the seed-and-extend framework. The topological quality of the alignment is improved by considering the degree information of the network topology. A k-partite weighted similarity graph is constructed from the sequence similarity information, and a clustering method is used to search for the candidate cluster with the largest weighted sum, which improves the biological quality of the alignments. In each iteration, the nodes that have already been aligned are used as seeds for an extended search among their neighbor nodes, and a greedy algorithm generates one candidate cluster. Experimental results on real and synthetic network datasets show that the alignments generated by ACCMNA achieve good topological conservation and biological consistency.

(2) To address the problem of local optima in the alignment process, this thesis proposes SAMNA, a one-to-one multiple network alignment algorithm based on simulated annealing. The algorithm also combines topological structure and sequence similarity information, and constructs candidate clusters by searching for the maximum weighted clique in the k-partite weighted similarity graph built from the sequence similarity information. It improves the state update method of the simulated annealing component of the NetCoffee algorithm, so that nodes in different networks are connected more closely. Results on different datasets show that SAMNA produces alignments with higher biological consistency, and that when aligning two networks, its alignments are also superior to those of pairwise network alignment in terms of biological quality.

(3) To address the weakness of SAMNA in topological quality, an improved version, SE-SAMNA, is proposed by integrating the seed-and-extend strategy into SAMNA. The seed-and-extend strategy is incorporated into the state transition of the simulated annealing process. The candidate clusters generated in the previous iteration are used as seeds and extended among the neighbor nodes of the seed nodes, and a candidate cluster containing such a neighbor is randomly selected for the state update. This increases the ratio of conserved edges during iteration and improves the topological quality of the alignment. Experimental results on synthetic and real networks show that SE-SAMNA significantly improves the topological quality of the alignment without sacrificing too much biological quality.
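
The idea of scoring an alignment by both topology and sequence similarity, and escaping local optima with simulated annealing, can be illustrated with a minimal sketch. It is not the SAMNA or SE-SAMNA implementation: it assumes only two networks, a one-to-one mapping, a swap move as the state update, and a geometric cooling schedule. The names `score` and `anneal`, the weight `alpha`, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def score(mapping, edges_a, edges_b, seq_sim, alpha=0.5):
    """Assumed objective: alpha * conserved edges + (1 - alpha) * sequence similarity."""
    edge_set_b = {frozenset(e) for e in edges_b}
    # An edge (u, v) of network A is conserved if its image is an edge of network B.
    conserved = sum(
        1 for u, v in edges_a
        if frozenset((mapping[u], mapping[v])) in edge_set_b
    )
    # Pairs missing from seq_sim are treated as having zero similarity.
    sequence = sum(seq_sim.get((u, x), 0.0) for u, x in mapping.items())
    return alpha * conserved + (1 - alpha) * sequence


def anneal(nodes_a, nodes_b, edges_a, edges_b, seq_sim, alpha=0.5,
           t_start=1.0, t_end=0.01, cooling=0.95, steps_per_t=20, seed=0):
    """Simulated annealing over one-to-one mappings (illustrative only)."""
    rng = random.Random(seed)
    current = dict(zip(nodes_a, nodes_b))  # arbitrary initial alignment
    cur_score = score(current, edges_a, edges_b, seq_sim, alpha)
    best, best_score = dict(current), cur_score
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            # Proposed state: swap the partners of two randomly chosen nodes.
            u, v = rng.sample(nodes_a, 2)
            current[u], current[v] = current[v], current[u]
            new_score = score(current, edges_a, edges_b, seq_sim, alpha)
            delta = new_score - cur_score
            # Metropolis rule: always accept improvements, sometimes accept
            # worse states so the search can leave a local optimum.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                cur_score = new_score
                if cur_score > best_score:
                    best, best_score = dict(current), cur_score
            else:
                # Rejected: undo the swap.
                current[u], current[v] = current[v], current[u]
        t *= cooling  # geometric cooling
    return best, best_score


# Toy data (made up): two path graphs with matching high-similarity pairs.
nodes_a = ["a1", "a2", "a3", "a4"]
nodes_b = ["b3", "b1", "b4", "b2"]
edges_a = [("a1", "a2"), ("a2", "a3"), ("a3", "a4")]
edges_b = [("b1", "b2"), ("b2", "b3"), ("b3", "b4")]
seq_sim = {("a1", "b1"): 0.9, ("a2", "b2"): 0.9,
           ("a3", "b3"): 0.9, ("a4", "b4"): 0.9}

best, best_score = anneal(nodes_a, nodes_b, edges_a, edges_b, seq_sim)
print(best, best_score)
```

In this sketch a seed-and-extend variant would change only the proposal step, for example by restricting the proposed partners to neighbors of nodes that are already aligned; how SE-SAMNA actually does this is not specified beyond the description above.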
Keywords/Search Tags:Complex network, Multiple network alignment, Protein-protein interaction network, Simulated annealing, Seed-and-extend