| Network alignment is widely used in biological networks, especially in protein-protein interaction (PPI) networks. PPI networks are complex networks composed of proteins and their interactions, reflecting the functional connections and regulatory relationships between proteins. Analyzing different PPI networks by network alignment can reveal similarity relationships and common features among species, leading to a better understanding of intermolecular interactions and regulatory mechanisms within organisms. Network alignment can also help analyze evolutionary relationships among species, construct phylogenetic trees, and predict protein structures. However, finding similar proteins in different PPI networks is not a simple task. Existing studies usually rely on the topological information of networks to maximize the similarity of results, but the complexity and diversity of PPI networks make it difficult for existing algorithms to discover similar structures among different networks. In addition, the complexity of biological processes leads to problems such as missing or noisy data in the real-world networks extracted by biological experiments, which can also seriously affect the results of the algorithms. To address these problems, this paper proposes three new biological network alignment algorithms, EmbAlign, NABGA, and NGAlign. The detailed work is as follows: 1. Existing algorithms consider only part of the network structure when calculating topological similarity and do not preserve the more comprehensive features of the network. In addition, the results generated by existing algorithms show large differences in topological and biological quality. To address these problems, this paper proposes EmbAlign, a network alignment algorithm that fuses multilayer neighborhood features and word embedding models. To better measure the global topological similarity of each node, the algorithm first extracts the neighborhood information of the 
nodes in the network hierarchically, and then uses a word embedding model to embed the nodes into a low-dimensional vector space. Meanwhile, to improve the biological significance of the results, EmbAlign incorporates sequence information to guide the matching process. Experimental results on synthetic and real networks show that EmbAlign captures the topological characteristics of nodes more effectively than existing network alignment algorithms, thus improving the accuracy of the alignment. 2. EmbAlign is inefficient at generating node embeddings and, like existing algorithms, spends a lot of time in the matching phase. To address these problems, this paper proposes NABGA, a network alignment algorithm based on graph convolutional networks (GCN) and graph compression. Unlike EmbAlign, which uses random walks and a word embedding model, NABGA uses a GCN to model the complex topological relationships between nodes and can quickly generate high-quality node embeddings. Meanwhile, to improve the distinguishability of nodes and reduce the influence of noise, NABGA incorporates a structural loss function and a noise loss function in the training process. In addition, NABGA compresses the networks before matching them, which greatly improves alignment efficiency. Experiments show that NABGA generates better results than both existing algorithms and EmbAlign, especially in terms of preserving homology relationships and running time. 3. Existing algorithms often combine the topological and sequence information of the network linearly to optimize the similarity score. However, the topology and sequences of the network may be independent of or even in conflict with each other. Therefore, this paper proposes NGAlign, a pairwise PPI network alignment algorithm based on a graph autoencoder. This algorithm uses a graph autoencoder to generate embeddings for each individual species network, preserving the global and local topology of the network. Also, to enable the generated embeddings to retain 
genetic homology information between species, NGAlign uses sequence similarity to select nodes in different networks for cross-training. In addition, NGAlign includes a refinement operation in the matching phase that improves the accuracy of the results by increasing the neighborhood consistency of the matched nodes. Experiments on real networks demonstrate that NGAlign generates better alignment results than existing algorithms and outperforms EmbAlign and NABGA in terms of topology metrics. |
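
The linear combination of topological and sequence similarity that the abstract attributes to existing algorithms can be illustrated with a minimal sketch. It is not the implementation of EmbAlign, NABGA, or NGAlign. The function names (`cosine`, `greedy_align`), the weight `alpha`, the toy embeddings, and the greedy one-to-one matching step are all assumptions introduced here for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_align(emb_a, emb_b, seq_sim, alpha=0.5):
    """Greedy one-to-one alignment of nodes in network A to nodes in network B.

    emb_a, emb_b: dict mapping node -> embedding vector (hypothetical inputs).
    seq_sim: dict mapping (node_a, node_b) -> sequence similarity in [0, 1];
             pairs that are absent count as 0.0.
    alpha: weight of the topological term (assumed parameter).
    """
    scored = []
    for a, vec_a in emb_a.items():
        for b, vec_b in emb_b.items():
            # Linear combination of topological and sequence similarity.
            topo = cosine(vec_a, vec_b)
            score = alpha * topo + (1 - alpha) * seq_sim.get((a, b), 0.0)
            scored.append((score, a, b))

    # Take the highest-scoring pairs first, skipping nodes already matched.
    scored.sort(key=lambda item: item[0], reverse=True)
    used_a, used_b, alignment = set(), set(), {}
    for _, a, b in scored:
        if a not in used_a and b not in used_b:
            alignment[a] = b
            used_a.add(a)
            used_b.add(b)
    return alignment


# Toy data: two proteins per network, with made-up embeddings and scores.
emb_a = {"a1": [1.0, 0.0], "a2": [0.0, 1.0]}
emb_b = {"b1": [1.0, 0.0], "b2": [0.0, 1.0]}
seq_sim = {("a1", "b1"): 0.9, ("a2", "b2"): 0.8}
alignment = greedy_align(emb_a, emb_b, seq_sim, alpha=0.5)
```

With a single weight `alpha`, a pair that is topologically similar but unrelated in sequence can still outrank a true homolog, which is the kind of conflict between the two signals that the abstract cites as motivation for NGAlign.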