As global genome projects have advanced, proteome research has become a core task in exploring the mysteries of life. Studies of protein-protein interaction (PPI) help reveal the essence of life processes and are of great significance for developing new drugs and understanding disease mechanisms. In recent years, computational prediction methods such as gene ontology annotation, phylogenetic analysis, and gene fusion have emerged one after another, but their accuracy depends largely on the reliability of prior knowledge and the effectiveness of feature extraction. With the rise of deep learning, methods based on neural networks have followed; because deep learning has strong feature abstraction capabilities, such models can extract features from data automatically. Analyzing and interpreting high-throughput data with deep learning is therefore better suited to research needs. Traditional PPI prediction algorithms rely solely on feature engineering, which makes it difficult to extract protein features effectively. In addition, the large scale and high noise of interaction data lead to a high false-positive rate in the prediction results. In response to these problems, this paper proposes three deep learning methods for protein interaction network (PIN) prediction. The main work consists of the following three parts:

(1) To address incomplete feature extraction, large data scale, and high noise, a deep learning model is proposed that combines multiple protein sequence representation techniques with a multi-head attention mechanism. The model first learns a variety of protein sequence representations through the sequence representation techniques and a set of primary learners. The multi-head attention mechanism explores the deeper connections within the interaction relationships, and ensemble learning then combines the primary learners and comprehensively scores their outputs. This reinforces the model's correct predictions and thus improves performance. Experiments on a series of public data sets show that the model outperforms previous methods.

(2) To address the difficulty of extracting topological features from the network structure, a deep learning model with dual sequence and network features is proposed. It first extracts protein sequence features and then embeds this essential biological information into the network topology, from which it further extracts structural features of the interaction network in order to predict the network. The model is further improved by using it as the primary learner of an ensemble model, with a simple deep neural network as the key ensemble strategy. Experiments on a series of public data sets show the superiority of the model.

(3) To address the conflicts that arise in the previously proposed model when fusing protein sequence features with network structure features, a model based on an adversarial loss function (ADLF) is proposed. It improves the previous model's network structure by introducing a similarity index of the network topology as the adversarial factor in ADLF, which regulates the weights of the loss terms for the two kinds of features. Experiments show that the proposed model has a more stable training process and better performance than models trained with the traditional cross-entropy loss function or the focal loss function.
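
The weighting idea in part (3) can be illustrated with a small sketch. The abstract does not give the ADLF formula or name the similarity index, so everything below is an assumption made for illustration: the Jaccard index over neighbor sets stands in for the topological similarity index, a simple linear blend of two binary cross-entropy terms stands in for the loss, and all function names are hypothetical. It is not the paper's definition of ADLF.

```python
import math


def jaccard_similarity(neighbors_u, neighbors_v):
    """Jaccard index of two neighbor sets (an assumed similarity index)."""
    union = neighbors_u | neighbors_v
    if not union:
        return 0.0
    return len(neighbors_u & neighbors_v) / len(union)


def binary_cross_entropy(y_true, p, eps=1e-12):
    """Binary cross-entropy for one sample; p is clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def weighted_dual_loss(y_true, p_seq, p_net, similarity):
    """Blend the sequence-branch and network-branch losses.

    Assumption: the similarity index (in [0, 1]) weights the network
    branch and its complement weights the sequence branch.
    """
    loss_seq = binary_cross_entropy(y_true, p_seq)
    loss_net = binary_cross_entropy(y_true, p_net)
    return (1.0 - similarity) * loss_seq + similarity * loss_net


# Toy interaction network: neighbor sets of two hypothetical proteins.
adjacency = {
    "A": {"C", "D", "E"},
    "B": {"C", "D", "F"},
}

similarity = jaccard_similarity(adjacency["A"], adjacency["B"])
loss = weighted_dual_loss(1, p_seq=0.9, p_net=0.6, similarity=similarity)
print(similarity, loss)
```

In this toy network the two proteins share two of four distinct neighbors, so the similarity is 0.5 and both branches contribute equally to the loss. A pair with no shared neighbors would be scored by the sequence branch alone.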