
Research On Multimodal Compound Protein Interaction Prediction Combining Interaction Network Information

Posted on: 2022-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: X X Yu
GTID: 2504306773481194
Subject: Automation Technology
Abstract/Summary:
Prediction of compound-protein interaction (CPI) is a major topic in drug discovery. During drug development, candidates that can bind target proteins must be screened out of large compound libraries. Performing this step with chemical experiments is far more expensive than using computational methods. As experimental techniques continue to develop, large amounts of biological data are being generated, and computer modeling can use these data to screen compounds quickly and effectively. Improving CPI prediction therefore narrows the compound search space and plays a crucial role in drug development.

To address the problem that the receptive field of existing compound representation methods is limited by node topology, this paper proposes a new graph neural network model. A graph convolution layer first integrates the neighborhood features of the compound graph, and a multi-head self-attention layer then extracts global information from the feature vectors of all nodes. Residual connections between layers alleviate the information loss caused by increasing the number of layers. The model goes beyond compound-graph feature extraction based on topological distance: through a global attention mechanism, every atom node takes part in the computation for every other atom node in the compound graph, so implicit connections between distant atoms can still contribute. This gives the model stronger representational ability than existing methods.

To address the problem that existing CPI prediction models fail to effectively integrate semantic features with the topological information in interaction networks, this paper proposes a new binary classification model. The graph neural network described above extracts compound graph features, and the pre-trained model ELMo extracts protein sequence features. The degree of each network node is explicitly extracted from the compound-protein interaction network and added to the initial features as a centrality encoding. The two feature matrices are then passed to a cross-attention module for information fusion, with the correlation encoding of network nodes serving as the bias term in that module. A feedforward layer further extracts interaction features, and residual connections and regularization between layers make the model more stable. Finally, the model outputs the prediction result. By fusing information from different modalities, namely the network topology and the semantic features of network nodes, the model gains more effective information and improves prediction accuracy.

Extensive experiments on multiple datasets compare the model with several mainstream prediction methods. The results show that the proposed model accurately predicts interactions between compounds and proteins, and that encoding the network topology greatly improves model performance.
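
The compound encoder described in the abstract (a graph convolution layer for neighborhood features, a multi-head self-attention layer for global information, and residual connections between them) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the mean-aggregation form of the graph convolution, the absence of normalization layers and of an attention output projection, the feature sizes, and every name used here (`graph_conv`, `multi_head_self_attention`, `encode_compound`, `demo`, `params`) are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_conv(h, adj, w):
    """Local step (assumed form): mean over each atom and its bonded neighbours, then ReLU."""
    a_hat = adj + np.eye(adj.shape[0])                 # self-loops keep each atom's own features
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)  # row-normalise so aggregation is a mean
    return np.maximum(a_norm @ h @ w, 0.0)

def multi_head_self_attention(h, wq, wk, wv, n_heads):
    """Global step: every atom attends to every atom, regardless of bond distance."""
    n, d = h.shape
    d_head = d // n_heads  # d must be divisible by n_heads

    def split(m):  # (n, d) -> (n_heads, n, d_head)
        return m.reshape(n, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(h @ wq), split(h @ wk), split(h @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, n, n)
    out = softmax(scores, axis=-1) @ v                   # (n_heads, n, d_head)
    return out.transpose(1, 0, 2).reshape(n, d)          # concatenate the heads

def encode_compound(x, adj, params, n_heads=2):
    """Graph convolution followed by global self-attention, each wrapped in a residual."""
    h = x + graph_conv(x, adj, params["w_gc"])
    h = h + multi_head_self_attention(h, params["wq"], params["wk"], params["wv"], n_heads)
    return h

def demo():
    rng = np.random.default_rng(0)
    n_atoms, d = 5, 8
    # Hypothetical toy molecule: a path graph 0-1-2-3-4, so atoms 0 and 4 are four bonds apart.
    adj = np.zeros((n_atoms, n_atoms))
    for i in range(n_atoms - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    # Random weights stand in for trained parameters.
    params = {name: rng.normal(scale=0.1, size=(d, d)) for name in ("w_gc", "wq", "wk", "wv")}
    x = rng.normal(size=(n_atoms, d))
    x_far = x.copy()
    x_far[4] += 1.0  # perturb only the far-end atom

    # Does atom 0's representation stay the same after the perturbation?
    local_same = np.allclose(graph_conv(x, adj, params["w_gc"])[0],
                             graph_conv(x_far, adj, params["w_gc"])[0])
    global_same = np.allclose(encode_compound(x, adj, params)[0],
                              encode_compound(x_far, adj, params)[0])
    return local_same, global_same
```

Calling `demo()` contrasts the two receptive fields on a five-atom chain: perturbing atom 4 leaves atom 0's single-layer graph-convolution output unchanged, because the two atoms are not neighbours, while atom 0's output from `encode_compound` does change, because the attention layer lets it read from every atom directly.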
Keywords/Search Tags: compound, protein, attention, CPI