
Research And Implementation Of Protein Complex Recognition Based On Graph Neural Network

Posted on: 2022-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: X M Kang
Full Text: PDF
GTID: 2480306737478874
Subject: Computer technology
Abstract/Summary:
Protein complexes are modules formed by the interaction of different proteins, and they play an indispensable role in the biological processes of cells. Identifying protein complexes is therefore very important for understanding cell functions and biological processes. However, experimental methods for identifying protein complexes are complicated and have low error tolerance, computational methods based on single-source data do not characterize proteins comprehensively enough, and clustering-based computational methods cannot identify protein complexes efficiently because of data noise. Using graph neural networks, this thesis studies the problems of noise in protein complex data and the limited representation offered by single-source data. Based on this analysis and building on the GraphSAGE model, the FF-GraphSAGE model, which incorporates weighted feature fusion with PPI data, is proposed for protein complex identification. The identification problem is transformed into a multi-label binary classification problem. The proposed method uses feature-fused protein sequence data and PPI network data as the source data, constructs the FF-GraphSAGE model integrated with a variational mechanism for feature extraction, and then uses the resulting feature vectors as input to train a two-layer MLP classifier that produces the final classification result. Experiments show that the accuracy of this method when trained on the five public data sets HPRD, E. coli, C. elegans, Drosophila and Human reaches up to 93.5% in every case, an increase of 5% compared with the original model. Finally, the PPI network visual analysis platform GNN-PC was designed and developed. The platform integrates 8 graph neural network methods, such as GCN and GraphSAGE, for protein complex identification, and implements 5 evaluation methods, such as F-measure and Accuracy. In addition, Gephi is applied to large-scale protein interaction networks to visualize the PPI network and the clustering results, so that biological phenomena can be explained more clearly.
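As a rough illustration of the two ideas named in the abstract, weighted feature fusion and GraphSAGE-style neighbor aggregation, the following minimal sketch may help. It is not the thesis's FF-GraphSAGE implementation: the abstract does not state the fusion formula, so a convex combination with a hypothetical weight `alpha` is assumed; the function names, the toy graph and the weight matrix are invented for illustration; and the variational component and the two-layer MLP classifier are omitted.

```python
from typing import Dict, List

Vector = List[float]


def fuse_features(seq: Vector, topo: Vector, alpha: float) -> Vector:
    """Assumed fusion rule: alpha * seq + (1 - alpha) * topo, element-wise."""
    if len(seq) != len(topo):
        raise ValueError("feature vectors must have the same length")
    return [alpha * s + (1.0 - alpha) * t for s, t in zip(seq, topo)]


def mean_vector(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector if the node has no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sage_mean_layer(
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
    weight: List[Vector],
) -> Dict[str, Vector]:
    """One GraphSAGE layer with a mean aggregator:
    h_v' = ReLU(W @ concat(h_v, mean of neighbor features)).
    """
    out: Dict[str, Vector] = {}
    for node, h in features.items():
        agg = mean_vector([features[u] for u in neighbors.get(node, [])], len(h))
        concat = h + agg  # list concatenation: self features, then aggregate
        out[node] = [
            max(0.0, sum(w * x for w, x in zip(row, concat))) for row in weight
        ]
    return out


# Toy PPI graph with three proteins; all values are made up.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
weight = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]  # 2 outputs x 4 inputs

embeddings = sage_mean_layer(features, neighbors, weight)
fused = fuse_features([1.0, 0.0], [0.0, 1.0], alpha=0.25)
```

In a full pipeline along the lines the abstract describes, the fused vectors would serve as the input node features, and the resulting embeddings would be passed to a classifier that makes one binary decision per label.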
Keywords/Search Tags:Protein complex, protein-protein interaction, feature fusion, GraphSAGE, VGAE