Research On The Algorithm For Predicting Protein Complexes Using Logistic Regression Model Combined With Local Structural Information

Posted on: 2018-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: G Zhang
GTID: 2310330512493159
Subject: Computer technology
Abstract/Summary:
A protein complex is a group of proteins that interact with each other or jointly perform a specific molecular function, and the study of protein complexes is an important basis for understanding cellular processes, life activities, and disease mechanisms. Identifying protein complexes is an active research area in proteomics. Computational methods not only save a large amount of time and resources but can also identify potential protein complexes. Existing protein complex detection methods can be broadly divided into two categories: unsupervised and supervised learning methods. Most unsupervised learning methods assume that protein complexes lie in dense regions of protein-protein interaction (PPI) networks, even though many true complexes do not form dense subgraphs. Supervised learning methods extract features from true complexes and train a classification model to guide the search for complexes. However, insufficient extracted features, noise in the PPI data, and the incompleteness of the known complex data can make the classification model imprecise.

Here, we propose a new robust scoring function that combines a logistic regression classification model with local structural information. We selected 24 features to improve the accuracy of the classification model, and combining the classification model with local structural information reduces the negative effects of noise in the PPI data. Based on this scoring function, we designed a novel complex detection method that uses the same parameter settings for all PPI networks, which keeps the algorithm simple.

Experimental results on four large-scale yeast data sets show that our algorithm outperforms the unsupervised learning methods it was compared with, according to the MMR score and the comprehensive score. Among supervised learning methods, the proposed algorithm is second only to ClusterEPs. Finally, the results of GO analysis indicate that the proposed algorithm is feasible for detecting potential protein complexes.
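
The abstract does not give the form of the scoring function, so the following is only a minimal Python sketch of the general idea: a logistic regression probability computed from subgraph features, multiplied by a local structural term. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two features (size and density, where the thesis uses 24), the hand-picked weights and bias (a real model would be trained on known complexes), the cohesiveness term, and the product used to combine the two parts.

```python
import math
from itertools import combinations


def internal_edges(graph, nodes):
    """Count edges of `graph` with both endpoints inside `nodes`."""
    return sum(1 for u, v in combinations(sorted(nodes), 2) if v in graph[u])


def density(graph, nodes):
    """Fraction of possible node pairs inside `nodes` that are connected."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return internal_edges(graph, nodes) / (n * (n - 1) / 2)


def cohesiveness(graph, nodes):
    """Hypothetical local structural term: internal / (internal + boundary) edges."""
    inside = internal_edges(graph, nodes)
    boundary = sum(1 for u in nodes for v in graph[u] if v not in nodes)
    total = inside + boundary
    return inside / total if total else 0.0


def logistic_probability(features, weights, bias):
    """Standard logistic model: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def combined_score(graph, nodes, weights, bias):
    """Assumed combination: classifier probability times the structural term."""
    features = [len(nodes), density(graph, nodes)]
    return logistic_probability(features, weights, bias) * cohesiveness(graph, nodes)


# Toy PPI network as an adjacency dict: a triangle A-B-C with a tail C-D-E.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

# Hypothetical, untrained parameters chosen only to make the example run.
weights = [0.1, 2.0]
bias = -1.0

triangle = {"A", "B", "C"}
chain = {"C", "D", "E"}
triangle_score = combined_score(graph, triangle, weights, bias)
chain_score = combined_score(graph, chain, weights, bias)
```

With these toy parameters the triangle scores higher than the chain, because it is both denser and better separated from the rest of the network. A structural term of this kind can lower the score of a candidate that the classifier alone would accept but that is poorly separated from its surroundings.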
Keywords/Search Tags: Protein-Protein Interaction Networks, Protein Complex, Complex Detection Method, Logistic Regression, Supervised