
Predicting Protein-protein Interactions From Protein Sequence Based On Multiple Feature Extractions

Posted on: 2018-06-27 | Degree: Master | Type: Thesis
Country: China | Candidate: M Y Du
GTID: 2370330605953554 | Subject: Software engineering
Abstract/Summary:
Proteins play an important role in almost all life activities, and their functions are carried out through protein-protein interactions (PPIs), so the study of PPIs has always been a key part of proteomics. Traditional experimental methods can no longer meet the growing demand of related research, because only a small number of protein pairs can be verified in a single experiment. Computational methods have therefore become the first choice for PPI prediction. In this thesis, we constructed a more effective method for predicting protein-protein interactions based on protein sequence information. First, we accurately extracted the information contained in protein sequences so that interacting and non-interacting protein pairs can be effectively distinguished. Among the variety of protein sequence feature extraction methods, we selected three representative methods and compared them experimentally. The results show that using a single feature extraction method has certain limitations and that accuracy can be improved. Then, based on support vector machines, three independent classifiers were constructed, one for each sequence encoding method. Finally, the Stacking method from ensemble learning was used as the classifier fusion strategy to indirectly fuse these protein sequence feature extraction methods. Tests on a data set of 9952 Saccharomyces cerevisiae protein pairs show that the prediction accuracy reached 86.74% and that the method effectively reduced the tendency toward overly high specificity. On the independent test set, the method also outperformed existing methods, suggesting that it significantly improves the accuracy of PPI prediction.
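As a rough illustration of what sequence-based feature extraction for a protein pair can look like, the sketch below encodes two sequences and concatenates the results into one pair-level vector. The abstract does not name the three encodings used in the thesis, so amino acid composition and dipeptide composition are assumptions chosen only because they are common and simple. The function names and the toy sequences are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the 20-dimensional relative frequency of each standard residue."""
    seq = seq.upper()
    valid = [aa for aa in seq if aa in AMINO_ACIDS]
    total = len(valid) or 1  # avoid division by zero on empty input
    return [valid.count(aa) / total for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400-dimensional relative frequency of adjacent residue pairs."""
    seq = seq.upper()
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(keys, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in keys]


def encode_pair(seq_a, seq_b, encoder):
    """Concatenate the encodings of two proteins into one pair-level vector."""
    return encoder(seq_a) + encoder(seq_b)


# Hypothetical toy sequences, not taken from the thesis data set.
pair_vector = encode_pair("MKTAYIAKQR", "GAVLIMKT", amino_acid_composition)
print(len(pair_vector))
```

In a pipeline like the one the abstract describes, the pair vectors from each encoding would train their own support vector machine, and the outputs of those classifiers would then be combined by a Stacking meta-learner.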
Keywords/Search Tags:protein-protein interactions, protein sequence, feature extraction, support vector machine, classifier fusion