Prediction of Protein-Protein Interaction Sites Based on the Combination of Multiple Classifiers

Posted on: 2007-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Shao
GTID: 2190360182478986
Subject: Biomedical engineering
Abstract/Summary:
Biological functions are carried out through interactions between biological molecules, and proteins are the primary performers of those functions. Identifying protein-protein interaction sites is therefore essential for understanding the mechanisms of protein function. Interaction sites can be determined experimentally, but this is very time-consuming and often impractical, so researchers have sought theoretical and computational methods for predicting them. This dissertation investigates several methods for classifying or predicting protein-protein interaction sites from protein primary sequences, including the Support Vector Machine and Multiple Classifiers Combination. The main contributions are summarized as follows:

(1) A database covering four types of protein-protein interaction was constructed. It comprises 133 protein sequences.

(2) Two definitions of interface residues were used to label each residue in a protein sequence as an interface or a non-interface residue. Based on the sequence profile information of the residues, we constructed information windows, each made up of several contiguous residues, and classified these windows with a Support Vector Machine. The results show that: 1) windows containing several contiguous residues are classified more accurately than windows containing a single residue, with the best accuracy obtained for windows of 7 contiguous residues; 2) the accuracy of the Support Vector Machine does not keep increasing as more contiguous residues are added. An information-fusion problem exists among contiguous residues: when their information is contradictory or uncoordinated, classification accuracy decreases; when it is complementary, accuracy increases.

(3) Multiple classifiers combination methods are proposed for classifying protein-protein interaction sites. We investigated the theoretical framework of multiple classifiers combination and used the Voting Based On Information algorithm and the Dynamic Classifier Selection with Local Accuracy (DCS-LA) algorithm to predict protein-protein interaction sites. The results show that both combination algorithms improve classification accuracy, and that combining multiple classifiers can, to a certain degree, capture more information about protein-protein interaction sites.
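The two core ideas in the abstract, windows of contiguous residues and dynamic classifier selection with local accuracy, can be illustrated with a minimal sketch. It is not the thesis implementation. The function names, the zero-padding at sequence ends, the Euclidean neighbourhood, the value of `k`, and the toy classifiers are all assumptions made for illustration. The thesis classifies windows with a Support Vector Machine; here any callable that maps a feature vector to a label stands in for a trained classifier.

```python
from math import dist


def residue_windows(profile, size=7):
    """Build one feature vector per residue from a sequence profile.

    `profile` is a list of per-residue rows (e.g. profile scores).
    Each vector concatenates the rows of the `size` contiguous residues
    centred on the target residue. Zero-padding at the sequence ends is
    an assumption of this sketch.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    width = len(profile[0])
    pad = [[0.0] * width for _ in range(half)]
    padded = pad + [list(row) for row in profile] + pad
    return [
        [value for row in padded[i:i + size] for value in row]
        for i in range(len(profile))
    ]


def dcs_la_predict(classifiers, x, val_X, val_y, k=3):
    """Pick the classifier that is most accurate near `x`, then use it.

    Local accuracy is the fraction of the k nearest validation samples
    (Euclidean distance, an assumption) that a classifier labels correctly.
    Ties go to the classifier listed first.
    """
    nearest = sorted(range(len(val_X)), key=lambda i: dist(x, val_X[i]))[:k]

    def local_accuracy(clf):
        return sum(clf(val_X[i]) == val_y[i] for i in nearest) / len(nearest)

    best = max(classifiers, key=local_accuracy)
    return best(x)


# Toy usage: a 3-residue profile with 2 values per residue, window size 3.
toy_profile = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_windows = residue_windows(toy_profile, size=3)

# Two deliberately weak stand-in classifiers, each good in one region only.
always_non_interface = lambda v: 0
always_interface = lambda v: 1
val_X = [[0.0], [0.1], [0.2], [0.9], [1.0]]
val_y = [0, 0, 0, 1, 1]
label = dcs_la_predict(
    [always_non_interface, always_interface], [0.95], val_X, val_y
)
```

Because the selection is made per sample, a classifier that is weak overall can still be chosen in the region where it is locally reliable. This is one way a combination can draw on complementary information.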
Keywords/Search Tags: protein-protein interaction sites, residue window, Support Vector Machine, Multiple Classifiers Combination