
Study Of The Method For The Protein-protein Interactions Prediction And The Development Of The System For The Protein Supersecondary Structure Prediction

Posted on: 2011-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: L Yang
Full Text: PDF
GTID: 2120360308455310
Subject: Bioinformatics
Abstract/Summary:
Identifying protein–protein interactions (PPIs) and protein structures is crucial for elucidating protein functions and for understanding the biological processes in a cell, and it has been a focus of post-proteomic research. With the huge amount of protein sequence information provided by genome-sequencing projects, there is a growing demand for advanced computational methods that predict protein–protein interactions and protein structures from sequence information alone. In this thesis, we develop computational methods for protein–protein interaction prediction and protein supersecondary structure prediction. The thesis consists of the following parts:
(1) We propose a sequence-based method for protein–protein interaction prediction built on a novel representation of local protein sequence descriptors. Local descriptors account for the interactions between residues in both continuous and discontinuous regions of a protein sequence, so the method can capture multiple overlapping continuous and discontinuous binding patterns within a sequence. Combined with a k-nearest neighbors learner and applied to the PPI data of S. cerevisiae, the proposed method achieved excellent results. The final prediction model was also tested on an independent data set of E. coli PPIs, with good performance. The combination of the local descriptor representation with an SVM was evaluated on the independent E. coli data set as well; the results indicate that the representation performs stably when combined with different machine learning techniques.
Given the complex nature of PPIs, the performance of our method is promising, and it can serve as a helpful supplement for PPI prediction.
(2) We present a new method for predicting hairpins in proteins that combines a radial basis function neural network (RBFNN) with a new feature representation scheme based on autocovariance (AC). The RBFNN is a feedforward neural network model that is widely applied in fields such as pattern recognition. Unlike previous methods, ours adopts an AC-based feature representation. AC describes the level of correlation between amino acids a certain number of positions apart along the whole sequence, in terms of a specific physicochemical property. As a result, it takes into account the effect of neighboring residues, which is important for representing hairpin information. We train and test the method on a data set of 1926 protein chains using 5-fold cross-validation. The results indicate that it yields significantly better prediction accuracy than previously published methods.
(3) Based on the RBFNN method described above, we developed a software system for predicting β-hairpins in proteins. The software lets users submit a protein sequence, run the prediction of their choice, and receive the results within a few seconds.
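The autocovariance (AC) features mentioned in part (2) can be illustrated with a minimal sketch. It uses the common definition of AC for a single physicochemical property: the mean product of mean-centered property values at positions a fixed lag apart. The abstract does not state which properties, lags, or normalization the thesis uses, so the Kyte-Doolittle hydropathy scale, the function name `autocovariance`, and the `max_lag` parameter are assumptions made only for illustration.

```python
# Illustrative sketch of autocovariance (AC) features for one property.
# Assumption: Kyte-Doolittle hydropathy is used here as an example scale;
# the thesis may use other physicochemical properties and normalization.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocovariance(seq, scale, max_lag):
    """Return [AC(1), ..., AC(max_lag)] for `seq` under one property scale.

    AC(lag) = 1/(n - lag) * sum_i (x_i - mean) * (x_{i+lag} - mean),
    where x_i is the property value of residue i and n is the length.
    """
    values = [scale[aa] for aa in seq]
    n = len(values)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and len(seq) - 1")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    features = []
    for lag in range(1, max_lag + 1):
        # Products of centered values for residues `lag` positions apart.
        total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    # Hypothetical short peptide, used only to show the call.
    print(autocovariance("ACDEFGHIKLMNPQ", KYTE_DOOLITTLE, max_lag=3))
```

Each sequence maps to a fixed-length vector (one value per lag and per property), which is what lets sequences of different lengths feed a classifier with a fixed input size, such as an RBFNN.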
Keywords/Search Tags:Computational method, Protein sequence, Protein-protein interactions, Local descriptors, Feature representation, Protein structure, β-hairpin, RBFNN