Combining Feature- And Template-based Strategies To Predict Nucleic Acid-binding Residues In Proteins

Posted on: 2016-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: X X Yang
Full Text: PDF
GTID: 2180330461993809
Subject: Bioinformatics
Abstract/Summary:
Protein-nucleic acid interactions play important roles in numerous life processes, such as gene expression and gene regulation. Computational prediction of nucleic acid-binding regions in proteins helps illuminate the mechanisms of these interactions. Most existing algorithms use either a feature-based or a template-based strategy alone to recognize nucleic acid-binding residues, and their predictive power leaves room for improvement. To address these limitations, we established structure- and sequence-based hybrid algorithms that integrate the two strategies.

The RBRDetector algorithm combines local and global similarities of protein structures to predict RNA-binding residues. First, we developed a feature-based prediction model based on local similarity, in which evolutionary conservation, local geometric features, and network topological features, each integrated with the local microenvironment of the target residue, served as inputs to a support vector machine. Second, using global similarity, we constructed a template-based predictor that recognizes putative RNA-binding regions by structurally aligning the query protein to RNA-binding proteins with known structures. Finally, we combined the results of the two predictors using a piecewise function, which greatly improved the prediction accuracy. By validating our predictors on diverse types of structural data, including holo structures, apo structures, and theoretical models, we demonstrated that RBRDetector had clear advantages over existing structure-based algorithms.

Although structural information allows nucleic acid-binding residues to be recognized more accurately, the limited number of available protein structures restricts the scope of application. In contrast, sequence-based predictors are more broadly applicable in practice. Thus, we developed a hybrid algorithm, SNBRFinder, that detects nucleic acid-binding residues from sequence information. This algorithm uses a profile hidden Markov model to retrieve a reliable template for the query sequence and uses a position-specific scoring matrix together with complementary sequence features to characterize the sequential microenvironment of each residue; the outputs of the template-based and feature-based methods are then combined to give the final prediction. To verify the effectiveness of this algorithm, we assessed its performance on a variety of nucleic acid-binding protein datasets. The results showed that our sequence-based template method is comparable to its structure-based counterparts and that incorporating additional sequence features effectively improves the accuracy of our feature-based method. By leveraging the complementarity between its component methods, the hybrid algorithm SNBRFinder shows a greater ability to identify nucleic acid-binding residues.
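
The abstract says that the outputs of the feature-based and template-based predictors are combined, using a piecewise function in RBRDetector, but it does not give the function itself. The Python sketch below is therefore only an illustration of what such a piecewise combination could look like, not the thesis's actual formula. The names `combine_scores` and `call_binding`, the template-quality measure, the cutoff values, and the weighting are all hypothetical assumptions introduced here.

```python
from typing import List, Sequence


def combine_scores(
    feature_scores: Sequence[float],
    template_hits: Sequence[bool],
    template_quality: float,
    quality_cutoff: float = 0.5,   # hypothetical reliability threshold
    template_weight: float = 0.5,  # hypothetical weight of the template vote
) -> List[float]:
    """Combine per-residue scores with an illustrative piecewise rule.

    feature_scores   : per-residue scores in [0, 1] from a feature-based
                       model (for example, a classifier's probability output).
    template_hits    : True where the aligned template residue is annotated
                       as nucleic acid-binding.
    template_quality : a single reliability value for the template alignment
                       (assumed here to lie in [0, 1]).
    """
    if len(feature_scores) != len(template_hits):
        raise ValueError("feature_scores and template_hits must have equal length")

    # Branch 1: the template is judged unreliable, so fall back to the
    # feature-based scores alone.
    if template_quality < quality_cutoff:
        return list(feature_scores)

    # Branch 2: the template is judged reliable, so blend both predictors.
    return [
        (1.0 - template_weight) * score + template_weight * (1.0 if hit else 0.0)
        for score, hit in zip(feature_scores, template_hits)
    ]


def call_binding(scores: Sequence[float], cutoff: float = 0.5) -> List[bool]:
    """Label a residue as binding when its combined score reaches the cutoff."""
    return [score >= cutoff for score in scores]


if __name__ == "__main__":
    svm_like_scores = [0.2, 0.8, 0.4]
    hits = [True, False, True]
    combined = combine_scores(svm_like_scores, hits, template_quality=0.9)
    print(combined)
    print(call_binding(combined))
```

The point of the piecewise form in this sketch is that the template-based prediction only contributes when a sufficiently reliable template exists; otherwise the feature-based prediction is used on its own, which is one way the two strategies can complement each other.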
Keywords/Search Tags: protein-nucleic acid interactions, nucleic acid binding residues, structural alignment, hidden Markov model, position-specific scoring matrix