
Research On Protein-Protein Interactions Based On Integrated Neural Network

Posted on: 2013-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Cui
Full Text: PDF
GTID: 2230330395965488
Subject: Computer software and theory
Abstract/Summary:
Protein-protein interactions play an important role in metabolism, signal transduction and recognition, cell-cycle regulation, the formation of protein complexes, cancer, and other processes. Studying protein-protein interactions therefore not only contributes to a comprehensive understanding of life processes but also facilitates clinical therapeutics, pharmaceutical design, and the search for drug targets. With the rapid development of high-throughput experimental technology, abundant protein sequence information has been measured. Determining from this mass of sequence information which proteins interact, which do not, and at which amino acid residues the interaction sites lie are problems that remain to be solved. However, these technologies suffer from high error rates because of their inherent limitations; moreover, the mechanism of protein interactions is complex, which is a challenge for bioinformatics research. This thesis studies methods for predicting protein interactions based on sequences and intelligent algorithms. The main contents and contributions of the dissertation are summarized as follows:
(1) For protein interaction site prediction, a method based on integrating combined features is proposed. The object of study in interaction site prediction is a single amino acid residue, so extracting certain biological characteristics of a residue to decide whether it is an interaction site is a direct and effective approach. This work extracts three features from the amino acids of a protein: sequence profiles, entropy, and accessible surface area. The three features are combined in different ways, and the resulting four groups of samples form four different training sets. An RBF neural network classifier is trained on each of the four sample sets. Finally, GASEN is used to integrate the four base classifiers.
Experimental results show that different feature combinations affect the classifier's predictions differently. With sequence profiles as the base feature, accessible surface area is more effective than entropy at improving the prediction accuracy of the classifier. Training the base classifiers on sample sets built from different feature combinations increases the structural differences among the base classifiers. In this way the prediction accuracy after integration increases significantly, from 66.79% to 81.37%, which shows that the integrated prediction method based on combined features is effective.
(2) For protein-protein interaction prediction, two approaches are studied: integration of BP neural networks based on different encodings, and integration of RBF neural networks based on different negative sample sets. Because different encodings of the proteins can lead to different predictions, sample sets are created with three different encodings: vector plus encoding, vector minus encoding, and direct connection encoding. Of the three, direct connection encoding performs best. The base classifiers trained on the different encodings are integrated with GASEN, and the results show that integration in this way can also substantially improve prediction accuracy.
(3) There is no standard set of non-interacting proteins in protein-protein interaction studies. Four groups of non-interacting protein samples, also known as negative sample sets, are created using different methods. RBF neural networks are trained on the four sample sets, and the results show that the farther the negative samples are from those found in vivo, the better the prediction performance. This result points out a direction for biologists in building non-interacting protein datasets.
At the same time, training the classifiers on different negative sample sets is effective in improving the prediction accuracy of the integrated classifier, because this way of training also increases the heterogeneity of the base classifiers. The experiments above show that increasing the heterogeneity of the base classifiers in different ways can improve the predictive capacity of the integrated classifier.
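The three pair encodings named in (2), and the idea of integrating several base classifiers, can be sketched as follows. This is a minimal illustration, not the thesis implementation. It assumes each protein has already been turned into a fixed-length numeric vector (the abstract does not say how), all function names are hypothetical, and a plain majority vote stands in for GASEN, which instead selects a subset of the base classifiers with a genetic algorithm.

```python
from typing import List, Sequence

Vector = List[float]


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    # Plus and minus encodings are element-wise, so lengths must match.
    if len(a) != len(b):
        raise ValueError("protein vectors must have the same length")


def encode_plus(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Vector plus encoding: element-wise sum of the two protein vectors."""
    _check_same_length(a, b)
    return [x + y for x, y in zip(a, b)]


def encode_minus(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Vector minus encoding: element-wise difference a - b (order matters)."""
    _check_same_length(a, b)
    return [x - y for x, y in zip(a, b)]


def encode_concat(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Direct connection encoding: the two vectors joined end to end."""
    return list(a) + list(b)


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine 0/1 labels from several base classifiers.

    predictions[i][j] is the label classifier i gives to sample j.
    A sample is labelled 1 only if more than half of the classifiers
    say 1, so ties go to 0. This is an assumed stand-in for GASEN.
    """
    n_classifiers = len(predictions)
    return [1 if 2 * sum(column) > n_classifiers else 0
            for column in zip(*predictions)]


# Hypothetical feature vectors for one protein pair.
protein_a = [1.0, 2.0, 3.0]
protein_b = [0.5, 0.5, 1.0]

print(encode_plus(protein_a, protein_b))    # [1.5, 2.5, 4.0]
print(encode_minus(protein_a, protein_b))   # [0.5, 1.5, 2.0]
print(encode_concat(protein_a, protein_b))  # [1.0, 2.0, 3.0, 0.5, 0.5, 1.0]

# Made-up labels from three base classifiers on three samples.
print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]]))  # [1, 0, 1]
```

The plus and minus encodings keep the input dimension, while direct connection doubles it and preserves both vectors in full. That may be one reason it performs best in the reported comparison, though the abstract itself does not give an explanation.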
Keywords/Search Tags: protein interaction sites, protein-protein interaction, features, neural network, integrated