
Protein Secondary Structure Prediction Based On PBIL Algorithm

Posted on: 2010-10-13; Degree: Master; Type: Thesis
Country: China; Candidate: W B Liu; Full Text: PDF
GTID: 2120360278968532; Subject: Computer software and theory
Abstract/Summary:
Protein secondary structure prediction is an important research area in bioinformatics. Predicting secondary structure from the amino acid sequence is an important step toward understanding protein structure and function. In molecular biology, if secondary structure can be predicted successfully, the three-dimensional structure of a protein can be predicted more accurately, which in turn contributes to protein sequence analysis, the study of protein folding, and the determination of protein function. For this reason, the problem has attracted protein structure specialists, bioinformaticians, and artificial intelligence researchers in recent decades.

In this paper, the population-based incremental learning (PBIL) algorithm is applied to protein secondary structure prediction. The method is a probabilistic optimization method driven by a series of simple rules learned from known sequence-structure data. The key component of the method is the fitness function, that is, its reward terms and penalty terms. Many relationships exist between amino acids and secondary structures, and these relationships can be expressed as knowledge or rules. The more reward and penalty terms are discovered and built into the fitness function, the better the result. The paper therefore focuses on the following points.

(1) The traditional PBIL algorithm applies only to binary coding. This paper extends it so that it works with any integer encoding, and uses the value of information entropy as the termination criterion for the evolution.

(2) A probabilistic data model is built from the CB513 protein database, and the amino acid residues are encoded. Separate fitness functions are designed for single residues and for groups of two, three, and four residues. The experimental results show that the PBIL algorithm can predict protein secondary structure effectively.

(3) Mining the regularities of protein secondary structure is very important. This paper uses continuity rules to predict secondary structure and computes slice statistics over the CB513 protein database. The statistics show that certainty is highest when six consecutive residues share the same secondary structure. Experiments show that adding this rule to the prediction gives good results.

(4) The Chou-Fasman method is an empirical parametric method based on single amino acid residues. When the secondary structure produced by the Chou-Fasman method and the one generated from the random probability P are the same, reward terms are added to, or penalty terms subtracted from, the fitness function. This both strengthens the optimizing capacity of the fitness function and improves the prediction accuracy.
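The integer-coded PBIL with an entropy-based stopping rule mentioned in point (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The three-state alphabet `STATES`, the target string `TARGET`, the `toy_fitness` function (one reward term and one penalty term), the learning rate, the population size, and the mean-entropy threshold are all assumptions chosen for illustration. In particular, the abstract does not say how entropy enters the stopping rule, so stopping when the mean per-position entropy falls below a tolerance is only one plausible reading.

```python
import math
import random

STATES = "HEC"  # assumed 3-state alphabet: helix, strand, coil
# Hypothetical target structure, used only to build a toy fitness function.
TARGET = [STATES.index(c) for c in "CCHHHHHHEEEC"]


def entropy(dist):
    """Shannon entropy (in bits) of one discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def toy_fitness(ind):
    """Toy reward/penalty fitness (an assumption, not the thesis's function)."""
    # Reward term: positions that agree with the hypothetical target.
    reward = sum(a == b for a, b in zip(ind, TARGET))
    # Penalty term: interior positions that differ from both neighbours.
    penalty = sum(
        ind[i] != ind[i - 1] and ind[i] != ind[i + 1]
        for i in range(1, len(ind) - 1)
    )
    return reward - 0.5 * penalty


def pbil(length, n_states, fitness, pop_size=50, lr=0.1,
         entropy_tol=0.05, max_gen=500, seed=0):
    """PBIL over integer strings: one categorical distribution per position."""
    rng = random.Random(seed)
    prob = [[1.0 / n_states] * n_states for _ in range(length)]
    best, best_fit, gens = None, float("-inf"), 0
    for gens in range(1, max_gen + 1):
        # Sample a population from the current probability model.
        pop = [
            [rng.choices(range(n_states), weights=row)[0] for row in prob]
            for _ in range(pop_size)
        ]
        leader = max(pop, key=fitness)
        leader_fit = fitness(leader)
        if leader_fit > best_fit:
            best, best_fit = leader, leader_fit
        # Shift each position's distribution toward the leader's value.
        # Each row still sums to 1 after the update.
        for row, value in zip(prob, leader):
            for k in range(n_states):
                row[k] = (1 - lr) * row[k] + lr * (1.0 if k == value else 0.0)
        # Assumed stopping rule: the model has (almost) stopped being random.
        mean_entropy = sum(entropy(row) for row in prob) / length
        if mean_entropy < entropy_tol:
            break
    return best, best_fit, gens


best, best_fit, gens = pbil(len(TARGET), len(STATES), toy_fitness)
print("".join(STATES[v] for v in best), best_fit, gens)
```

With binary coding each position needs a single probability. The integer version keeps one probability per state and position instead, and the entropy of those distributions gives a direct measure of how far the model has converged.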
Keywords/Search Tags:PBIL, Prediction of Protein Secondary Structure, Evolutionary Algorithms, Information Entropy