Research on a Protein Secondary Structure Prediction Method Based on the PBIL Algorithm

Posted on: 2007-08-11
Degree: Master
Type: Thesis
Country: China
Candidate: L Yu
Full Text: PDF
GTID: 2120360215969938
Subject: Software engineering
Abstract/Summary:
Protein structure prediction has become an important research area in the post-genome era. Predicting protein secondary structure is particularly important because secondary structure is the bridge between the primary structure and the three-dimensional structure. Predicting the secondary structure of a protein from its amino acid sequence is a global optimization problem, and the lack of a powerful optimization method is the key obstacle to solving it. Evolutionary Algorithms (EAs) are a class of stochastic search algorithms that have advantages in solving combinatorial optimization problems. In this thesis, an evolutionary algorithm named PBIL (Population-Based Incremental Learning) is applied to protein secondary structure prediction. The main work is summarized as follows.

First, this thesis analyzes the basic principle and the disadvantages of the PBIL algorithm and proposes an evolutionary algorithm based on decimal probability learning, named D-PBIL. The algorithm adopts a decimal encoding and supports a multi-allele structure, so it avoids two disadvantages of PBIL, code redundancy and probability conflict, and improves the algorithm's efficiency. Information entropy is introduced to evaluate the degree of population evolution, which makes the algorithm easier to use and its results more stable. Finally, the algorithm is validated and analyzed on JSP and FJSP. The conclusion is that D-PBIL has good combinatorial optimization capability, has few parameters, and makes encodings easy to design. Moreover, it is more widely applicable and more convenient than a genetic algorithm.

Second, based on the D-PBIL algorithm, a protein secondary structure prediction method called PPDPsec is proposed. PPDPsec uses the CB513 data set as its learning set, and its objective function is designed from the probability that amino acid fragments adopt fragments of the three kinds of secondary structure. Proteins from the CASP4 data sets are then used as prediction targets to validate the method.
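
The abstract does not give the form of the PPDPsec objective function, so the following is only a minimal sketch of one way a fragment-probability objective could look. The fragment length `k`, the three-state labels `H`/`E`/`C`, the add-one smoothing, and the function names `fragment_table` and `score` are all assumptions made for illustration, not details taken from the thesis.

```python
import math
from collections import Counter, defaultdict

STATES = "HEC"  # assumed three-state labels: helix, strand, coil


def fragment_table(training_pairs, k=3):
    """Count how often each amino acid fragment of length k is observed
    with each secondary structure fragment in the learning set."""
    counts = defaultdict(Counter)
    for seq, ss in training_pairs:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]][ss[i:i + k]] += 1
    return counts


def score(seq, candidate, counts, k=3, pseudo=1.0):
    """Sum of smoothed log-probabilities that each amino acid fragment
    adopts the corresponding fragment of the candidate structure."""
    n_labels = len(STATES) ** k  # number of possible structure fragments
    total = 0.0
    for i in range(len(seq) - k + 1):
        seen = counts.get(seq[i:i + k], Counter())
        p = (seen[candidate[i:i + k]] + pseudo) / (sum(seen.values()) + pseudo * n_labels)
        total += math.log(p)
    return total


# Hypothetical toy learning set (not CB513 data), only to show the call pattern.
toy_pairs = [("AAAGGG", "HHHCCC"), ("AAAVVV", "HHHEEE")]
toy_counts = fragment_table(toy_pairs)
```

Under these assumptions, an optimizer would search over candidate structure strings for a given sequence and keep the one with the highest `score`; a candidate that matches fragments seen in the learning set scores higher than one that does not.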
The results show that PPDPsec achieves good accuracy when predicting proteins of known structure and also gives good results for proteins of unknown structure. The more protein rules the method uses, the higher its prediction accuracy.

As a combinatorial optimization algorithm, D-PBIL has good search capability. PPDPsec, the protein secondary structure prediction method based on D-PBIL, is a new attempt to predict protein secondary structure from a non-biological, statistical standpoint, and it is a single-sequence method. Its principle is simple, it is easy to implement, and it is widely applicable. Because D-PBIL has a multi-allele structure, the prediction method is also readily extensible.
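
The abstract does not state D-PBIL's update rule, so the sketch below only illustrates the general idea of a PBIL variant with a multi-allele probability matrix and an entropy-based measure of convergence. The update toward the best individual of each generation, the normalized-entropy stopping threshold, and every name and default value (`multi_allele_pbil`, `lr`, `entropy_stop`, and so on) are assumptions, not the thesis's definitions.

```python
import math
import random


def normalized_entropy(prob):
    """Mean per-position Shannon entropy of the probability matrix,
    scaled to [0, 1] (1 = uniform, 0 = fully converged)."""
    n_alleles = len(prob[0])
    h = 0.0
    for row in prob:
        h -= sum(p * math.log(p) for p in row if p > 0)
    return h / (len(prob) * math.log(n_alleles))


def multi_allele_pbil(fitness, length, n_alleles, pop_size=50, lr=0.1,
                      entropy_stop=0.05, max_gen=500, seed=0):
    """Maximize `fitness` over integer strings with `n_alleles` values per position."""
    rng = random.Random(seed)
    # One probability distribution over alleles per position, initially uniform.
    prob = [[1.0 / n_alleles] * n_alleles for _ in range(length)]
    best, best_fit = None, float("-inf")
    for _ in range(max_gen):
        # Sample a population directly from the probability matrix.
        pop = [[rng.choices(range(n_alleles), weights=row)[0] for row in prob]
               for _ in range(pop_size)]
        gen_best = max(pop, key=fitness)
        gen_fit = fitness(gen_best)
        if gen_fit > best_fit:
            best, best_fit = gen_best, gen_fit
        # Shift each position's distribution toward the allele chosen by
        # this generation's best individual; each row still sums to 1.
        for row, allele in zip(prob, gen_best):
            for a in range(n_alleles):
                target = 1.0 if a == allele else 0.0
                row[a] = (1 - lr) * row[a] + lr * target
        # Stop once the population model has almost no diversity left.
        if normalized_entropy(prob) < entropy_stop:
            break
    return best, best_fit, prob
```

For three-state secondary structure prediction, one natural (assumed) encoding sets `n_alleles=3`, one allele per structure state, so each position of an individual is a single decimal digit rather than a group of bits.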
Keywords/Search Tags: Evolutionary Computation, PBIL Algorithm, Prediction of Protein Secondary Structure