Protein Structure Class Prediction Based on Computational Intelligence

Posted on: 2011-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: N N Cai
Full Text: PDF
GTID: 2120360308957398
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Protein structure class prediction is the process of predicting a protein's structural class from its amino acid sequence. The basic assumption is that the amino acid sequence alone determines the structural class. Studying protein structure is of great significance: it helps us understand what a protein does, how proteins perform their biological functions, and how proteins interact with one another, and it plays an important role in biology, medicine, and pharmacy. The decade-long Human Genome Project produced vast amounts of biological sequence data, and the gap between the number of known protein sequences and the number of known structures keeps growing, so protein structure prediction is becoming increasingly urgent and important.

This thesis studies how to build a protein structure class prediction model that predicts the structural class of an amino acid sequence more accurately and efficiently. The work covers feature extraction methods for amino acid sequences, the structural design of neural networks, and the choice of intelligent optimization algorithms.

First, feature extraction from the amino acid sequence. To predict protein structure class, the information in the amino acid sequence must first be extracted and converted into data a computer can process; this is feature extraction. The choice of extraction method is critical, because different methods capture very different information. The main feature extraction methods today include the amino acid (AA) composition model, the dipeptide composition model, the polypeptide composition model, pseudo-amino acid composition (PseAA), features based on the physicochemical properties of amino acids, and multi-feature fusion; each extracts features from the sequence from a different angle. This thesis uses the above feature extraction methods and fuses the resulting features.
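As a rough illustration of two of the descriptors named above, the sketch below computes amino acid composition (20 residue frequencies) and dipeptide composition (400 adjacent-pair frequencies) and fuses them by plain concatenation. The function names are hypothetical, and concatenation is only an assumed fusion scheme; the abstract does not say how the thesis combines features.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def _check(seq):
    """Upper-case the sequence and reject non-standard residue symbols."""
    seq = seq.upper()
    bad = set(seq) - set(AMINO_ACIDS)
    if bad:
        raise ValueError(f"non-standard residues: {sorted(bad)}")
    return seq


def aa_composition(seq):
    """Amino acid composition: frequency of each of the 20 residues."""
    seq = _check(seq)
    if not seq:
        raise ValueError("empty sequence")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Dipeptide composition: frequency of each of the 400 adjacent pairs."""
    seq = _check(seq)
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    # Map each ordered pair such as "AC" to a fixed position in the vector.
    index = {"".join(p): k for k, p in enumerate(product(AMINO_ACIDS, repeat=2))}
    counts = [0] * len(index)
    n_pairs = len(seq) - 1
    for i in range(n_pairs):
        counts[index[seq[i:i + 2]]] += 1
    return [c / n_pairs for c in counts]


def fused_features(seq):
    """Assumed fusion scheme: concatenate both descriptors (20 + 400 values)."""
    return aa_composition(seq) + dipeptide_composition(seq)
```

Calling `fused_features` on a sequence returns a 420-dimensional vector that could be fed to a classifier; PseAA and physicochemical descriptors would add further components in the same way.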
Experiments show that the different feature extraction methods give different results depending on the data set and the classification model.

Second, construction of the classification model. Protein structure class prediction works by extracting useful information from amino acid sequences, learning and analyzing that information to derive rules, and then applying those rules to predict the structural class of sequences whose structure is unknown. Because the problem involves high-dimensional features and a large amount of computation, neural networks are very effective for it. A neural network has strong self-organizing, self-learning, and adaptive abilities, so it can quickly learn the features contained in the sequence and make structure predictions. Optimizing a neural network involves both structure optimization and parameter optimization. The choice of optimization algorithm is critical, because different algorithms differ in both running time and prediction accuracy. This thesis compares a variety of optimization algorithms and selects the more suitable one. Experimental results show that optimizing the parameters with particle swarm optimization (PSO) achieves good results, and that a BP (backpropagation) neural network can greatly improve prediction accuracy for protein tertiary structure. For the multi-class problem of protein tertiary structure prediction, this thesis transforms the multi-class problem into a combination of several two-class problems. Tests show that the single-output approach predicts more accurately than the multiple-output approach. To find a better network structure, this thesis first applies a flexible neural tree (FNT) to protein tertiary structure prediction, with PSO optimizing the network parameters and probabilistic incremental program evolution (PIPE) optimizing the network structure.
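To make the parameter-optimization step more concrete, here is a minimal sketch of standard global-best PSO with an inertia weight. It is not the thesis's implementation: the function names, swarm size, and coefficient values (`w`, `c1`, `c2`) are illustrative assumptions, and a toy sphere function stands in for the network's training error, where each particle position would be a vector of network weights.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, bound=1.0, seed=0):
    """Global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective standing in for a network's training error."""
    return sum(v * v for v in x)
```

A call such as `pso_minimize(sphere, dim=3)` returns the best position found and its objective value. For a real network, `objective` would decode the position into weights and return the error on the training set.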
Experiments show that the FNT model gives satisfactory results on a protein data set as large as 25PDB. It not only overcomes the limitations of earlier approaches, which could only use a fixed network structure and relied on heuristics to choose the number of hidden layers in advance, but also selects among the high-dimensional input features, which reduces the original input dimension.
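The decomposition of the multi-class problem into several two-class problems, mentioned in the abstract, might be organized as in the sketch below. Everything here is an assumption made for illustration: the four class labels are the usual structural classes rather than names taken from the abstract, the helper names are hypothetical, and choosing the class with the highest single-output score is one common combination rule, not necessarily the one the thesis uses.

```python
# Assumed class labels, for illustration only.
CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]


def binary_targets(labels, positive):
    """Relabel multi-class targets for one two-class problem (1 = positive)."""
    return [1 if y == positive else 0 for y in labels]


def combine_single_output(scorers, x):
    """Run every single-output model on x and pick the highest-scoring class.

    `scorers` maps a class label to a callable returning that model's score.
    """
    scores = {label: model(x) for label, model in scorers.items()}
    return max(scores, key=scores.get), scores


# Stub scorers standing in for trained single-output networks.
stub_scorers = {
    "all-alpha": lambda x: 0.2,
    "all-beta": lambda x: 0.7,
    "alpha/beta": lambda x: 0.4,
    "alpha+beta": lambda x: 0.1,
}
predicted, all_scores = combine_single_output(stub_scorers, x=[0.0] * 420)
```

Each single-output model would be trained on the targets produced by `binary_targets` for its own class, so every network only has to separate one class from the rest.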
Keywords/Search Tags: Pseudo-Amino Acid Composition, Protein Structure Class Prediction, Particle Swarm Optimization (PSO), Flexible Neural Tree (FNT)