Research On Prediction Of Protein Contact Maps

Posted on: 2008-08-11  Degree: Doctor  Type: Dissertation
Country: China  Candidate: G X Liu
GTID: 1100360212997677  Subject: Computer application technology
Abstract/Summary:
Computational intelligence (CI) is a nature-inspired computing methodology that simulates and studies intelligent behavior starting from the lowest level of living organisms. CI extends traditional styles of computation and provides a new approach to solving complex problems. It can reason and learn from incomplete and imprecise environments, which makes it a powerful computational tool for building more intelligent systems. Its main methods include fuzzy systems, artificial neural networks, genetic algorithms, and artificial immune systems. CI has advanced rapidly in recent years and has found applications in many fields, such as pattern recognition, machine learning, knowledge discovery, and data mining. One major application is in a newly evolved branch of science: bioinformatics. With the completion of the Human Genome Project (HGP) and of a growing number of other genomes, CI will play a bigger role in computational biology and bioinformatics. CI has been used to analyze genomic sequences, protein structures and folds, and gene expression data. It has also been used for fast sequence comparison and database search, automated gene identification, efficient modelling and storage of heterogeneous data, and so on. Since the work entails processing huge amounts of incomplete or ambiguous biological data, the learning ability of neural networks, the uncertainty-handling capacity of fuzzy sets, and the search potential of genetic algorithms are used synergistically. CI opens several possibilities in bioinformatics, particularly by generating low-cost, low-precision, good solutions. Proteins, the macromolecules encoded by DNA whose chemical unit is the amino acid, are of great importance to human biological activity. Amino acids combine into a continuous long chain with a spatial structure, and in this way proteins come into being.
Proteins are the basic components of life and are responsible for carrying out the functions of the body's cells. Genome sequencing results indicate that the human body contains about one hundred thousand different kinds of proteins, each with a unique function and purpose, and that a protein carries out its function through the interaction between its structure and other molecules. Knowing the structure of a protein is therefore the key to understanding its function in detail. It is no exaggeration to say that protein structure prediction is one of the most important research areas of bioinformatics in the twenty-first century. In the post-genome era, the sharp increase in biological information calls for batch processing by computer, which led to the birth of bioinformatics. Currently, the main research fields of bioinformatics are gene regulation and the study of protein structure and function, and protein structure prediction is the preliminary step of the latter. Within structure prediction, secondary structure prediction has matured, whereas 3D structure prediction is still at an early stage and needs further investigation. Present protein structure prediction methods can be broadly classified into ab initio prediction based on the minimal energy principle and methods that learn from related protein information. Each has its advantages and shortcomings: the energy minimization method is more adaptable and highly independent, but its energy function is hard to formulate. Even with a comparatively precise energy function, the large computational scale caused by numerous parameters and the tiny energy differences between conformations, on the order of only 1 kcal/mol, make prediction difficult.
Prediction using related protein information is more precise, especially for homologous proteins, but it is severely restricted by the database of known protein structures and is less general. The accuracy of methods that predict three-dimensional structures directly from amino acid sequences is not yet high enough, so intermediate steps, such as residue contact prediction and residue spatial distance prediction, have been put forward and have developed rapidly in recent years. Contacts between protein residues constrain protein folding and characterize different protein structures. Solving the contact prediction problem may therefore be very useful in protein fold recognition and de novo design. It is much easier to obtain the major features of the three-dimensional (3D) structure of a protein if the residue contacts are known for the protein sequence, and methods that reconstruct the protein structure from its contact map have been developed. Similarity based on contact map overlap is the only approach to structural comparison that does not require a pre-calculated set of residue equivalences as one of the goals of the method. A variety of measures of residue contact are used in the literature. Some use the distance between Cα atoms, while others prefer the distance between Cβ atoms. Contact maps are two-dimensional, binary representations of protein structures. For a protein with N residues, the contact map entry for each pair of amino acids k and l (1≤k, l≤N) has the value C(k,l)=1 if a suitably defined distance d(k,l)
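The contact map definition above can be illustrated with a minimal sketch. The abstract text breaks off before stating the condition on d(k,l), so the sketch assumes the common convention that two residues are in contact when their distance falls below a cutoff. The function name `contact_map`, the 8.0 Å default cutoff, and the example coordinates are all hypothetical choices for illustration, not values taken from the dissertation.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Build a binary contact map from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one per residue (for example the
            C-alpha or C-beta atom of each residue).
    cutoff: distance threshold; ASSUMED here, the source does not give one.
    Returns an N x N list of lists with C[k][l] = 1 when d(k, l) < cutoff.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            # Euclidean distance between the representative atoms.
            if math.dist(coords[k], coords[l]) < cutoff:
                cmap[k][l] = 1
    return cmap


# Made-up coordinates: four residues spaced 3.8 units apart on a line.
example = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in contact_map(example):
    print(row)
```

With this convention the map is symmetric and its diagonal is all ones, since every residue is at distance zero from itself. Code indices start at 0, whereas the text numbers residues from 1 to N.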
Keywords/Search Tags: Computational Intelligence, Bioinformatics, Protein Structure Prediction, Prediction of Protein Contact Maps, Deviation Units Recurrence Neural Network, Transiently Chaotic Neural Network, Artificial Immune System, Clonal Selection Algorithm