Studies On Spatial Distribution Characteristics Of Protein Structure

Posted on: 2009-10-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Hu
GTID: 1100360302458537
Subject: Computer Science and Technology
Abstract/Summary:
Proteins are essential elements of life. Scientists in the life sciences and related interdisciplinary fields aim to understand the relationship between protein structure and function, and to guide protein design for the benefit of human health. Each of us has tens of thousands of different kinds of proteins, each with a unique three-dimensional structure corresponding to a specific function. The recent flood of structural data poses a great challenge for computer science and bioinformatics: turning data into knowledge. The representation, understanding and analysis of protein structure are new topics in visual computing and data mining.

This dissertation studies the spatial distribution characteristics of protein structure, focusing on three aspects. First, we study rapid and efficient retrieval approaches that meet the requirements of similarity search over large-scale structural data. Second, we explore novel ways of comparing similarity at the higher levels of protein structure, based on automatic detection of rotational symmetries in quaternary structure. Finally, we address the automatic identification of cage-shaped proteins, which has potential applications in biomedicine and nanotechnology.

The main contributions of this dissertation are as follows.

We present a novel multiple criteria framework (MCF) to reduce the computation cost. Three kinds of features, all invariant under translation and rotation, are applied successively as criteria during retrieval under MCF: the spatial walk of the protein's backbone, the distance histogram, and the radial distribution of the distance matrix. While retrieval based on each of these features involves only simple calculation, the intersection of their retrieval results reduces the candidate set dramatically and rapidly.
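As an illustration of the successive-criteria idea, the following minimal Python sketch computes one rotation- and translation-invariant feature (a pairwise distance histogram) and uses it in a cascade that keeps only candidates passing every criterion. It is not the dissertation's implementation: the function names, bin width, bin count, L1 comparison and thresholds are all assumptions chosen for the example.

```python
import math
from itertools import combinations


def distance_histogram(coords, bin_width=2.0, n_bins=20):
    """Normalised histogram of pairwise distances (hypothetical parameters).

    Pairwise distances do not change when the structure is rotated or
    translated, so the histogram is invariant under both.
    """
    counts = [0] * n_bins
    for p, q in combinations(coords, 2):
        d = math.dist(p, q)
        idx = min(int(d / bin_width), n_bins - 1)  # long distances go to the last bin
        counts[idx] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def l1_distance(h1, h2):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def cascade_filter(query, database, features, thresholds):
    """Apply cheap feature filters one after another.

    Each pass keeps only the entries whose feature is close to the query's,
    so the result is the intersection of the per-feature retrieval results.
    """
    candidates = set(database)
    for feature, threshold in zip(features, thresholds):
        q = feature(query)
        candidates = {
            name for name in candidates
            if l1_distance(q, feature(database[name])) <= threshold
        }
    return candidates
```

A caller would pass a list of feature functions ordered from cheapest to most expensive, so that later, costlier criteria only run on the entries that survived the earlier ones.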
Experiments using query-by-example on a representative database of 27804 samples demonstrate that our techniques effectively cut the pruning time of traditional methods while retaining sensitivity. The approach is highly complementary to rapid protein structure similarity retrieval.

We suggest a novel similarity comparison method for structural data of protein complexes, which is significant for the annotation of protein structure and function. Patterns of rotational symmetry at the high level of protein quaternary structure are exploited in detail, and automatic detection methods for different categories of rotational symmetry are proposed. By integrating the geometrical features of the tertiary backbone structure with the symmetrical topology of the assembly of protein chains, we obtain clues to functional similarities among protein complexes.

We propose an efficient algorithm for automatically screening cage-shaped proteins from large quantities of structural data. Cage-shaped proteins have different kinds of topological characteristics. To solve the challenging problem of distinguishing cage-shaped structures with open holes and tunnels from numerous protein structures, we combine PCA with digital topology techniques and implement a program called CSPro for identifying cage-shaped proteins based on quaternary structure. CSPro reveals the functional shape of a cage-shaped protein more clearly and quickly than traditional visualization tools. Using CSPro, we have searched the full PDB and retrieved three types of proteins with notably large central cavities. CSPro can also be used to validate whether the quaternary structure of a protein is cage-shaped in molecular simulation.
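To make the cavity idea concrete, here is a small Python sketch of one basic digital-topology test: voxelise the atom positions, flood-fill the empty space from outside, and count the empty voxels that were never reached. This is not CSPro and omits its PCA step; the function name, voxel size and padding are assumptions for illustration. Note that this simple test only finds fully enclosed cavities, so a cage with an open hole or tunnel scores zero, which is exactly the harder case the abstract says CSPro is designed to handle.

```python
from collections import deque


def enclosed_cavity_voxels(atoms, voxel=1.0, pad=1):
    """Count empty voxels that cannot be reached from outside the structure.

    atoms: iterable of (x, y, z) coordinates; voxel and pad are hypothetical
    parameters (grid cell size and empty margin around the bounding box).
    """
    # Map each atom to the integer grid cell that contains it.
    occupied = {tuple(int(c // voxel) for c in atom) for atom in atoms}
    lo = [min(v[i] for v in occupied) - pad for i in range(3)]
    hi = [max(v[i] for v in occupied) + pad for i in range(3)]

    def inside(v):
        return all(lo[i] <= v[i] <= hi[i] for i in range(3))

    # Flood-fill the exterior through 6-connected empty voxels,
    # starting from a corner of the padded box (always empty).
    start = tuple(lo)
    seen = {start}
    queue = deque([start])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            n = (x + dx, y + dy, z + dz)
            if inside(n) and n not in occupied and n not in seen:
                seen.add(n)
                queue.append(n)

    total = 1
    for i in range(3):
        total *= hi[i] - lo[i] + 1
    # Whatever is neither occupied nor exterior is an enclosed cavity.
    return total - len(occupied) - len(seen)
```

For real structures the voxel size would need to be chosen relative to atomic radii, since a grid that is too fine lets the flood fill leak between neighbouring atoms.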
Keywords/Search Tags: Protein structure, Three dimensional structure, Spatial distribution characteristics, Quaternary structure, Feature detection, Similarity, Retrieval, Rotational symmetry, Cage-shaped protein