Prediction And Mechanism Study On Disease Association Of Single Amino Acid Polymorphisms

Posted on: 2011-09-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Y Li
GTID: 1114330332467074
Subject: Chemical informatics
Abstract/Summary:
Single amino acid polymorphisms (SAPs) are ubiquitous in eukaryotic genomes and are closely associated with human genetic diseases; they therefore play a major role in pharmacogenomics. The identification of SAPs in specific genes relevant to drug efficacy, toxicity and metabolism will help to establish optimal therapeutic strategies for individual patients. The study of SAPs is therefore considered critical to understanding the causes of disease at the molecular level, and it has become one of the most active areas of genome-wide research.

This dissertation studies disease-associated SAPs with a variety of mathematical and bioinformatics approaches in order to uncover the molecular causes of human genetic disease. First, new sets of sequence features were explored and a concise, accurate and reliable model for identifying deleterious SAP sites was built. The model was applied to real samples and used to predict the functional effect of new SAPs; because such prediction saves cost and time, the results provide theoretical support and candidate targets for later experimental validation. Next, from the perspective of post-translational modification (PTM), a statistical analysis was performed of the disease association of PTM sites disrupted by SAPs. This analysis covered all major PTM types, giving a comprehensive view of how disruption of PTM sites relates to disease mechanisms. Finally, we focused on one specific type of PTM: palmitoylation. The relation between disrupted palmitoylation sites and human disease was studied in detail, yielding further insight into the molecular mechanisms of human genetic disease.

In Chapter 1, a brief introduction to the background, data resources and prediction methods for the identification of deleterious SAPs was provided.
Then, the study procedure and the mathematical methods used in this dissertation were presented.

In Chapter 2, a concise and promising method for identifying deleterious amino acid polymorphisms, called SeqSubPred, was developed. The method is based on 44 features extracted solely from the protein sequence and achieved surprisingly good predictive ability without resorting to homology or evolutionary information, which similar methods frequently rely on and which is usually more complex and time-consuming to use. Next, 2,127 unclassified single amino acid substitutions in the SwissProt database were classified by our method as disease-associated or not, providing additional annotation to support later experimental validation. In addition, a web server for this method, also called SubSeqPred, was developed; it requires only the protein sequence and the substitution sites as input.

In Chapter 3, the relation between human genetic disease and the disruption of all main types of PTM by SAPs was estimated. Experimentally verified PTM sites were searched against amino acid substitution databases to investigate whether, and in which ways, PTMs are altered by inherited and somatic disease SAPs. We found that about 4.5% of deleterious amino acid substitutions (3.9% of unique sites) may affect protein function through disruption of PTMs. By comparison, about 2% of neutral polymorphisms may affect PTMs. These numbers indicate that disruption of PTMs is not the major cause of human genetic disease. Nevertheless, we found 238 post-translationally modified sites in human proteins whose mutation was causative of disease.
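The Chapter 3 cross-referencing of PTM sites against substitution databases can be pictured as a set intersection over site coordinates. The following is a minimal sketch only: the `(accession, position)` representation, the function name `ptm_disruption_fraction` and the toy data are hypothetical and are not taken from the dissertation or from any real database.

```python
from typing import Set, Tuple

# Hypothetical site key: (protein accession, 1-based residue position).
Site = Tuple[str, int]


def ptm_disruption_fraction(substitutions: Set[Site], ptm_sites: Set[Site]) -> float:
    """Fraction of unique substitution sites that coincide with a known PTM site."""
    if not substitutions:
        return 0.0
    hits = substitutions & ptm_sites  # substitutions falling exactly on a PTM site
    return len(hits) / len(substitutions)


# Invented toy data, for illustration only.
ptm_sites = {("P1", 15), ("P1", 102), ("P2", 7)}
disease_saps = {("P1", 15), ("P1", 40), ("P2", 7), ("P3", 88)}
neutral_saps = {("P1", 102), ("P2", 50), ("P2", 51), ("P3", 9), ("P3", 10)}

disease_fraction = ptm_disruption_fraction(disease_saps, ptm_sites)
neutral_fraction = ptm_disruption_fraction(neutral_saps, ptm_sites)
print(f"disease: {disease_fraction:.2f}, neutral: {neutral_fraction:.2f}")
```

Because the inputs are sets, the result corresponds to a per-unique-site fraction; counting every substitution separately would require a list or a counter instead.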
In total, 1,289 modification sites were found in close proximity to inherited disease mutations and represent candidates for further experimental verification.

In Chapter 4, building on the work above, we carried out an in-depth study of the relation between disruption of palmitoylation sites by SAPs and disease. First, protein sequence features and random forest modeling were used to build a simple and effective identification model for palmitoylation sites. Then, all human single amino acid substitution sites were screened with this model. A number of disease-related single amino acid substitutions were predicted to fall on palmitoylation sites. A literature search confirmed that five of these sites are related to pathogenicity, which on the one hand demonstrates the practicality of the model and on the other hand offers insight into the pathogenic mechanism of these disease-related substitution sites.

In Chapter 5 and Chapter 6, two other bioinformatics studies in the related area of drug discovery were briefly introduced as the mathematical modeling basis of the SAP study: the identification of T-cell epitopes and the quantitative prediction of protein-drug (ligand) binding affinity.
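The Chapter 4 model is described as starting from protein sequence features of candidate palmitoylation sites. The abstract does not list those features, so the sketch below is an assumption throughout: it restricts candidates to cysteine residues (the residue modified in S-palmitoylation), uses a fixed flanking window, and computes plain amino acid composition. The names `cysteine_windows` and `composition_features` and the toy sequence are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def cysteine_windows(sequence: str, flank: int = 3) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for each cysteine, padding ends with '-'."""
    padded = "-" * flank + sequence + "-" * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "C":
            # In the padded string the cysteine sits at index i + flank,
            # so the slice below is centred on it.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def composition_features(window: str) -> List[float]:
    """Relative frequency of each standard amino acid in the window (padding ignored)."""
    counts = Counter(window)
    return [counts[aa] / len(window) for aa in AMINO_ACIDS]


# Invented toy sequence, for illustration only.
toy_sequence = "MACDEFCK"
candidates = cysteine_windows(toy_sequence, flank=2)
feature_rows = [composition_features(window) for _, window in candidates]
```

Feature rows of this kind could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`; that training step is omitted here because it needs labelled sites and a third-party library.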
Keywords/Search Tags: Single Amino Acid Polymorphisms, Genetic Disease Association, Post-translational Modification, Palmitoylation, Mathematical Modeling, Online Prediction Server