Bioinformatics Prediction Of The Likelihoods For Protein Crystallization And Solution Structural Determination

Posted on: 2018-09-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H L Wang
GTID: 1360330542968182
Subject: Chemical Biology
Abstract/Summary:
Knowledge of protein 3D structure has significantly accelerated many research areas in the biological sciences, such as protein function, drug screening and design, human health and disease, and rational protein design. Structural genomics (SG) aims to systematically solve representative structures of the major protein-fold classes using high-throughput X-ray crystallography and NMR spectroscopy. However, only a small proportion (about 10%) of the selected proteins can be structurally solved. Therefore, to facilitate and accelerate research in SG and structural biology, in this thesis we develop several bioinformatics tools to analyze and predict the likelihood of protein structural determination, providing a guide for scientists to select feasible proteins for structural determination. Additionally, we implement a user-friendly Webserver, free for academic use, for each tool.

In Chapter 1, we review the research progress of SG projects and briefly introduce structural bioinformatics, including feature extraction, feature selection, and machine learning (support vector machines).

In Chapter 2, we design a novel bioinformatics tool, Crysalis, for the analysis, prediction, and design of protein crystallizability. Compared with other tools, Crysalis, as an integrated crystallization analysis tool, provides several advantages and new functionalities: (1) Crysalis offers both high prediction performance and high computational efficiency, so it can be used to rapidly select crystallizable proteins at the proteome level; (2) Crysalis, as the first computational tool of its kind, can identify sites that are non-optimal for protein crystallization and design single-point mutations to enhance protein crystallizability; (3) Crysalis also annotates the target protein based on predicted structural properties, including functional domains, conserved residues, predicted secondary structure, residue solvent accessibility, and disorder.

In Chapter 3, we first review nine bioinformatics tools for protein crystallization prediction that provide source code or a Webserver free for academic use. We integrate selected outputs from multiple predictors as input features to build a meta-predictor, CrysComb, which achieves significantly higher predictive performance than the individual predictors. Furthermore, we develop a new and accurate bioinformatics tool for protein crystallization prediction, Crysf. Crysf uses functional features extracted from UniProt as inputs to build prediction models. Compared with other tools, Crysf provides the best prediction performance and requires less computing time per protein sequence.

In Chapter 4, we develop a first-of-its-kind prediction tool, pNMRStr, to evaluate the likelihood of obtaining a solution NMR structure for a given protein. We introduce a novel protein feature encoding method, called DDHoSPP, which describes the distribution, difference, and heterogeneity of sliding sequence segment-based physicochemical properties. The DDHoSPP-based feature set makes the most significant contribution to predicting the likelihood of NMR structural determination and dramatically improves prediction performance. In addition, we find that C-/N-terminal fusion His-tags can influence the prediction results. Accordingly, we develop five versions of the pNMRStr predictor to give accurate predictions for mixed, His-tag-free, and His-tagged proteins, which are further validated on two real-life experimental cases.

In Chapter 5, we summarize our work and discuss directions for future work.
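The idea of sliding sequence segment-based physicochemical features can be illustrated with a minimal sketch. This is not the DDHoSPP encoding itself, whose exact definition is not given in the abstract. The sketch assumes a single property (the Kyte-Doolittle hydropathy scale), a fixed window width of 5, and four summary statistics chosen only for illustration; the function names `window_means` and `segment_features` are hypothetical.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy scale (one example physicochemical property;
# the property set actually used by DDHoSPP is not stated in the abstract).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(seq, width=5):
    """Average the property over every sliding segment of length `width`."""
    if len(seq) < width:
        raise ValueError("sequence is shorter than the window width")
    try:
        values = [KD[residue] for residue in seq.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    return [mean(values[i:i + width]) for i in range(len(values) - width + 1)]


def segment_features(seq, width=5):
    """Summarize how the segment averages are distributed along the sequence.

    The statistics below are illustrative assumptions, not the thesis's features.
    """
    means = window_means(seq, width)
    # Absolute change between neighbouring segments (a simple "difference" measure).
    steps = [abs(b - a) for a, b in zip(means, means[1:])]
    return {
        "mean": mean(means),                # overall level of the property
        "spread": pstdev(means),            # heterogeneity across segments
        "range": max(means) - min(means),   # gap between extreme segments
        "mean_step": mean(steps) if steps else 0.0,
    }


if __name__ == "__main__":
    print(segment_features("MKTAYIAKQRQISFVKSHFSRQ"))
```

A fixed-length vector of this kind, computed for each property and window width of interest, could then serve as input to a classifier such as the support vector machine mentioned in Chapter 1.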
Keywords/Search Tags:Protein structure determination, protein crystallography, NMR structure determination, bioinformatics prediction