Protein Binding Free Energy Prediction Based On Sequence And Structure Features

Posted on: 2018-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: B L Lu
GTID: 2310330542483636
Subject: Computer software and theory
Abstract/Summary:
Proteins are the basic material of life: they are fundamental organic components of the cell and the main carriers of life activities. Proteins bind not only to other compounds but also to one another, and protein-protein interactions play an important role in life activities. If the binding free energy between a pair of proteins is very large, the pair can be combined successfully at the drug design stage and can have a good effect on the organism. Otherwise, it is almost meaningless to combine them as drugs.

Regression analysis is a widely used data analysis method that exploits the regularities in observed data to model the dependencies between variables, and it has many applications in quantitative prediction. Building an accurate regression model is therefore a feasible way to predict the binding free energy of proteins, and selecting valuable feature sets and regression models is the focus of this thesis. Predicting the binding free energy from protein features can advance the study of protein-protein interactions, promote research on protein docking, and accelerate the design and development of drugs that target protein-protein interactions, which plays an important role in the efficient treatment of diseases.

Many methods exist for calculating protein binding free energy, but they require a lot of time and resources and do not achieve high accuracy, which makes them difficult to apply directly in practice. The aim of this thesis is to design an accurate and fast computational model for predicting protein binding free energy. The main research work is as follows:

(1) Sequence and structure features related to protein binding free energy were collected and calculated, with 135 protein complexes as the training set and 39 pairs of protein complexes as the external set.
(2) The minimum redundancy maximum relevance (mRMR) method was used to select the features related to protein binding free energy and to remove redundant features, yielding a minimum-redundancy, maximum-relevance feature set. This feature set was then used to build six regression models.
(3) Among the six regression models, the best one was identified by 10-fold cross-validation. The best feature set was then obtained by model-based feature optimization, and the importance of each feature in the optimized set was analyzed by removing features.
(4) The optimal feature set was used to build the best regression model for predicting protein binding free energy, and its prediction performance was compared with that of other methods with respect to conformational change and on the external validation set.

In the experiments, the Linear Regression and SMOreg regression models were combined to predict protein binding free energy. The best regression model obtained after optimization performs better than the other methods and is also suitable for proteins that undergo large conformational changes.
Keywords/Search Tags: Interaction, Free energy, Feature selection, Regression model
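
The feature selection and cross-validation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes a greedy mRMR variant that uses absolute Pearson correlation for both relevance and redundancy (mRMR is often defined with mutual information instead), and it stands in a single-feature least-squares line for the six regression models, which are not reproduced here. The function names, the feature names, and the synthetic data are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def mrmr_select(features: Dict[str, List[float]],
                target: Sequence[float], k: int) -> List[str]:
    """Greedy mRMR (correlation-based variant, an assumption of this sketch).

    At each step, pick the feature maximizing
    |corr(feature, target)| - mean(|corr(feature, already selected)|).
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected: List[str] = []
    remaining = list(features)

    def score(name: str) -> float:
        if not selected:
            return relevance[name]
        redundancy = sum(abs(pearson(features[name], features[s]))
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def k_fold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Split indices 0..n-1 into k interleaved folds (empty folds dropped)."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [fold for fold in folds if fold]


def fit_line(x: Sequence[float], y: Sequence[float]):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / vx if vx else 0.0
    return slope, my - slope * mx


def cv_rmse(x: Sequence[float], y: Sequence[float], k: int = 10) -> float:
    """Root-mean-square error of the one-feature line under k-fold CV."""
    squared_error = 0.0
    for test_idx in k_fold_indices(len(x), k):
        held_out = set(test_idx)
        train = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_error += sum((y[i] - (slope * x[i] + intercept)) ** 2
                             for i in test_idx)
    return math.sqrt(squared_error / len(x))


# Synthetic demonstration data (hypothetical, not from the thesis).
base = [i - 14.5 for i in range(30)]
features = {
    "f_relevant": base,
    "f_redundant": [2 * v + 1 for v in base],   # affine copy of f_relevant
    "f_complementary": [v * v for v in base],   # uncorrelated with f_relevant
}
target = [3 * a + 0.1 * c for a, c in zip(base, features["f_complementary"])]

print(mrmr_select(features, target, k=2))
print(cv_rmse(features["f_relevant"], target, k=10))
```

In this toy data the redundant copy is penalized once its twin has been chosen, so the second pick is the complementary feature. Comparing `cv_rmse` across candidate models or feature subsets mirrors, in miniature, the model selection by 10-fold cross-validation that the abstract describes.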