Statistical Analysis And Prediction Of Protein Contact Order And Folding Rate

Posted on: 2018-08-15    Degree: Master    Type: Thesis
Country: China    Candidate: H Liang    Full Text: PDF
GTID: 2310330536479437    Subject: Statistics
Abstract/Summary:
Proteins are biological macromolecules and important carriers of biological function. They fold spontaneously, quickly, and accurately from unstable denatured states into their unique, stable native states, but the folding mechanism is still unclear. Understanding how a protein folds quickly into its native conformation is an important research topic in molecular biology. One of the main ways to explore the folding mechanism is to predict protein folding rates accurately and then identify the main factors that influence folding. In recent years, many researchers have studied the determinants of protein folding rates and have proposed a variety of prediction methods and models. Assuming roughly the same folding conditions, existing studies show that the factors influencing protein folding rates are protein size, topology, and amino acid composition. Protein size and topology are the main factors; with the experimental folding-rate data accumulated so far, amino acid composition alone is not enough to give an accurate prediction model of protein folding rate.

Protein size can be represented by the chain length (i.e., the number of amino acid residues in the sequence), and topology is mainly represented by contact order. Contact order is defined as the average sequence separation between each pair of residues in contact. A large contact order indicates more non-local contacts and a relatively loose structure, while a small contact order indicates more local contacts and a more compact structure. The study of protein contact order is important for predicting protein folding rates and is also an important part of predicting protein 3D structure. The statistical analysis of protein contact order is therefore significant for studying protein folding.

Statistical results on a data set of 752 proteins show that the absolute contact order is related to protein size and protein shape. The correlation coefficient between absolute contact order and chain length is 0.76, and the correlation coefficient between absolute contact order and the equivalent radius of gyration is -0.71. Based on these analyses, we introduce the ratio of chain length to radius of gyration as a new parameter related to absolute contact order. The correlation coefficient between this ratio and absolute contact order is 0.83, which shows that the parameter is a determinant of protein absolute contact order. In addition, there is a good correlation between the cumulative backbone torsion angle and absolute contact order.

Next, we statistically analyze the factors influencing protein folding rate. By predicting backbone torsion angles from the amino acid sequence, the cumulative backbone torsion angle can be calculated, and on this basis a sequence-based prediction model of protein folding rate can be established. On the current data set, which includes experimental folding rates for 100 proteins, the correlation coefficient between the predicted cumulative backbone torsion angle and folding rate reaches 0.79. This result is better than that of existing models, such as the chain length model, the effective chain length model, the contact order model, and the number of long-range contacts model. We also calculate the correlation of folding rate with the ratio of chain length to radius of gyration and with the ratio of cumulative backbone torsion angle to radius of gyration. Compared with chain length alone, the ratio of chain length to radius of gyration correlates better with folding rate, and the improvement is especially significant for two-state proteins. The ratio of cumulative backbone torsion angle to radius of gyration largely improves the correlation with the folding rates of multistate proteins.

Finally, we combine protein size (represented by the cumulative backbone torsion angle), shape, and structural topology, and use support vector regression to build a prediction model of protein folding rate. In a jackknife test, the correlation coefficient between predicted and experimental values is 0.797 and the mean absolute error is 1.89, the best prediction result on the current data set. The results show that, when environmental factors are ignored, protein folding rate is determined by many factors, of which protein size, shape, and topology, intertwined with one another, may play the main role.
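The structural quantities named in the abstract (absolute contact order, radius of gyration, and the ratio of chain length to radius of gyration) can be illustrated with a minimal Python sketch. The abstract does not give the contact definition used in the thesis, so the 8 Å Cα-Cα cutoff, the `min_sep` parameter, the plain (not "equivalent") radius of gyration, and all function names below are assumptions made only for illustration.

```python
import math

def absolute_contact_order(coords, cutoff=8.0, min_sep=1):
    """Average sequence separation |i - j| over all residue pairs in contact.

    coords  : list of (x, y, z) positions, one per residue (assumed C-alpha).
    cutoff  : distance threshold in angstroms (assumed value, not from the thesis).
    min_sep : smallest sequence separation counted as a contact (assumed).
    """
    total_sep = 0
    n_contacts = 0
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_sep += j - i
                n_contacts += 1
    # Return 0.0 when no pair is in contact, to avoid dividing by zero.
    return total_sep / n_contacts if n_contacts else 0.0

def radius_of_gyration(coords):
    """Root-mean-square distance of the residues from their centroid."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(c, center) ** 2 for c in coords) / n)

def length_to_rg_ratio(coords):
    """Chain length (number of residues) divided by the radius of gyration."""
    return len(coords) / radius_of_gyration(coords)

# Toy example: four residues on a straight line, 3.8 angstroms apart.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
aco = absolute_contact_order(toy_chain)
ratio = length_to_rg_ratio(toy_chain)
```

For real proteins the coordinates would come from a structure file, and the cutoff and minimum separation would need to match whatever definition the thesis actually uses.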
Keywords/Search Tags:Protein, Folding rates, Prediction, Cumulative backbone torsion angles, Absolute contact order