| The study of protein folding is an important topic in molecular biology, cell biology, and drug design, and fold classification is the basis of protein-folding research. As the number of protein structures in the PDB database grows, fold classification is becoming increasingly important. Alpha/beta proteins are common in nature, and their families and superfamilies are the most complex of all. Based on the LIFCA database, this paper presents a fold-type classification method for alpha/beta proteins. The study covers the following aspects: 1. Establishment of the template database. The experimental data set was derived from the LIFCA database. We selected the 55 alpha/beta fold types that each had more than two samples, which gave 931 experimental samples. Guided by the definition of each fold type and its topological characteristics, we chose one template for each fold type and extracted the corresponding characteristic parameters of these templates from the DSSP database, thereby establishing a template database. 2. Establishment and evaluation of the multiple-template classification method. Based on TM-align, we built a classification method, Multi-Fscore, and evaluated it on the 931 proteins. The average specificity, average sensitivity, and Matthews correlation coefficient (MCC) are 99.58%, 79.47%, and 79.39%, respectively. Compared with TM-align, the sensitivity and MCC of our method are slightly better, and the average specificity is similar. 
These results show that our classification method can classify alpha/beta proteins automatically. We then took all of the all-alpha and all-beta fold types in the LIFCA database as research objects and applied Multi-Fscore to these 1380 proteins: 97.55%, 99.42%, and 99.89% of them have values in (0, 0.6], (0, 0.7], and (0, 0.8], respectively. For the 931 alpha/beta proteins, by contrast, the corresponding proportions are only 2.79%, 5.59%, and 10.63%. These results indicate that Multi-Fscore is accurate and specific. 3. Study of the single-template classification method. Based on the SCOP 1.75c classification, 79 proteins of the Rossmann fold type with sequence similarity below 25% were selected as the experimental set. We then used an ROC curve, built from these 79 proteins and the 2362 proteins in the LIFCA database that do not belong to the Rossmann fold type, to determine the optimum threshold. Using this threshold as the standard, we tested the method on the experimental set established in Chapter 2. Compared with TM-score, the sensitivity is much better and the specificity is slightly worse. In conclusion, the alpha/beta protein fold-type classification method can not only guide research on other fold types but also lay a foundation for the automatic classification of protein fold types. |
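
The evaluation quantities named in the abstract (sensitivity, specificity, MCC, and a threshold chosen from an ROC curve) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the optimum threshold was read off the ROC curve, so the sketch assumes Youden's J statistic (sensitivity + specificity - 1) as the selection criterion, and it assumes that a higher score means a protein is more likely to belong to the fold type. All function names and the toy scores are hypothetical.

```python
import math


def confusion(labels, predictions):
    """Count (TP, TN, FP, FN) for boolean labels and boolean predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true members that are recognised: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of non-members that are rejected: TN / (TN + FP)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def youden_threshold(scores, labels):
    """Scan every observed score as a cut-off and keep the one that
    maximises Youden's J (an assumed criterion, not taken from the source).
    A protein is predicted positive when its score >= threshold."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        preds = [s >= t for s in scores]
        tp, tn, fp, fn = confusion(labels, preds)
        j = sensitivity(tp, fn) + specificity(tn, fp) - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy data (made up for illustration only): similarity scores against one
# template, and whether each protein truly belongs to that fold type.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [True, True, True, False, False, False]

threshold, j_value = youden_threshold(toy_scores, toy_labels)
toy_preds = [s >= threshold for s in toy_scores]
tp, tn, fp, fn = confusion(toy_labels, toy_preds)
print(threshold, sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn))
```

For a multi-class setting such as the 55 fold types, the same counts could be computed per fold type (one-versus-rest) and then averaged, which is one plausible reading of "average specificity" and "average sensitivity"; the abstract does not state the averaging scheme.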