Protein Secondary Structure Prediction Based On SVM

Posted on: 2016-11-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y M Wu
GTID: 2180330467973338  Subject: Computational Mathematics
Abstract/Summary:
With the explosion of biological data, the number of biological sequences in databases has grown exponentially. Predicting protein structure and function from the amino acid sequence has therefore become an important research problem. Determining protein structure directly by experiment is relatively inefficient, and protein secondary structure prediction offers an alternative. This thesis studies protein secondary structure prediction with a focus on the coding scheme and the SVM kernel function. The specific contributions are as follows:

(1) A novel method for predicting protein secondary structure is proposed based on structural characteristics. First, the main factors are extracted by principal component analysis from the physical and chemical properties of the amino acids and combined into a three-dimensional coding. Second, three propensity factors, which describe the tendency of each amino acid toward a specific secondary structure, are added to this three-dimensional coding. Finally, once the coding is complete, the protein secondary structure is predicted with a support vector machine. The results show that this approach performs better than a three- or five-dimensional coding obtained from principal component analysis alone, and that it can be applied effectively to protein secondary structure prediction.

(2) For the SVM algorithm used in protein structure prediction, a new kernel function is constructed to improve the robustness and generalization ability of the model. It is derived from the Laguerre orthogonal polynomials and is called the triangular Laguerre kernel function. The triangular Laguerre kernel function is then compared with other kernel functions, such as the RBF kernel function and the Gaussian-based Laguerre kernel function. The new kernel function is found to be more effective.

In summary, this thesis proposes an improved coding scheme and an improved kernel function for predicting protein secondary structure. The results show that the method is reasonable.
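The coding step in (1) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption: the propensity factor is taken to be the Chou-Fasman-style ratio P(a, s) = (n_as / n_a) / (n_s / N), the three states are labelled H, E and C, the sliding window is a common convention rather than something the abstract states, and the values in `pca_code` are hypothetical placeholders standing in for the three principal components of the physicochemical properties. The function names `propensities` and `encode` are illustrative only.

```python
from collections import Counter

STATES = "HEC"  # assumed three-state labels: helix, strand, coil


def propensities(sequences, structures):
    """Assumed Chou-Fasman-style propensity: (n_as / n_a) / (n_s / N)."""
    pair, aa, ss = Counter(), Counter(), Counter()
    for seq, sst in zip(sequences, structures):
        for a, s in zip(seq, sst):
            pair[a, s] += 1   # residue a observed in state s
            aa[a] += 1        # residue a observed at all
            ss[s] += 1        # state s observed at all
    total = sum(aa.values())
    return {
        a: [(pair[a, s] / aa[a]) / (ss[s] / total) if ss[s] else 0.0
            for s in STATES]
        for a in aa
    }


def encode(seq, pca_code, prop, window=3):
    """Per-residue code = 3 PCA values + 3 propensities, concatenated over
    a sliding window (assumed); positions beyond the ends are zero-padded."""
    half = window // 2
    per_residue = [pca_code.get(a, [0.0] * 3) + prop.get(a, [0.0] * 3)
                   for a in seq]
    pad = [0.0] * 6
    rows = []
    for i in range(len(seq)):
        row = []
        for j in range(i - half, i + half + 1):
            row.extend(per_residue[j] if 0 <= j < len(seq) else pad)
        rows.append(row)
    return rows


# Toy data for illustration only, not taken from the thesis.
train_seqs = ["AAG", "GA"]
train_sst = ["HHC", "CE"]
prop = propensities(train_seqs, train_sst)

# Hypothetical placeholder values; real ones would come from PCA.
pca_code = {"A": [0.1, 0.2, 0.3], "G": [0.4, 0.5, 0.6]}

rows = encode("AG", pca_code, prop, window=3)
```

Each row in `rows` is the feature vector for one residue. In a full pipeline these vectors, paired with the known state of the central residue, would be passed to an SVM classifier such as scikit-learn's `SVC`.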
Keywords/Search Tags: Protein secondary structure, Coding scheme, Principal component analysis, Tendency factor, Support vector machine, Kernel function