Support Vector Machine Approach For Protein Mesophilic & Thermophilic Recognition And Protein Subcellular Localization Prediction

Posted on: 2008-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Qian
GTID: 2120360242974201
Subject: Biophysics
Abstract/Summary:
In the post-genome era, protein pattern recognition is becoming an important research area in the life sciences. Recognizing mesophilic and thermophilic proteins and predicting protein subcellular location remain open challenges in this area. Based on the relationship between protein structure and function, we developed a method that considers both the composition and the correlation of the amino acid sequence. The support vector machine (SVM) was then used for prediction and gave good results. Progress in this research would help to better understand the folding mechanism and the function of proteins. It would also support related industries such as biomedical engineering and agricultural biotechnology. The main results are as follows.

The first chapter is the introduction. It presents the research background and current developments concerning mesophilic and thermophilic protein properties and protein subcellular location. It then introduces in detail a relatively new machine learning method, the support vector machine.

The second chapter covers the recognition of mesophilic and thermophilic proteins. In this experiment, we propose a novel feature extraction approach based on the composition and correlation of amino acids. Using this approach, 76 pairs of mesophilic and thermophilic proteins were used to train the model, and 20 further pairs were used as an independent test set. The prediction precision was 85% and 80%, respectively. This is a considerable improvement over the highest precision of the PCA, PLS and PC-ANN methods reported by Zhang et al.

The third chapter covers protein subcellular localization prediction. In this experiment, 996 cytoplasmic, extracellular and periplasmic sequences from prokaryotes were used to train the model with the above approach, and the model was then evaluated by jackknife testing and by 10-fold cross-validation. The prediction precision was 93.57% and 93.47%, respectively. Compared with the highest precision previously reported, to our knowledge, the prediction precision improved to some extent.
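The abstract does not give the exact definition of the composition and correlation features, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes the composition part is the frequency of each of the 20 amino acids, and the correlation part is the mean squared difference of a physicochemical property between residues a fixed number of positions apart. The Kyte-Doolittle hydropathy scale, the function names and the `max_lag` parameter are illustrative assumptions that do not come from the thesis.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed property; the thesis may use others).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def correlation(seq, max_lag):
    """Mean squared hydropathy difference between residues `lag` apart,
    for lag = 1 .. max_lag (one value per lag)."""
    values = [HYDROPATHY[aa] for aa in seq]
    factors = []
    for lag in range(1, max_lag + 1):
        n_pairs = len(values) - lag
        total = sum((values[i] - values[i + lag]) ** 2 for i in range(n_pairs))
        factors.append(total / n_pairs)
    return factors


def sequence_features(seq, max_lag=3):
    """Concatenate composition (20 values) and correlation (max_lag values)."""
    seq = seq.upper()
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    if len(seq) <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    return composition(seq) + correlation(seq, max_lag)


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the shape of the output.
    vector = sequence_features("MKTAYIAKQRQISFVKSHFSRQ", max_lag=3)
    print(len(vector))
```

Each sequence becomes a fixed-length numeric vector regardless of its length, which is what a classifier such as an SVM needs as input. Evaluating such a classifier by jackknife testing means holding out one sequence at a time, while 10-fold cross-validation holds out one tenth of the data at a time.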
Keywords/Search Tags:Multi-scale component and correlation, Support vector machine, Pattern recognition, Protein thermostability, Subcellular localization