
Research Of Protein Functions Prediction Based On Compressive Sensing

Posted on: 2015-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: X N Shi
GTID: 2268330425989084
Subject: Biomedical engineering
Abstract/Summary:
With the rapid development of modern biological science and computer technology, the amount of protein sequence data has grown rapidly in recent years. Predicting protein function from primary sequence is an active topic in bioinformatics research. Compressed sensing (CS) has developed rapidly since it was proposed. It has been applied successfully in fields such as image processing and pattern recognition, and has achieved very good results in classification. Given characteristics of protein data such as sparsity, small sample size and high dimensionality, we used a compressed sensing algorithm to classify and predict protein functions. Classification models based on compressed sensing avoid a complex feature-extraction process. In this thesis, the samples are divided into training sets and test sets. First, we construct a redundant dictionary from the training data and choose a random matrix with Gaussian distribution as the measurement matrix. Second, we reconstruct the signal by l2-norm optimization. Finally, we estimate the category of each test sample from its sparse representation. The main contents are as follows:
1. A prediction model for the subcellular location of apoptosis proteins is established. Using the compressed sensing algorithm, we predict the subcellular location of apoptosis proteins by feeding the structural, physical and chemical characteristics of the protein sequence into the classifier. We chose two commonly used apoptosis protein datasets, ZD98 and ZW225. With the jackknife test, the overall accuracies on the two datasets reached 90.6% and 87.3%, respectively. The results show that the compressed sensing algorithm classifies subcellular location well.
2. A classification model for protein mass spectrometry data is established. The public ovarian cancer dataset Ovarian04-03-02 is used for classification, and the overall accuracy under 5-fold cross-validation reaches 99.38%. The results show that the compressed sensing algorithm performs well and is robust when applied to the classification of protein mass spectrometry data. CS can be used to classify protein mass spectrometry data and has considerable value for clinical application.
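The pipeline described in the abstract (dictionary built from training samples, Gaussian measurement matrix, l2-norm reconstruction, class decision from the representation) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `classify` and `jackknife_accuracy`, the parameters `n_measurements` and `lam`, the column normalisation, and the choice of a ridge-regularised least-squares solver with a per-class residual rule are all assumptions made for illustration.

```python
import numpy as np

def classify(D, labels, y, n_measurements=20, lam=0.1, seed=0):
    """Assign y to the class whose training columns reconstruct it best.

    D: (n_features, n_train) dictionary whose columns are training samples.
    labels: (n_train,) class label of each column. All names are hypothetical.
    """
    rng = np.random.default_rng(seed)
    # Normalise dictionary columns and the test sample to unit l2 norm.
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    y = y / np.linalg.norm(y)
    # Gaussian random measurement matrix (assumed scaling 1/sqrt(m)).
    Phi = rng.standard_normal((n_measurements, D.shape[0])) / np.sqrt(n_measurements)
    A = Phi @ D  # compressed dictionary
    b = Phi @ y  # compressed test sample
    # l2-regularised least squares: argmin ||A x - b||^2 + lam * ||x||^2
    x = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)
    # Keep only the coefficients of one class at a time and compare residuals.
    classes = np.unique(labels)
    residuals = [np.linalg.norm(b - A[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

def jackknife_accuracy(X, labels, **kwargs):
    """Leave-one-out test: each sample is classified against all the others."""
    n = X.shape[1]
    hits = 0
    for i in range(n):
        keep = np.arange(n) != i
        predicted = classify(X[:, keep], labels[keep], X[:, i], **kwargs)
        hits += int(predicted == labels[i])
    return hits / n
```

Here `X` holds one feature vector per column and `labels` is a NumPy array of class labels. The jackknife helper mirrors the leave-one-out evaluation mentioned for ZD98 and ZW225; the feature vectors themselves would have to come from whatever sequence encoding is used, which this sketch does not cover.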
Keywords: Compressive sensing (CS), sparse representation, apoptosis protein, subcellular location, overcomplete dictionary