
Applications Of Classification Algorithm In Bioinformatics

Posted on: 2014-08-29; Degree: Master; Type: Thesis
Country: China; Candidate: Y Y Chen; Full Text: PDF
GTID: 2250330401974766; Subject: Computer software and theory
Abstract/Summary:
With the exponential growth of bioinformatics data, extracting useful information from these massive data sets has become one of the most urgent problems in bioinformatics research. This thesis focuses on gene expression profiles and signal peptides. Through experimental research, we identify more effective classification algorithms and extend the range of problems to which classification algorithms are applied.

Cancer diagnosis based on gene expression is expected to become a fast and effective clinical diagnostic method. However, gene expression profile data are difficult to classify because they are high-dimensional, contain few samples, and are noisy. A more feasible and effective classification method is therefore needed. This thesis proposes an improved classification model for tumor gene expression profiles based on a Bayesian classifier. Using the Bayesian network toolbox in MATLAB, we run experiments on colon cancer gene expression profiles and use 4-fold cross-validation to test identification accuracy. The experimental results show that the method is feasible and effective.

The signal peptide is a short peptide chain that directs the transport of a protein, and it has become a crucial vehicle for finding new drugs and for reprogramming cells for gene therapy. With the avalanche of new protein sequences generated in the post-genomic era, identifying new signal sequences has become an important and challenging task in biomedical engineering. We propose a novel predictor, Signal-BNF, which predicts the N-terminal signal peptide and its cleavage site based on a Bayesian reasoning network. Signal-BNF fuses the outputs of several Bayesian classifiers, each taking a different feature dataset as input, through a weighted voting system. Again using the Bayesian network toolbox in MATLAB, we run experiments on protein sequences from six different species and use 5-fold cross-validation to test identification accuracy; the predictor achieves higher prediction accuracy.
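The abstract does not give the details of the weighted voting scheme or of the cross-validation split, so the following is only a minimal Python sketch of the two general ideas, not the thesis's MATLAB implementation. The function names `weighted_vote` and `k_fold_indices`, the class labels, and the weights are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Fuse one predicted label per classifier into a single label.

    Hypothetical helper: each classifier's vote counts with its weight,
    and the label with the largest total weight wins.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # On a tie, the label that was voted for first is returned.
    return max(scores, key=scores.get)


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one sample. Shuffling or stratifying
    the samples beforehand is assumed to happen elsewhere.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # Three hypothetical classifiers vote on one sequence.
    votes = ["signal", "non-signal", "non-signal"]
    weights = [0.6, 0.25, 0.25]  # illustrative weights, not from the thesis
    print(weighted_vote(votes, weights))

    # 5-fold split of 10 samples: every sample is tested exactly once.
    for train, test in k_fold_indices(10, 5):
        print(test)
```

In a setting like the one described, each fold's held-out samples would be classified by every base classifier, the per-classifier labels fused with `weighted_vote`, and accuracy averaged over the folds. How the weights are chosen is not stated in the abstract and is left open here.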
Keywords/Search Tags: Bayesian classifier, Gene expression profile, Signal peptide, Signal-BNF, MATLAB, K-fold cross validation