
Support Vector Machine And Its Application In Gene Expression Data

Posted on: 2005-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: X T Yang
Full Text: PDF
GTID: 2168360152969229
Subject: Computer application technology
Abstract/Summary:
Statistical learning theory is a theory of machine learning from small samples. It balances the requirement for generalization ability against the search for the best solution obtainable under limited conditions. The support vector machine, a machine learning method built on statistical learning theory, has advantages in pattern recognition problems characterized by small samples, high dimensionality, and nonlinearity.

Microarray technology, which allows the expression levels of thousands of genes to be observed simultaneously, promises a revolution in biology and medicine: it lets the nature of life be studied at the genome level with a systematic, global approach. Gene expression data sets are likewise characterized by small samples, high dimensionality, and nonlinearity. These traits challenge some traditional machine learning methods, and methods for analyzing such data are currently a hot topic in bioinformatics.

This thesis studies the theory and methods of support vector machines on typical gene expression data sets. On the theoretical side, several key issues in statistical learning theory and support vector machines are discussed. A training algorithm for support vector machines, sequential minimal optimization, is presented, and an improved version is proposed for the case where the classes contain unequal numbers of samples; experiments on simulated data show the method to be feasible. A software package, GeneSVM, based on support vector machines for gene expression data, is implemented, and a classifier model is proposed: first, a gene subset is extracted using the signal-to-noise ratio method; then the data set is normalized with the min-max method; finally, a classifier is built with an SVM. The classifier model based on the above theory is successfully applied to several gene expression data sets.
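The first two steps of the classifier model described in the abstract (signal-to-noise gene selection, then min-max normalization) can be illustrated with a minimal sketch. This is not the GeneSVM code. The abstract does not give the exact formulas, so the sketch assumes the commonly used signal-to-noise score (mean difference between the two classes divided by the sum of their standard deviations) and scaling of each selected gene to the range [0, 1]. All function names and the toy data are hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(expr, labels):
    """Score each gene as (mean_pos - mean_neg) / (sd_pos + sd_neg).

    expr   : list of samples, each a list of expression values (one per gene)
    labels : class label per sample, +1 or -1
    Assumed formula; the abstract only names the method.
    """
    scores = []
    for g in range(len(expr[0])):
        pos = [row[g] for row, y in zip(expr, labels) if y == 1]
        neg = [row[g] for row, y in zip(expr, labels) if y == -1]
        denom = stdev(pos) + stdev(neg)
        scores.append((mean(pos) - mean(neg)) / denom if denom > 0 else 0.0)
    return scores

def select_top_genes(scores, k):
    """Return the indices of the k genes with the largest absolute score."""
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return sorted(ranked[:k])

def min_max_fit(expr):
    """Compute the per-gene minimum and maximum over the given samples."""
    cols = list(zip(*expr))
    return [min(c) for c in cols], [max(c) for c in cols]

def min_max_apply(expr, lo, hi):
    """Scale each gene to [0, 1]; constant genes map to 0.0."""
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in expr
    ]

# Toy data (hypothetical): 4 samples x 3 genes, two samples per class.
expr = [
    [5.0, 1.0, 2.0],
    [6.0, 1.2, 1.0],
    [1.0, 1.1, 2.0],
    [2.0, 0.9, 1.0],
]
labels = [1, 1, -1, -1]

scores = signal_to_noise(expr, labels)
keep = select_top_genes(scores, k=2)              # step 1: gene subset
reduced = [[row[g] for g in keep] for row in expr]
lo, hi = min_max_fit(reduced)
scaled = min_max_apply(reduced, lo, hi)           # step 2: normalization
print(keep, scaled)
```

The third step, training the SVM, would pass `scaled` and `labels` to an SVM trainer of the reader's choice; it is left out here to keep the sketch self-contained. In practice the gene ranking and the min/max values would be computed on the training samples only and then reused unchanged on test samples.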
Keywords/Search Tags: Bioinformatics, Gene Expression Data, Statistical Learning Theory, Support Vector Machine, Sequential Minimal Optimization