The Application Of Support Vector Machine (SVM) In DNA Data Analysis Research

Posted on:2016-08-23

Degree:Master

Type:Thesis

Country:China

Candidate:X F Liu

Full Text:PDF

GTID:2180330470968924

Subject:Probability theory and mathematical statistics

Abstract/Summary:

Statistical learning theory matured during the 1990s into a comparatively complete theory of machine learning. Compared with earlier machine learning methods, the support vector machine (SVM), which is built on this theory, handles small-sample learning better, is robust, and has a low computational cost. As an implementation of the theory, the SVM algorithm has become an important tool for machine learning and knowledge mining.

Bioinformatics is an interdisciplinary field combining life science, mathematics, computer science and other disciplines, and the DNA sequence is its typical type of data. The launch and successful completion of the Human Genome Project strongly promoted the development of DNA sequence analysis. Studying the information carried by DNA sequence data is one of the most important topics of the post-genome era, and finding the patterns of characteristic fragments in DNA sequences is of great significance to life science and human genetics.

This thesis uses the SVM algorithm in a DNA sequence classification experiment. First, a sliding window method extracts features from DNA sequences whose classes are known, and the resulting feature sequences are assembled into a feature matrix whose vectors serve as the input vectors. The SVM-based DNA sequence classification is then implemented in R. The R implementation first loads the relevant package, then uses grid search with 10-fold cross-validation to find the optimal parameters. If an optimal parameter falls on the boundary of the given range during the search, the range is widened and the search repeated, and the optimal parameters found are used to construct the SVM model. Classification experiments are run with several kernel functions, and the best kernel function is finally selected by statistical analysis.
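The sliding-window step can be illustrated with a small sketch. The abstract does not say which features the window yields, and the thesis itself works in R; the Python below assumes, purely for illustration, that each window of width k is counted as a k-mer and that the normalized counts form one row of the feature matrix. The names `kmer_features` and `feature_matrix` are hypothetical.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: only the four standard bases are counted


def kmer_features(sequence, k=2):
    """Slide a window of width k over the sequence and return
    normalized k-mer frequencies in a fixed (lexicographic) order."""
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # One window per start position; windows shorter than k never occur.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def feature_matrix(sequences, k=2):
    """Stack one feature vector per sequence into a list-of-lists matrix."""
    return [kmer_features(s, k) for s in sequences]
```

Each row of the matrix returned by `feature_matrix` has 4^k entries, so every sequence maps to an input vector of the same length regardless of how long the sequence is.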
The SVM classifier used in this thesis classifies well, shows a certain generalization ability, and can be applied to the classification of real DNA data. The algorithm can also be extended to multi-class classification problems.
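The parameter search described in the abstract (grid search scored by 10-fold cross-validation, with the range widened when the optimum falls on a boundary) might look roughly like the following Python sketch. It is not the thesis's R code. It assumes a single parameter searched over integer exponents (for example a cost of 2^e), and it takes hypothetical user-supplied callables: `evaluate(train, test)` returns the accuracy of one fold, and `score(e)` returns the cross-validated score for exponent `e`.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Return n_folds (train, test) index pairs covering range(n_samples).
    Assumes n_samples >= n_folds so that no test fold is empty."""
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


def mean_cv_score(evaluate, n_samples, n_folds=10):
    """Average evaluate(train, test) over all folds."""
    splits = k_fold_indices(n_samples, n_folds)
    return sum(evaluate(train, test) for train, test in splits) / len(splits)


def expanding_grid_search(score, low, high, widen=4, max_rounds=5):
    """Search integer exponents in [low, high]; if the best one lies on
    an edge of the range, widen that edge and search again."""
    best = None
    for _ in range(max_rounds):
        grid = list(range(low, high + 1))
        best = max(grid, key=score)
        if best == low:
            low -= widen       # optimum on the lower boundary
        elif best == high:
            high += widen      # optimum on the upper boundary
        else:
            break              # optimum is interior: stop widening
    return best, score(best)
```

In practice `score` would train an SVM with the candidate parameter and call `mean_cv_score`; the `max_rounds` cap is an added safeguard so the search terminates even if the score keeps improving toward one side.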

Keywords/Search Tags:

support vector machine (SVM) classification, DNA sequence, feature vector, kernel function, structural risk minimization

Related items

1	The Researches On Support Vector Machine Classification And Regression Methods
2	Support Vector Machine Base On Reduce White Noise Theory
3	The Differential Geometry Method Of Support Vector Regression Machines
4	Research Of Multi-class Classification Methods Based On Support Vector Machine
5	Research Remote Image Classification Based On Support Vector Machine
6	The Study On Selection Of Kernel Function In Support Vector Machine
7	The Method Of Modifying Polynomial Kernel Function In SVM And Research Of Exon-Intron Feature Sequence
8	Classification Of Protein By Using Support Vector Machine
9	Robustness Of Fuzzy Classification And Fuzzy RMM One-class Classifier
10	Classification Of Amanita Model Based On Support Vector Machine With Mixed Kernel Function