
Support Vector Machine---A new model and its application

Posted on: 2011-08-19
Degree: Ph.D
Type: Dissertation
University: Arizona State University
Candidate: Chen, Wang-Juh
Full Text: PDF
GTID: 1448390002963616
Subject: Applied Mathematics
Abstract/Summary:
This dissertation proposes and studies a new formulation of the Support Vector Machine (SVM). The formulation builds on ideas from the method of total least squares, in which assumed errors in the measured data (errors-in-features) are incorporated into the model design. For example, genetic data measured from microarrays are contaminated by noise. For genetic data, moreover, the number of features is far greater than the sample size, because of both the high cost of the experiments and the need to recruit patients with the necessary conditions. Traditional classification methods cannot be applied directly because of the "curse of dimensionality": the feature space (parameter space) is high-dimensional, and there are too few observations to obtain good estimates. SVM-based algorithms, however, which employ dual methods and a data-mapping kernel, have the potential to overcome this difficulty. The new method introduces Lagrange multipliers to solve for the dual variables. Instead of finding the optimal value of the Lagrange function, it solves the nonlinear system of equations obtained from the Karush-Kuhn-Tucker (KKT) conditions. Complementarity constraints and weighting of the linear system by the inverse covariance matrix of the measured data are also implemented. To improve classification accuracy, regularization is introduced for the ill-posed linear problem that arises during the calculation. Other aspects of improving the algorithm are also considered, such as choosing the initial point and methods for avoiding over-fitting.

The proposed algorithm is applied to several public microarray data sets and to Positron Emission Tomography (PET) images. The results indicate that it is competitive with the standard SVM and performs better in some cases. It also succeeds when the dot-product data mapping is used in the kernel, which demonstrates that it can classify data sets with millions of features, such as PET images, a task that is classically very difficult. The algorithm shows a better ability to classify data sets even when there exist errors in the features, and it gives improved results and higher sensitivity when classifying a set of Alzheimer's Disease (AD) PET images.

In addition to the development of the new model, applications of the SVM to structured classification and to statistical power analysis using PET images are also evaluated. The results from these two types of problems further support the use of the SVM on high-dimensional data sets.
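
Two of the ingredients named in the abstract can be illustrated with a short sketch: the dot-product kernel, which makes the dual problem scale with the number of samples instead of the number of features, and regularization of an ill-posed linear system. This is not the dissertation's algorithm. The function names linear_kernel and tikhonov_solve, the synthetic data, and the choice of Tikhonov (ridge) regularization are all assumptions made for illustration; the abstract does not say which regularization scheme is used.

```python
import numpy as np


def linear_kernel(X):
    """Gram matrix of dot products: K[i, j] = <x_i, x_j>.

    X has shape (n_samples, n_features); K has shape
    (n_samples, n_samples), independent of n_features.
    """
    return X @ X.T


def tikhonov_solve(A, b, lam):
    """Solve min_z ||A z - b||^2 + lam * ||z||^2 (hypothetical helper).

    Uses the regularized normal equations (A^T A + lam I) z = A^T b,
    which have a unique solution for lam > 0 even if A is singular.
    """
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)


# Synthetic data with far more features than samples (assumed sizes).
rng = np.random.default_rng(0)
n_samples, n_features = 20, 5000
X = rng.standard_normal((n_samples, n_features))

# The dual formulation only ever needs this 20 x 20 matrix.
K = linear_kernel(X)

# A singular system: plain np.linalg.solve(A, b) would fail here,
# but the regularized problem is well-posed.
A = np.array([[1.0, 1.0], [1.0, 1.0]])
b = np.array([2.0, 2.0])
z = tikhonov_solve(A, b, lam=0.1)
```

With 20 samples and 5000 features, the kernel matrix K is only 20 by 20, which is why dual methods remain tractable when the feature count is very large. The regularization weight lam=0.1 is arbitrary; in practice it would have to be chosen, for example by cross-validation.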
Keywords/Search Tags:SVM, Data, PET images, Model, New