Research And Application Of Non-convex Online Support Vector Machines

Posted on: 2014-09-25  Degree: Master  Type: Thesis
Country: China  Candidate: Y X Li  Full Text: PDF
GTID: 2268330392473523  Subject: Computer Science and Technology
Abstract/Summary:
Support Vector Machines (SVM) are a machine learning algorithm based on the VC dimension theory and the Structural Risk Minimization principle of statistical learning theory, and they are an effective solution to pattern recognition problems involving small samples, nonlinearity, and high dimensionality. SVM can overcome the "curse of dimensionality" and "over-fitting" problems and generalizes well, so it is widely used for pattern recognition. Although SVM shows excellent performance on classification problems, its training converges slowly, and it requires substantial storage and computation on large-scale data sets. On imbalanced and noisy data sets, SVM faces the same challenges as traditional machine learning algorithms.

The main content of this paper is the study of the non-convex online support vector machine algorithm and its application to text classification.

Firstly, we introduce in detail the basic problems of machine learning, statistical learning theory, support vector machine theory, and the imbalanced data classification problem using SVM.

Secondly, we study the non-convex online support vector machine algorithm. We implement the algorithm on the basis of the LASVM code. Unlike the SVM based on the Hinge loss function, which constructs a convex optimization model and solves it iteratively by batch processing, the non-convex online support vector machine is based on the Ramp loss function, which constructs a non-convex optimization model and solves it iteratively by online learning.
It requires less training time and fewer computing resources, while achieving a classification model with comparable or even better generalization performance and sparser support vectors. The non-convex online support vector machine is strongly resistant to outliers when the noisy datasets contain many wrong class labels, and it can also handle large-scale classification problems. This paper also points out its inadequacy on imbalanced classification datasets, and we propose an imbalanced non-convex online support vector machine based on the idea of different misclassification penalty costs.

Finally, we study the application of the non-convex online support vector machine to text classification. We analyze the common feature weighting methods and introduce a novel and stable feature weighting method, term frequency relevance frequency. We design and implement a text categorization program. The paper proposes a text classification method based on the non-convex online support vector machine and the term frequency relevance frequency product. In our experiments, we analyze and compare the classification performance of the non-convex online support vector machine combined with different feature weighting methods, especially on noisy and large-scale datasets. The method performs well in text classification and shows a clear advantage on these datasets.
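The contrast between the Hinge loss and the Ramp loss in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are an assumption taken from the form commonly used in the literature: the Ramp loss is the difference of two hinge functions, with a parameter `s < 1` (here defaulting to -1) that caps the penalty at `1 - s`. The function names are hypothetical and are not taken from the LASVM code.

```python
def hinge_loss(margin, a=1.0):
    """Hinge function H_a(z) = max(0, a - z); a = 1 gives the usual SVM loss.

    `margin` is z = y * f(x) for a label y in {-1, +1}.
    """
    return max(0.0, a - margin)


def ramp_loss(margin, s=-1.0):
    """Ramp loss R_s(z) = H_1(z) - H_s(z), assumed form with s < 1.

    It equals the Hinge loss for z >= s and stays flat at 1 - s for z < s,
    so a badly misclassified (possibly mislabeled) point has bounded cost.
    """
    return hinge_loss(margin, 1.0) - hinge_loss(margin, s)


if __name__ == "__main__":
    # Compare both losses on a few margins, from a far outlier to a safe point.
    for z in (-5.0, -1.0, 0.5, 2.0):
        print(z, hinge_loss(z), ramp_loss(z))
```

Under these assumptions, the Hinge loss keeps growing as the margin becomes more negative, while the Ramp loss saturates. This bounded penalty is one way to understand the resistance to wrongly labeled samples claimed above, and it is also what makes the resulting optimization problem non-convex.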
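The term frequency relevance frequency weighting mentioned in the abstract can be sketched as follows. The abstract does not state the formula, so this uses the form in which relevance frequency is usually given in the literature, `rf = log2(2 + a / max(1, c))`, where `a` and `c` are the numbers of positive-category and negative-category training documents containing the term. The thesis may use a variant, and all names below are hypothetical.

```python
import math


def relevance_frequency(pos_docs_with_term, neg_docs_with_term):
    """Assumed rf form: log2(2 + a / max(1, c)).

    a = positive-category documents containing the term,
    c = negative-category documents containing the term.
    max(1, c) avoids division by zero when the term never
    appears in the negative category.
    """
    return math.log2(2.0 + pos_docs_with_term / max(1, neg_docs_with_term))


def tf_rf(term_count, pos_docs_with_term, neg_docs_with_term):
    """Weight of a term in one document: raw term frequency times rf."""
    return term_count * relevance_frequency(pos_docs_with_term, neg_docs_with_term)


if __name__ == "__main__":
    # A term seen 3 times in a document, present in 6 positive
    # and 2 negative training documents.
    print(tf_rf(3, 6, 2))
```

Unlike inverse document frequency, this weight depends on how a term is distributed across the two categories, so terms concentrated in the positive category receive larger weights.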
Keywords/Search Tags: Support Vector Machines, Non-convex Online Support Vector Machines, Term Weighting, Text Categorization