Research and Application of an SVM-Based Classification Algorithm for Imbalanced Data

Posted on: 2012-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Tong
Full Text: PDF
GTID: 2218330368982883
Subject: Signal and Information Processing
Abstract/Summary:
The support vector machine (SVM) is a machine learning method based on statistical learning theory. Thanks to its solid theoretical foundation and complete theoretical derivation, it is an effective tool for problems involving small samples, nonlinearity, and local minima. Compared with neural networks, the SVM offers fast convergence, strong stability, and good generalization ability, so it is widely used in fault detection. In practice, fault data are hard to obtain, so the data available for detection are often imbalanced. When the conventional SVM algorithm is applied to imbalanced fault detection data, the classifier performs poorly, and many researchers have therefore proposed improvements to the algorithm. At present, data-level improvements to the SVM concentrate mainly on sampling the minority class, yet noise and redundancy in the majority class also degrade the classifier. In view of this, this thesis proposes a new under-sampling algorithm, progressively optimized decreasing under-sampling (ODR), and combines it with the borderline synthetic minority over-sampling algorithm (BSMOTE) to obtain a balanced training set.
The method uses ODR to under-sample the majority class, removing a large number of overlapping redundant samples and noise samples, so that the data are reduced while more of the useful information is retained. The minority-class samples that are over-sampled are boundary samples, which makes the resulting training set more favorable for subsequent classification by the SVM algorithm. This work first uses five imbalanced data sets with different imbalance ratios from the UCI database to test the classification performance of the proposed SVM algorithm that combines ODR with BSMOTE (ODR-BSMOTE-SVM), and compares it with existing improved SVM algorithms based on minority-class sampling. Finally, the ODR-BSMOTE-SVM algorithm is applied to a rolling bearing fault detection data set to test its detection performance, the influence of its important parameters on that performance, and its generalization ability in rolling bearing fault detection.
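To make the over-sampling half of the method more concrete, here is a minimal sketch of the commonly described Borderline-SMOTE procedure in plain Python. It is not the thesis's implementation. The function name `borderline_smote` and the parameters `m`, `k`, `n_new`, and `seed` are assumptions chosen for illustration, and the "danger" rule (at least half, but not all, of the `m` nearest neighbours belong to the majority class) follows the usual textbook description of the technique.

```python
import math
import random


def borderline_smote(X, y, minority=1, m=5, k=3, n_new=1, seed=0):
    """Return synthetic minority points created near the class boundary.

    X is a list of equal-length numeric tuples and y the matching labels.
    All parameter names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    synthetic = []
    for i in min_idx:
        p = X[i]
        # m nearest neighbours of p in the whole training set (p excluded).
        neigh = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: math.dist(p, X[j]))[:m]
        n_major = sum(1 for j in neigh if y[j] != minority)
        # Keep only "danger" samples: mostly, but not entirely,
        # surrounded by the majority class. All-majority neighbourhoods
        # are treated as noise and all-minority-leaning ones as safe.
        if not (m / 2 <= n_major < m):
            continue
        # k nearest minority neighbours of p.
        mates = sorted((j for j in min_idx if j != i),
                       key=lambda j: math.dist(p, X[j]))[:k]
        if not mates:
            continue
        for _ in range(n_new):
            q = X[rng.choice(mates)]
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic
```

In a pipeline like the one described above, the returned points would be appended to the minority class before the SVM is trained. The ODR under-sampling step is not sketched here because the abstract does not describe its details.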
Keywords/Search Tags: Support Vector Machine, Unbalanced Data, BSMOTE, Fault detection