
Research On Robust Large Margin Classification Learning

Posted on: 2014-04-09    Degree: Doctor    Type: Dissertation
Country: China    Candidate: W Pan    Full Text: PDF
GTID: 1228330422990843    Subject: Computer application technology
Abstract/Summary:
In the field of machine learning, the margin characterizes classification confidence from the perspective of distance. It can be used to bound the generalization error of learning algorithms and to guide their design, and it is widely applied in feature selection, classifier training, and ensemble learning. However, many large-margin classification algorithms perform poorly under noise: when heavily noisy samples appear in the training set, the learned classification boundary deviates from its correct position. To address this problem, this dissertation works from the perspectives of feature selection and classifier construction and proposes a margin-based robust feature selection method, a robust support vector machine training algorithm, and a multiple-classifier ensemble method, which improve the robustness of KNN (k-nearest-neighbor) and support vector machine classification. The main contributions are as follows (minimal code sketches illustrating each contribution follow item (4)):

(1) In traditional feature selection via the large-margin nearest neighbor, the classification margin guides distance learning within a target neighborhood containing samples from different classes, and the margin is computed by the nearest-neighbor rule; when the target neighborhood contains several noisy samples, this margin computation is not robust. To solve this problem, this dissertation proposes a feature selection method based on robust margin statistics. It first finds the target neighborhood containing samples from different classes; within this neighborhood, it computes the distances between the neighborhood's center sample and all samples of the same class, and between the center and all samples of different classes, and it uses the median of the resulting classification margins as the optimization objective to guide feature weight learning, improving the robustness of nearest-neighbor classification.

(2) In existing feature selection methods based on classification margin loss, the loss values become extremely large when heavily noisy data are present, so the solution of the objective function shifts with the noise and the robustness of the algorithm decreases. To address this issue, this dissertation introduces a robust loss function (the BrownBoost loss) to build the optimization objective; because the BrownBoost loss is non-convex, it proposes feature weight learning algorithms based on gradient descent combined with regularization techniques, improving the noise resistance of support vector machine classification.

(3) The robust loss function (ramp loss) used in robust support vector machine training is non-convex, which leads to high training time complexity, and classification robustness still needs further improvement.
To address these problems, this dissertation proposes a training algorithm for robust support vector machines based on a smoothed truncated loss. It first applies a smooth approximation to the ramp loss and rewrites it as the sum of a smooth convex function and a smooth concave function, then solves the resulting problem with CCCP (the Concave-Convex Procedure), and finally applies a Newton-type descent technique to realize fast linear and nonlinear learning, which increases training speed and improves robustness.

(4) Fault diagnosis data have high dimensionality, large sample sizes, class imbalance, and pervasive noise, so simple classification learning methods cannot meet the required noise resistance. This dissertation therefore presents a margin-based robust classification ensemble model that fuses different anti-noise techniques at different stages. The model partitions the training procedure into four phases: randomized sampling, feature selection, base classifier learning, and weighted voting. During classifier fusion, it applies a quadratic loss with L1 regularization to learn sparse weights for the base classifiers; these weights are used to predict the classification result, thereby improving the robustness of classification prediction.
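To make the robust margin statistic of contribution (1) concrete, here is a minimal Python sketch. It is not the dissertation's code: the weighted L1 distance, the neighborhood size k, and the update rule are illustrative assumptions. The key point is that each sample's margin is built from medians of the hit and miss distances, so a few noisy neighbors cannot distort it.

import numpy as np

def robust_margin_feature_weights(X, y, k=10, n_iter=50, lr=0.1):
    # X: (n, d) feature matrix, y: (n,) integer class labels.
    # Learn nonnegative feature weights by ascending a margin
    # objective built from median (not nearest) hit/miss distances.
    n, d = X.shape
    w = np.ones(d) / d
    for _ in range(n_iter):
        grad = np.zeros(d)
        for i in range(n):
            diff = np.abs(X - X[i])            # (n, d) per-feature gaps
            dist = diff @ w                    # weighted L1 distances
            nbrs = np.argsort(dist)[1:k + 1]   # target neighborhood (skip self)
            hits = nbrs[y[nbrs] == y[i]]
            misses = nbrs[y[nbrs] != y[i]]
            if hits.size == 0 or misses.size == 0:
                continue
            # robust margin statistic: median miss gap minus median hit gap
            grad += np.median(diff[misses], axis=0) - np.median(diff[hits], axis=0)
        w = np.clip(w + lr * grad / n, 0.0, None)
        w /= max(w.sum(), 1e-12)               # keep weights on the simplex
    return w

Features with large weights are the ones along which the classes stay separated even when the margin is measured robustly.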
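For contribution (2), the sketch below assumes one common bounded, BrownBoost-style loss, 1 - erf(m / sqrt(c)), over the signed margin m of a linear model; the function names, the L2 regularizer, and all defaults are assumptions rather than the dissertation's exact formulation. Because the loss saturates, a heavily mislabeled point contributes a bounded gradient instead of dominating the objective.

import numpy as np
from scipy.special import erf

def brownboost_style_loss(m, c=1.0):
    # Bounded, non-convex robust loss; m is the signed margin y * f(x).
    return 1.0 - erf(m / np.sqrt(c))

def fit_weights(X, y, lam=0.1, lr=0.05, n_iter=200, c=1.0):
    # Gradient descent on the bounded loss plus an L2 regularizer.
    # y must take values in {-1, +1}.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        m = y * (X @ w)
        # d/dm [1 - erf(m / sqrt(c))] = -2 / sqrt(pi * c) * exp(-m^2 / c)
        dloss = -2.0 / np.sqrt(np.pi * c) * np.exp(-m ** 2 / c)
        grad = (X * (dloss * y)[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w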
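Contribution (3) can be sketched under the assumption that the smoothed ramp loss is written as softplus(1 - m) - softplus(s - m), a smooth convex part plus a smooth concave part, matching the decomposition described above. The CCCP outer loop freezes the gradient of the concave part and minimizes the resulting smooth convex surrogate; for brevity the inner solver below uses plain gradient steps where the dissertation applies Newton-type updates. All names and defaults are illustrative.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def cccp_smooth_ramp_svm(X, y, s=-1.0, lam=1e-2, outer=10, inner=100, lr=0.1):
    # Smoothed ramp loss: softplus(1 - m) - softplus(s - m), m = y * (w @ x).
    # y must take values in {-1, +1}; s < 1 sets the truncation level.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(outer):
        m = y * (X @ w)
        # Linearize the concave part -softplus(s - m) at the current w:
        beta = sigmoid(s - m)
        g_concave = (X * (beta * y)[:, None]).mean(axis=0)
        for _ in range(inner):                 # smooth convex subproblem
            m = y * (X @ w)
            dconv = -sigmoid(1.0 - m)          # d softplus(1 - m) / dm
            grad = (X * (dconv * y)[:, None]).mean(axis=0) + g_concave + lam * w
            w -= lr * grad
    return w

Each outer iteration re-estimates beta at the new w, which is exactly the concave-convex procedure; a point far on the wrong side of the boundary gets beta close to 1, cancelling most of its convex-part gradient, which is what makes the truncated loss robust.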
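Finally, for the weighted-voting phase of contribution (4), the sketch below learns sparse ensemble weights by minimizing a quadratic loss with an L1 penalty using ISTA (proximal gradient with soft-thresholding). The matrix layout, the nonnegativity clamp, and the step-size choice are assumptions, not the dissertation's exact formulation.

import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def learn_fusion_weights(P, y, lam=0.1, n_iter=500):
    # P: (n, k) matrix, P[i, j] = base classifier j's prediction (+/-1)
    # on validation sample i; y: (n,) true labels in {-1, +1}.
    # Minimize (1/n) * ||P a - y||^2 + lam * ||a||_1 with ISTA.
    n, k = P.shape
    smax = np.linalg.norm(P, 2)                # largest singular value of P
    lr = n / (2.0 * smax ** 2 + 1e-12)         # 1 / Lipschitz constant
    a = np.zeros(k)
    for _ in range(n_iter):
        grad = 2.0 / n * (P.T @ (P @ a - y))
        a = soft_threshold(a - lr * grad, lr * lam)
        a = np.maximum(a, 0.0)                 # optional: nonnegative voting weights
    return a

A new sample is then classified by the sign of the sparsely weighted vote, np.sign(p_new @ a), so base classifiers whose weight was driven to zero by the L1 penalty simply drop out of the ensemble.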
Keywords/Search Tags: margin, robustness, feature selection, nearest neighbor classification, support vector machine, ensemble learning