
Research Of Imbalanced Data Classification Based On GEV Distribution

Posted on: 2018-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: H L Zhang
GTID: 2428330590977756
Subject: Information and Communication Engineering
Abstract/Summary:
The problem of imbalanced data classification arises in many fields and is still not completely solved. In this thesis, we consider binary classification with imbalanced data. Although this problem has been studied extensively in terms of classification performance, probability estimation for both the majority class and the minority class has not yet been well studied. To obtain precise class probability estimates together with high classification performance, we propose a new regression approach that uses the Calibration Loss under the framework of the generalized linear model, which results in a convex optimization problem. In this model, the generalized extreme value (GEV) distribution is adopted to form an asymmetric link function, which plays a key role in binary classification with imbalanced data. Moreover, thanks to the Lipschitz continuity of the GEV distribution, the running performance can be greatly improved. As to model estimation, because the shape parameter has a significant influence on modeling precision, we study two different methods for estimating the shape parameter of the GEV distribution and discuss their pros and cons. Experiments on synthetic datasets demonstrate the accuracy of the shape parameter estimation. In addition, experiments on real-world datasets show that the proposed GEV regression, compared with three other commonly used regression algorithms, achieves good classification performance as well as precise class probability estimation. Furthermore, a comparison with two other optimization methods suggests that our algorithm is computationally efficient.
Keywords/Search Tags: linear model, extreme value distribution, imbalanced data, classification, probability estimation
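To illustrate why a GEV-based link is called asymmetric, the following is a minimal sketch in Python. It is not the thesis's implementation: the abstract does not give the exact form of the link or of the Calibration Loss, so this sketch simply assumes the standard GEV cumulative distribution function with location 0 and scale 1 as the map from a linear score to a probability. The function names `gev_cdf` and `logistic` and the shape value `xi = 0.3` are illustrative choices, not taken from the source.

```python
import math


def gev_cdf(x, xi):
    """Standard GEV CDF (location 0, scale 1) with shape parameter xi.

    Assumption: used here as the score-to-probability map; the thesis
    may parameterize its link differently.
    """
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-x))
    t = 1.0 + xi * x
    if t <= 0.0:
        # Outside the support: 0 below the lower bound (xi > 0),
        # 1 above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def logistic(x):
    """Symmetric logistic link, shown for comparison."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    xi = 0.3  # illustrative shape value, not from the source
    for score in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(score, round(logistic(score), 4), round(gev_cdf(score, xi), 4))
```

The logistic link satisfies `logistic(x) + logistic(-x) == 1`, so it treats both classes symmetrically around a score of zero. The GEV curve does not have this property, and its skewness changes with the shape parameter `xi`, which is consistent with the abstract's emphasis on estimating that parameter carefully.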