The problem of imbalanced data classification appears in many fields and is still not completely solved. In this paper, we consider binary classification with imbalanced data. Although this problem has been studied extensively in terms of classification performance, probability estimation for the majority and minority classes has not yet been well studied. To obtain precise class probability estimates together with high classification performance, we propose a new regression approach that uses the Calibration Loss within the framework of the generalized linear model, which results in a convex optimization problem. In this model, the generalized extreme value (GEV) distribution is adopted to form an asymmetric link function, which plays the key role in binary classification with imbalanced data. Moreover, thanks to the Lipschitz continuity of the GEV distribution, the running performance can be greatly improved. As for model estimation, because the shape parameter has a significant influence on modeling precision, we study two different methods for estimating the shape parameter of the GEV distribution and discuss their pros and cons. Experiments on synthetic datasets confirmed the accuracy of the shape parameter estimation. In addition, experiments on real-world datasets showed that the proposed GEV regression, compared with three other commonly used regression algorithms, achieves good classification performance as well as precise class probability estimation. Finally, a comparison with two other optimization methods suggested that our algorithm is also computationally efficient.
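To illustrate what an asymmetric link function means in practice, the following minimal sketch evaluates a GEV cumulative distribution function and contrasts it with the symmetric logistic link. It assumes the standard GEV form with location 0 and scale 1; the function names `gev_cdf` and `logistic` are hypothetical, and the exact parameterization and sign convention used in the proposed model may differ.

```python
import numpy as np


def gev_cdf(eta, xi):
    """GEV CDF with location 0 and scale 1 (an assumed parameterization).

    eta is the linear predictor and xi is the shape parameter.
    """
    eta = np.asarray(eta, dtype=float)
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return np.exp(-np.exp(-eta))
    t = 1.0 + xi * eta
    # Outside the support (t <= 0) the CDF is 0 for xi > 0 and 1 for xi < 0.
    outside = 0.0 if xi > 0 else 1.0
    # Replace invalid bases with 1.0 so the power is always well defined.
    safe_t = np.where(t > 0, t, 1.0)
    inside = np.exp(-safe_t ** (-1.0 / xi))
    return np.where(t > 0, inside, outside)


def logistic(eta):
    """Symmetric logistic link, shown only for comparison."""
    return 1.0 / (1.0 + np.exp(-np.asarray(eta, dtype=float)))


if __name__ == "__main__":
    xi = 0.5  # illustrative shape value, not taken from the paper
    for eta in (-1.0, 0.0, 1.0):
        print(eta, float(logistic(eta)), float(gev_cdf(eta, xi)))
```

The logistic link satisfies `logistic(eta) + logistic(-eta) == 1`, so it treats both classes symmetrically around a probability of 0.5. The GEV link does not: at `eta = 0` it returns `exp(-1)` (about 0.368) for every shape value, and the shape parameter controls how quickly the curve approaches 0 and 1 on each side.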