
Research Into Imbalanced Datasets Algorithm Based On Asymmetric Weighted Method And Kernel Method

Posted on: 2014-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: S J Zhao
Full Text: PDF
GTID: 2248330395983828
Subject: Applied Mathematics
Abstract/Summary:
Learning from imbalanced datasets is an important supervised learning problem that has received much attention in recent years. Many important applications exhibit class imbalance, such as network intrusion detection, information retrieval, medical diagnosis, and genetic analysis. Because of the large difference between the two classes, traditional classification methods cannot classify imbalanced datasets effectively. Imbalanced data therefore poses a major challenge to current machine learning.

To handle the class-imbalanced learning problem, this paper aims to improve the minority-class classification accuracy and the overall classification accuracy, as well as to reduce the number of support vectors. The main work is as follows.

1. Two asymmetric weighted methods for handling class-imbalanced datasets are proposed. The linear and exponential forms of the dual fuzzy membership function are given for an asymmetric weighted algorithm based on the feature space. Both the different importance of the two classes and the different memberships of the data points are taken into account. Experimental results show that the classification accuracy of the minority class and the overall accuracy of the classifier are improved.

2. A mixed kernel function combining the polynomial kernel function and the RBF kernel function is constructed, and the linear and exponential forms of the dual fuzzy membership function with the asymmetric weighted algorithm based on the mixed kernel function are then proposed. Experimental results show that the algorithm effectively reduces the number of support vectors and improves the G-means and AUC values.

3. The free parameter of the conformal transformation matrix in the feature space is changed. A modified Riemannian metric kernel function and an asymmetric weighted algorithm based on the distance of the points to the actual hyperplane are proposed. The modified Riemannian metric kernel function is then applied to the asymmetric weighted algorithm. Experimental results show that the method effectively improves the G-means and AUC values.
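
The abstract does not give the form of the mixed kernel or the evaluation formulas, so the following is only a minimal sketch under stated assumptions. It assumes the mixed kernel is a convex combination of a polynomial kernel and an RBF kernel, which is one common construction and not necessarily the one used in the thesis, and it computes the G-mean in its standard form as the geometric mean of sensitivity and specificity. The function names and the parameters `lam`, `degree`, `coef0`, and `gamma` are illustrative and do not come from the thesis.

```python
import math


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (<x, y> + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mixed_kernel(x, y, lam=0.5, degree=2, coef0=1.0, gamma=0.5):
    """Assumed mixing rule: lam * polynomial + (1 - lam) * RBF, with 0 <= lam <= 1.

    A convex combination of two valid kernels is itself a valid kernel.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (lam * poly_kernel(x, y, degree, coef0)
            + (1.0 - lam) * rbf_kernel(x, y, gamma))


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

Because the G-mean is a product of the two per-class rates, a classifier that ignores the minority class scores 0 regardless of its overall accuracy, which is why it is a natural metric for imbalanced problems.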
Keywords/Search Tags: imbalanced datasets, asymmetric weighted method, feature space, dual membership function, kernel method, Riemannian metric