Research On The Influence Of Parameter Selection In Kernel Transformation Method On The Classification Of Imbalanced Sample Dat

Posted on:2024-02-10

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhao

GTID:2557306917973039

Subject:Statistics

Abstract/Summary:

In recent years, the classification of imbalanced data sets has attracted extensive attention and research. Cost-sensitive support vector machines (CS-SVM) generalise well on this problem, but their classification performance depends heavily on the penalty parameters and the kernel parameters. In the classification of imbalanced data in particular, model selection usually involves parameter tuning and optimisation. However, these parameters are typically selected as a 'black box', without an understanding of the details. This paper analyses the behaviour of CS-SVM on imbalanced data sets when the penalty factor and the kernel function parameters take different values. Through theoretical derivation and data analysis, the parameter search space is divided into three regions. In the good regions, effective parameters can be found quickly and the model performs well. The range of the kernel function parameters can be estimated by calculating the distances between samples. Based on this, a new CS-SVM parameter optimisation method is proposed. The method reduces the parameter search space, the computational effort decreases exponentially with increasing complexity of the dataset, and the running time is significantly shorter than that of a global search. In addition, this paper analyses and discusses the performance of several commonly used classification models on imbalanced data. To make the evaluation favour recall of the minority class, a new evaluation metric based on G-mean is proposed and used to evaluate the proposed method. The results show that the proposed method can raise the recall of the minority class to 1 while maintaining classification accuracy. Compared with other classification models, the proposed method has higher search efficiency and better average performance.
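The abstract does not give the formulas behind the distance-based kernel parameter range or the new G-mean-based metric, so the sketch below is only an illustration of the two underlying ideas, not the thesis's method. It assumes an RBF kernel of the form exp(-gamma * d^2) and uses a hypothetical heuristic, gamma = 1 / (2 * d^2), evaluated at the smallest and largest non-zero pairwise distances to bound the search interval. It also computes the standard G-mean (the geometric mean of minority-class and majority-class recall), not the modified metric proposed in the thesis. The function names `gamma_range` and `g_mean` are invented for this example.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of samples."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def gamma_range(points):
    """Bound the RBF gamma from sample distances.

    Assumption (not from the thesis): gamma = 1 / (2 * d**2), evaluated at
    the largest and smallest non-zero pairwise distance.
    """
    dists = [d for d in pairwise_distances(points) if d > 0]
    if not dists:
        raise ValueError("need at least two distinct samples")
    d_min, d_max = min(dists), max(dists)
    return 1.0 / (2.0 * d_max ** 2), 1.0 / (2.0 * d_min ** 2)


def g_mean(y_true, y_pred, positive=1):
    """Standard G-mean: sqrt(minority recall * majority recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    recall_pos = tp / (tp + fn)  # recall of the minority (positive) class
    recall_neg = tn / (tn + fp)  # recall of the majority (negative) class
    return math.sqrt(recall_pos * recall_neg)


# Toy data: three collinear samples, pairwise distances 5, 5 and 10.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
gamma_low, gamma_high = gamma_range(samples)

# Toy labels: both minority samples are recovered, one majority sample is missed.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)
```

A grid search would then sample gamma only inside `[gamma_low, gamma_high]` rather than over an arbitrary global range, which is the kind of search-space reduction the abstract describes. On the toy labels, minority recall is 1 and majority recall is 0.75, so the G-mean penalises the one false positive without hiding the perfect minority recall.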

Keywords/Search Tags:

Imbalanced datasets, Kernel function, CS-SVM, Recall

Related items

1	Research On Classification Of Imbalanced Datasets Based On Random Forest
2	Research Of SVM Kernel Functions In Text Classification
3	Research On Support Vector Machine And Decision Tree Algorithm For Imbalanced Datasets
4	KNN Algorithm Based On Gaussian Kernel And Its Applications
5	An Optimized Deep Reinforcement Learning-Based Model For Image Classification In Imbalanced Datasets
6	Research And Application Of Label Learning Based On Mixture Kernel Extreme Learning Machine
7	Local Linear Smoothers Using Lognormal Kernel And Birnbaum-Saunders Kernel
8	College Instructor Academic Warning Management Mode And Its Application Based On Kernel Principal Component Analysis
9	Research On Imbalanced Classification Model In User Churn Identification
10	Missing Multi-label Learning Of Imbalanced With Label Reconstruction