Personal Credit Risk Assessment Based On Data Mining Combination Model

Posted on: 2024-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Mu
Full Text: PDF
GTID: 2530307079491444
Subject: Applied statistics
Abstract/Summary:
Personal credit risk assessment in consumer credit scenarios faces four main difficulties: high-dimensional features, data sparsity, imbalanced sample classes, and the performance of the models used to assess personal credit risk. This thesis proposes the PRMK-S5-DS prediction framework to address these difficulties. First, because excessive feature dimensionality causes data redundancy, the PCA feature reduction algorithm and the REF_MRL1 feature selection algorithm are applied after basic data processing and feature engineering to convert high-dimensional features into low-dimensional features while retaining as much of the original data information as possible. Second, because a very small number of defaulted samples leads to low prediction accuracy of the machine learning model on defaulted samples, the SMOTE_5 sampling method is used to balance the numbers of defaulted and non-defaulted samples, and the KIL2 algorithm is used to label anomalous samples. Finally, the DXL_SVM model is used as the classifier that gives the final predictions for the test set samples. The DXL_SVM model combines four base classifiers (decision tree, XGBoost, LightGBM and SVM), which mitigates the shortcomings of the individual classifiers and yields superior performance. Using sample data from consumer credit companies, the PRMK-S5-DS prediction framework was found to outperform the decision tree, XGBoost and LightGBM models in terms of AUC, accuracy, recall, precision and F1-score. The prediction framework is thereby validated and successfully applied to personal credit assessment.
Keywords/Search Tags:data mining, data imbalance, credit assessment, feature derivation, anomaly detection
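
The class-balancing step described in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling. This is not the thesis's SMOTE_5 implementation: the function names `smote_oversample` and `balance` are hypothetical, and reading the "5" as the number of nearest neighbours (k = 5) is an assumption. The sketch uses only the Python standard library and creates synthetic defaulted samples by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length lists of floats (e.g. defaulted samples).
    k: number of nearest minority neighbours considered (assumed to be 5).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(majority, minority, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    deficit = max(0, len(majority) - len(minority))
    if deficit == 0:
        return list(minority)
    return list(minority) + smote_oversample(minority, deficit, k=k, seed=seed)
```

In practice, oversampling of this kind is normally applied to the training split only, so that the test set keeps its original class ratio. Whether the thesis follows that convention is not stated in the abstract.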