| With economic development and the rapid rise of "Internet +" finance, the concept of advance consumption has become increasingly accepted by the public. Residents are pursuing a higher quality of life, and their demand for big-ticket goods such as houses and vehicles is increasing. According to statistics, by the end of 2018 the total amount of personal loans accounted for 29.45 percent of annual RMB loans, while the balance of non-performing personal loans reached 2030 billion yuan, reflecting the high risk that accompanies personal loan demand. For the sake of healthy economic development, financial institutions should avoid risks in time and evaluate the credit of borrowers correctly. In practice, credit data are affected by country, region and economic culture and therefore have their own particularities. Existing credit evaluation methods tend to suit specific application scenarios and data sets and have poor robustness. This paper proposes a personal loan credit evaluation model based on a combined classification strategy. The evaluation model consists of two parts: feature selection and combined classification. First, feature selection is divided into two stages. The first stage is feature filtering: the mutual information between each feature and the category is calculated and sorted from large to small. The second stage is feature grouping: the mutual information between each pair of features is calculated and sorted from large to small, and a grouping threshold is set. Features whose mutual information exceeds the grouping threshold are placed in the same group, and the feature in each group that is most strongly correlated with the category is selected into the final feature subset. This ensures that the feature subset is highly correlated with the category while redundancy between features is small.
Second, four algorithms typical of credit evaluation problems, namely support vector machine (SVM), logistic regression, random forest and K-nearest neighbor, are chosen as the base classifiers for combined classification. A decision score is calculated for each base classifier from its accuracy and stability on the data set. Each base classifier evaluates whether a new customer's credit is good or bad, each evaluation carries the decision score of the base classifier that made it, and the scores are compared to obtain the final judgment of the new customer's credit quality. Combined classification quantifies the performance of each base classifier on the data set, gives the better-suited base classifiers a larger role in the final decision, and reduces the misjudgment rate. In addition, to address the imbalance between sample categories in real data, SMOTE sampling is adopted to expand the categories with fewer samples. To verify the validity of the proposed credit evaluation method, three groups of experiments were carried out on three real UCI credit data sets: SMOTE sampling versus regular sampling, the improved mutual information method versus traditional mutual information, and the combined classifier versus single classifiers. The experimental results show that the proposed method is accurate and robust across different real data sets and is suitable for practical application. |
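
The two-stage feature selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes discrete feature values, a greedy grouping rule (a feature joins the first group containing a member whose mutual information with it exceeds the threshold), and a toy data set whose names (`income`, `salary_band`, `region`) and threshold value are purely hypothetical. The abstract states no cutoff for the first stage, so the sketch only ranks features there.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_features(columns, labels, group_threshold):
    """Rank features by relevance, group redundant ones, keep the best of each group.

    columns: dict mapping feature name -> list of discrete values (assumed layout).
    """
    # Stage 1: mutual information of each feature with the category, large to small.
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    ranked = sorted(relevance, key=relevance.get, reverse=True)

    # Stage 2: greedy grouping (an assumption of this sketch). A feature joins the
    # first group holding a member it shares more than group_threshold with.
    groups = []
    for name in ranked:
        for group in groups:
            if any(
                mutual_information(columns[name], columns[member]) > group_threshold
                for member in group
            ):
                group.append(name)
                break
        else:
            groups.append([name])

    # Features were visited in descending relevance, so the first member of each
    # group is the one most strongly related to the category.
    return [group[0] for group in groups]


# Hypothetical toy data: salary_band is largely redundant with income.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "income": [0, 0, 0, 0, 1, 1, 1, 1],
    "salary_band": [0, 0, 0, 1, 1, 1, 1, 1],
    "region": [0, 1, 0, 1, 0, 1, 0, 1],
}
selected = select_features(columns, labels, group_threshold=0.3)
print(selected)
```

Here `salary_band` falls into the same group as `income` and is dropped as redundant, while `region` forms its own group. For continuous features, a binning step or a library estimator would be needed before computing mutual information.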