
Research On Feature Engineering For Credit Risk Prediction

Posted on: 2020-09-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2518306518461684
Subject: Systems Engineering
Abstract/Summary:
With the global shift toward digital currency and the deep integration of Internet technology with the financial industry, the trend toward digital risk control is becoming increasingly clear. Faced with the massive data accumulated in the "big data era", analyzing and processing large amounts of data with big data thinking places higher requirements and challenges on all walks of life. Credit risk management is the core of the financial industry. At the same time, with the rise of artificial intelligence, how financial institutions can use machine learning, feature engineering and related techniques to portray business scenarios comprehensively and predict risk more accurately has become a hot topic.

The credit risk prediction problem is essentially a classification problem, and data in the financial credit industry are mostly structured. This thesis reviews the existing literature and methods and applies feature engineering, machine learning and related techniques. The specific research contributions are summarized in the following two points:

1. Specific feature engineering construction methods and processes are proposed for credit risk prediction. For structured data, each data table is abstracted into a feature entity, and a basic method of feature construction based on aggregating and extending a single entity is proposed. The thesis also describes the connections between different entities and proposes construction methods and specific operations along six dimensions: basic features, aggregation features, transformation features, time series features, combination features and business features. Finally, the feature engineering process is put into practice on an enterprise's real structured data. The iterative process generates a large number of stable, effective features, which provides a reference for credit risk prediction.

2. The improvement that feature engineering brings to different models is compared and analyzed, and the importance of features is interpreted. Logistic regression, support vector machine, random forest and gradient boosted tree classifiers were trained with cross-validation, and a variety of evaluation metrics were used to assess the effect of the features comprehensively. The results show that, compared with the original features, the feature engineering process improves performance across the different models, so feature engineering has practical application value. In addition, a comparison before and after feature selection shows that adding an embedded feature selection algorithm as a pre-training step before model training, to filter out irrelevant and redundant features, further improves model performance.
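The aggregation and transformation features described in the first contribution can be illustrated with a minimal sketch. It assumes pandas and uses a hypothetical parent entity `customers` and child entity `loans`; the table names, column names and values are invented for illustration and are not taken from the thesis or its enterprise data.

```python
import pandas as pd

# Hypothetical toy data: one parent entity (customers) and one child entity (loans).
customers = pd.DataFrame({"customer_id": [1, 2, 3], "age": [25, 40, 31]})
loans = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [100.0, 300.0, 50.0, 150.0, 100.0],
})

# Aggregation features: summarise each customer's child rows into one row.
agg = loans.groupby("customer_id")["amount"].agg(["count", "mean", "max"])
agg.columns = [f"loan_amount_{name}" for name in agg.columns]

# Join the aggregates back onto the parent entity.
# Customers without loans have no aggregate row, so their count becomes 0.
features = customers.merge(agg, left_on="customer_id", right_index=True, how="left")
features["loan_amount_count"] = features["loan_amount_count"].fillna(0).astype(int)

# Transformation feature: ratio of the largest loan to the average loan.
features["loan_max_to_mean"] = (
    features["loan_amount_max"] / features["loan_amount_mean"]
)

print(features)
```

Customers with no child rows keep missing values for the mean, maximum and ratio here; how such gaps are filled is a modelling decision that this sketch leaves open.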
Keywords/Search Tags: Credit risk prediction, Data preprocessing, Machine learning, Feature engineering, Feature selection