In recent years, Internet finance has developed rapidly, and both the number of P2P platforms and the demand for online loans have grown quickly. It has therefore become increasingly important to evaluate credit risk with scientific, quantitative methods, especially for P2P platforms, which serve a very large number of customers but lend only a small amount per loan. Moreover, most customers of P2P platforms have no credit record, which makes it difficult to rate their credit risk from basic information alone. P2P platforms do, however, have certain advantages in the data they hold. Using loan risk data that include customers' login information and information-update records, this paper applies feature engineering and XGBoost to explore the relationship between these data and customers' default behavior. We find that customers who frequently update their information before taking a loan are more likely than others to default in the following six months. Based on the features created in the feature engineering step, this paper then constructs a default prediction model. First, we combine filter and wrapper selection methods to obtain the best feature subset. Then we train the model with the XGBoost framework and obtain an effective default prediction model that performs well in both prediction accuracy and stability. The loan risk data in this paper come from customers who had already been granted a loan, which means the model predicts the default probability of customers whom the existing risk control system judged unlikely to default. Considering this and the model results, this paper suggests that update and login data are useful to a risk control system, but that it is better to derive rules from them and use those rules to adjust the risk rating than to feed them directly into the system's main model.
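
As an illustration of the kind of feature the feature engineering step might produce from update records, the following minimal sketch counts each customer's information updates in a fixed window before the loan date. The abstract does not specify the actual features, so the function name `count_updates_before_loan`, the record layout, the 30-day window, and the sample data are all hypothetical assumptions.

```python
from datetime import date


def count_updates_before_loan(update_log, loan_dates, window_days=30):
    """Count each customer's updates in the window_days up to the loan date.

    update_log: iterable of (customer_id, update_date) pairs (assumed layout).
    loan_dates: dict mapping customer_id to loan date (assumed layout).
    """
    # Start every known customer at zero so customers without updates still appear.
    counts = {cid: 0 for cid in loan_dates}
    for cid, updated_on in update_log:
        loan_date = loan_dates.get(cid)
        if loan_date is None:
            continue  # update belongs to a customer with no loan record
        days_before = (loan_date - updated_on).days
        # Keep only updates made on or before the loan date, within the window.
        if 0 <= days_before <= window_days:
            counts[cid] += 1
    return counts


# Hypothetical sample data, for illustration only.
update_log = [
    ("c1", date(2016, 3, 1)),
    ("c1", date(2016, 3, 10)),
    ("c1", date(2016, 1, 1)),   # outside the 30-day window
    ("c2", date(2016, 3, 12)),
    ("c3", date(2016, 4, 1)),   # after the loan date, ignored
]
loan_dates = {
    "c1": date(2016, 3, 15),
    "c2": date(2016, 3, 15),
    "c3": date(2016, 3, 15),
}

features = count_updates_before_loan(update_log, loan_dates)
print(features)  # {'c1': 2, 'c2': 1, 'c3': 0}
```

A count like this could be combined with login-based features and passed through feature selection before training a classifier; the window length is a tunable assumption rather than a value taken from the paper.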