
Research On Imbalanced Classification Problem Based On Random Forest-Adaboost

Posted on: 2022-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: X H Wang
Full Text: PDF
GTID: 2518306527452264
Subject: Statistics
Abstract/Summary:
With the construction of a more secure and stable domestic currency system and the RMB's progress toward becoming an international currency, the vigorous growth of China's credit loan business has promoted the development of the national economy and improved economic returns. Meanwhile, new potential risks emerge as demand for credit loans increases. Under these circumstances, establishing a safe and effective system for reviewing credit loan qualifications becomes more important. Taking advantage of innovations in data storage, such as improved speed, security, and capacity, financial institutions continuously collect transaction data in complex scenarios and use that data to perform accurate credit loan assessments. In the age of big data, financial institutions are able to extract massive amounts of business data safely. Making full use of borrowers' user profile data to accurately predict credit loan risk has therefore become a subject of practical significance. This thesis uses machine learning algorithms to classify borrowers effectively on real credit loan data, which can help investors decide whether to accept loan applications. A variety of resampling methods are used to balance the data set, and these balancing methods are combined with machine learning algorithms to predict credit loan risk accurately. Balancing the data set greatly improves the performance of the classification model, which can help financial platforms make safer and more reliable loan decisions. This thesis proposes a hybrid machine learning model that tightly couples mainstream machine learning algorithms and effectively improves the ability to detect high-risk loans. The experimental results show that, compared with logistic regression, random forest, and Adaboost, the proposed method identifies high-risk loans more effectively on personal credit loan data with an extremely imbalanced class distribution. The model also performs better across multiple performance metrics.
Keywords/Search Tags: Credit Loan Risk, Classification, Random Forest, Adaboost
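
The abstract mentions balancing the data set by resampling but does not say which methods the thesis uses. As an illustration only, the following minimal sketch shows one of the simplest such methods, random oversampling, which duplicates minority-class rows until all classes are the same size. The function name `random_oversample`, the toy data, and the label encoding (1 = high-risk loan) are assumptions made for this example and are not taken from the thesis.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (random oversampling)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Start from a copy of the original data, then append duplicates.
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: label 1 = high-risk loan (rare), 0 = normal loan.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels))           # class counts before balancing
print(Counter(balanced_labels))  # class counts after balancing
```

In practice, resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original imbalanced distribution.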