| Compared with developed economies, China’s credit bond market has a relatively short history and is gradually maturing and normalizing. The range of market participants keeps broadening, local financing platforms are undergoing market-oriented transformation, and investors are paying more attention to credit risk. A bond’s credit spread reflects how the market prices its credit risk and plays a guiding role in the allocation of resources, so it has drawn the attention of academics, investors and policy makers. The real-world factors that affect bond credit spreads are complex and diverse. However, empirical studies of credit spreads in China still mostly adopt traditional parametric regression models, which do not work well on multivariate, high-dimensional samples. In this context, this paper uses the random forest regression model, a machine learning algorithm, together with the current state and characteristics of China’s bond market, to analyze the factors that influence credit spreads. First, this paper reviews previous research on related issues, including domestic and foreign studies of the factors that influence credit risk pricing and the classical theoretical models of credit risk pricing. Based on previous studies and the characteristics of China’s bond market, this paper selects factors at the macro, meso, micro and individual-bond levels that may have a significant impact on bond credit spreads, and analyzes theoretically the mechanism by which each factor affects credit spreads. We then use the random forest model to run a regression on the bond samples, obtain the importance scores and rankings of the explanatory variables, and carry out partial dependence analysis on the explanatory variables with the highest importance scores. Finally, the results are interpreted. This study finds that risk-free interest rates at the macro
level, region and industry classification at the meso level, and the credit rating and ownership type of the issuer at the micro level play a leading role in determining credit spreads at issuance. In addition, two factors that affect a bond’s liquidity and financing convenience, namely the category of investors and whether the bond can be pledged on the exchange floor, also explain credit spreads strongly. Through further classification and comparison, this paper also concludes that the market still believes quasi-municipal bonds carry implicit government guarantees, and that small-sized enterprises engaged in city construction are more likely to issue private placement bonds successfully. The market’s view of bonds issued by entities with higher credit ratings is closer to its view of risk-free bonds. As for the research method, the random forest method offers high prediction accuracy and handles high-dimensional data well. This paper is the first to use a random forest regression model for an empirical analysis of the factors that affect credit spreads. A review of applications of the random forest method in economic research in China shows that it predicts better than other common regression models when analyzing multivariate samples. The theoretical part of this paper introduces the working principles, advantages and disadvantages of the random forest regression model and shows theoretically that the method is suitable for regression analysis of credit spreads. In the empirical part, the model’s explanatory power on the training-set samples is above 80%, and four-fold cross-validation indicates a high degree of fit. On the test-set samples, the absolute error, the relative error and the result of four-fold cross-validation lead to the conclusion that the model has high prediction accuracy. |
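
The workflow summarized in the abstract (random forest regression, four-fold cross-validation, importance ranking, partial dependence, and test-set errors) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and runs on synthetic data; the feature names, coefficients, sample size and hyperparameters are all hypothetical stand-ins and are not taken from the paper's bond sample or model settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (NOT the paper's sample); every number here is an assumption.
rng = np.random.default_rng(0)
n = 400
risk_free = rng.uniform(2.0, 4.0, n)   # macro level: risk-free rate, in %
rating = rng.integers(0, 4, n)         # micro level: 0 = highest rating, 3 = lowest
state_owned = rng.integers(0, 2, n)    # micro level: 1 = state-owned issuer
spread = (1.5 + 0.6 * rating - 0.4 * state_owned
          - 0.2 * (risk_free - 3.0) + rng.normal(0.0, 0.1, n))

feature_names = ["risk_free", "rating", "state_owned"]
X = np.column_stack([risk_free, rating, state_owned]).astype(float)
X_train, X_test, y_train, y_test = train_test_split(
    X, spread, test_size=0.25, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Four-fold cross-validated R^2 on the training set.
cv_r2 = cross_val_score(model, X_train, y_train, cv=4, scoring="r2")

# Fit on the full training set, then rank features by impurity-based importance.
model.fit(X_train, y_train)
importances = model.feature_importances_
ranking = [feature_names[i] for i in np.argsort(importances)[::-1]]

# Partial dependence of the predicted spread on the rating feature (column 1).
pd_rating = partial_dependence(
    model, X_train, features=[1], kind="average")["average"][0]

# Absolute and relative errors on the held-out test set.
pred = model.predict(X_test)
mae = np.mean(np.abs(pred - y_test))
mean_rel_err = np.mean(np.abs(pred - y_test) / np.abs(y_test))

print("CV R^2 per fold:", np.round(cv_r2, 3))
print("Importance ranking:", ranking)
print("Partial dependence on rating:", np.round(pd_rating, 3))
print("Test MAE:", round(float(mae), 3), "mean relative error:", round(float(mean_rel_err), 3))
```

In this sketch the rating variable is built to dominate the synthetic spread, so it should rank first; with real bond data the ranking depends entirely on the sample. Impurity-based importances can also favor variables with many distinct values, so permutation importance is a common cross-check.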