| In recent years, the frequent occurrence of credit risk events and financial statement fraud in China’s financial market has attracted close attention from regulators and academia. For example, Baoshang Bank went bankrupt due to serious credit risk, and listed companies in the Chinese A-share market such as Kangdexin, Kangmeiyaoye, and Zhangzidao were investigated by regulatory authorities for financial statement fraud. These major events have had a severe negative impact on the market. Potential credit risk and financial statement fraud have become a major hidden danger to China’s economic development. Since 2022, warning signs of credit risk and financial fraud have continued to appear. According to statistics from the China Banking and Insurance Regulatory Commission, as of the end of 2022 the balance of non-performing loans in China had reached 3.8 trillion Chinese Yuan (CNY), an increase of 169.9 billion CNY from the beginning of the year. The continuous exposure of credit risk in some regions urgently needs to be reduced; for example, the non-performing loan rate of commercial banks in Hainan Province had reached 5.39% at the end of 2022. To help commercial banks assess credit risk accurately, in February 2023 the China Banking and Insurance Regulatory Commission and the People’s Bank of China jointly issued the “Commercial Bank Financial Asset Risk Classification Method”, which poses new challenges to the credit risk assessment capabilities of commercial banks. In addition, among the 20 typical illegal cases of securities inspection announced by the China Securities Regulatory Commission in 2022, listed companies such as Tongjitang, Yujingangshi, Jinzhengda, Shenglijingmi, and *ST Xinyi were cited for inflating revenue, inflating profits, and committing financial statement fraud continuously over many years. Against the background of continuous exposure of credit risk and frequent occurrences of financial statement fraud
events, how to accurately identify and assess credit risk and financial statement fraud and give early warnings is a major challenge currently faced by regulators and market participants. Moreover, credit risk and the risk of financial statement fraud are fundamentally interconnected. Financial statement fraud signals pronounced financial instability within an organization and can act as a critical indicator of impending credit risk. Furthermore, the risk associated with financial statement fraud can spread across credit networks, amplifying the potential for escalated, widespread credit risk. Identifying and assessing credit risk and financial statement fraud based on multidimensional, detailed, and behavioral data has gradually become a trend in recent studies. Such data are usually high-dimensional and have complex features (coexistence of static and temporal features, a high proportion of noise, etc.), which pose challenges to traditional risk assessment models. Machine learning offers a better way to address these challenges and has been widely used in finance. To this end, this dissertation uses machine learning techniques to build new credit risk and financial fraud risk assessment models based on the characteristics of the relevant credit and financial datasets. The specific research content and conclusions are as follows. For credit risk assessment, this dissertation innovates along two dimensions. First, this dissertation proposes a Discrete Wavelet Transform based Long Short-Term Memory (DWT-LSTM) neural network, which jointly models temporal and static features in multiple scenarios. The wavelet transform in the model extracts the key features and patterns in the borrower’s behavior sequence, allowing the model to distinguish effectively between default and non-default samples and to efficiently
assess the probability of loan default. This dissertation focuses on personal mortgage loans, which account for a large proportion of commercial bank business. Several consumer financial behavior records, such as credit repayment time series and financial holding time series, are integrated to empirically investigate the impact of different consumer behaviors on loan default. The empirical results show that the proposed DWT-LSTM model performs better than traditional statistical models. By taking advantage of time series data, predictive accuracy is increased by 20% compared to the baselines. Second, based on a two-stage modeling framework, this dissertation constructs a two-split model and a three-split model, and compares the applicability of parallel and serial integration strategies for their sub-models in the recovery rate prediction scenario. This dissertation finds that the distribution of the loan recovery rate is bimodal: the recovery rate is concentrated in the two tails (full recovery and zero recovery), while the middle region (partial recovery) is relatively flat, and the three types of samples have different features influencing the recovery rate. Therefore, the three-split model fits the true distribution of the recovery rate better than the two-split model and shows better prediction performance both in-sample and out-of-sample. In addition, when integrating the sub-models, parallel integration performs better than serial integration. For financial statement fraud, this dissertation introduces an ensemble learning strategy to construct a financial statement fraud assessment model based on the random forest algorithm. Through its Bagging strategy of parallel integration, the model can better capture the characteristics of the financial data of Chinese A-share listed companies and effectively smooth the negative impact of noisy data, thereby improving the prediction of financial statement fraud. At the same time, this dissertation uses the
financial statement data of Chinese A-share listed companies from 2012 to 2022 for an empirical study, and finds that the random forest algorithm has better predictive power for financial statement fraud than the four algorithms based on the Boosting strategy. In addition, this dissertation analyzes the changes in the methods of financial statement fraud used by listed companies before and after the outbreak of the COVID-19 pandemic. It finds that since the pandemic, under the pressure of the economic downturn, some listed companies have faced major business problems and are more inclined to commit financial statement fraud by manipulating their operating quality, including inflating income or profit, reducing reserves, borrowing heavily, increasing assets, and increasing receivables. To summarize, this dissertation builds new models for credit risk and financial fraud risk assessment based on the characteristics of detailed personal credit data, corporate leveraged loan recovery rate data, and listed company financial datasets, and conducts corresponding empirical research. The results can provide a reference for regulators, financial institutions, and market investors in identifying, assessing, and giving early warning of the corresponding risks. |
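
The wavelet step of the DWT-LSTM model described above can be illustrated with a minimal sketch. The abstract does not state which wavelet family, decomposition depth, or padding rule is used, so the Haar wavelet, the two-level decomposition, the function names, and the repayment values below are all assumptions chosen for illustration; the LSTM itself is not shown.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT: return (approximation, detail) coefficients."""
    values = list(signal)
    if len(values) % 2 != 0:
        # Assumption: pad odd-length sequences by repeating the last value.
        values.append(values[-1])
    scale = math.sqrt(2.0)
    # Pairwise scaled sums give the smooth trend; scaled differences give local change.
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition: repeatedly transform the approximation."""
    approx = list(signal)
    details = []  # details[0] is the finest scale
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return approx, details


# Hypothetical monthly repayment amounts for one borrower (illustrative values only).
repayments = [100.0, 100.0, 100.0, 100.0, 100.0, 40.0, 0.0, 0.0]
approx, details = haar_wavedec(repayments, levels=2)
```

In this toy sequence the finest-scale detail coefficients are zero while repayments are steady and become non-zero at the pair of months where the payment drops. Coefficients of this kind could be passed to a sequence model as features, although the abstract does not specify how the dissertation does so.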