Research On Financial Fraud Identification Of Listed Companies In My Country Based On Ensemble Learning

Posted on: 2024-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: J W Jiang
Full Text: PDF
GTID: 2569307067478144
Subject: Statistics
Abstract/Summary:
The Chinese economy is moving towards high-quality development, which places higher requirements on China's capital market. In recent years, however, financial fraud has continued to occur among Chinese listed companies, seriously impeding that development. It therefore remains crucial to explore methods that can effectively detect financial fraud by listed companies. Drawing on the relevant theories of financial fraud, this paper aims to build a model that accurately detects the fraudulent behavior of listed companies through ensemble learning methods, so as to help prevent such fraud and foster a healthy capital market environment.

First, this paper focuses on A-share listed companies in the Shanghai and Shenzhen stock markets from 2012 to 2021 and studies the characteristics of financial fraud from four aspects, including annual distribution and industry distribution. Then, based on financial indicators and financial statement indicators, it designs industry deviation indicators, growth rate indicators and ratio indicators, and also takes into account accounting information quality indicators and non-financial indicators, to construct an indicator system for financial fraud identification. After imputing the missing data and conducting significance tests and correlation analysis on the indicators, two feature subsets are obtained by applying the Boruta algorithm and the Lasso algorithm for feature selection, respectively.

Next, on each of the two feature subsets, single identification models are constructed with three ensemble learning algorithms: Random Forest, GBDT and XGBoost. To further optimize the model and improve its ability to identify fraudulent companies, this paper uses the three pre-built single models as the base models, and a Logistic regression model and a Support vector machine model as the secondary models, to construct fusion models with the Stacking ensemble strategy on each of the two feature subsets. Finally, this paper compares the identification performance of the various models and verifies the effectiveness of the optimal model.

The results of this paper show that:
(1) In terms of the characteristics of financial fraud in listed companies: there is a significant delay in detecting financial fraud. Industries with a high incidence of fraud are concentrated in manufacturing and in information transmission, software and information technology services. In addition, the fraudulent means used by listed companies are becoming more insidious. Accounting information disclosure fraud accounts for the majority of cases, and multiple types of fraud very often occur together.
(2) The identification performance of each single model based on the feature subset from the Boruta algorithm is superior to that based on the Lasso algorithm. That is to say, the Boruta algorithm is better able to extract useful information from the indicator system after the industry deviation indicators, growth rate indicators and accounting information quality indicators are added.
(3) The three single models all show good recognition performance: the identification rate of fraudulent companies stays above 70%, and the AUC value is around 0.8.
(4) The Stacking-Logistic model significantly improves the identification rate of fraudulent companies while enhancing the overall performance of the model. The optimal model obtained in this paper is the combination of the feature subset from the Boruta algorithm and the Stacking-Logistic model. Its recall rate reaches 80.23% and its AUC value reaches 0.8521, so it can effectively identify fraudulent companies. This also demonstrates that the more comprehensive indicator system constructed in this paper has value for research on financial fraud identification.
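The abstract evaluates its models by the recall rate on fraudulent companies and by the AUC value. As a minimal illustration of how these two metrics are computed, the sketch below uses only the Python standard library. The labels, scores and the 0.5 decision threshold are invented toy values, not data or settings from the thesis, and the function names `recall` and `auc` are hypothetical helpers introduced here for illustration.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of truly fraudulent firms (label 1) that the model flags as fraudulent."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    if actual_pos == 0:
        raise ValueError("recall is undefined without positive samples")
    return true_pos / actual_pos


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random fraudulent firm receives a higher
    score than a random non-fraudulent firm; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration): 1 = fraudulent, 0 = non-fraudulent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"recall = {recall(y_true, y_pred):.4f}")
print(f"AUC    = {auc(y_true, scores):.4f}")
```

Recall depends on the chosen threshold, while AUC is computed from the score ranking alone, which is presumably why the abstract reports both: one measures how many fraudulent companies are caught at the operating point, the other the overall separating power of the model.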
Keywords/Search Tags:Financial fraud, Ensemble learning, Identification model, Fusion model