As securities market products have grown in number and variety, listing or issuing bonds on the securities market has become a common way for enterprises to raise funds. Because information is asymmetric between enterprises and investors, financial statement fraud can seriously distort investors' investment decisions. Fraudulent companies may now engage in long-term, systematic financial fraud whose methods are complex, covert, and constantly evolving, and the penalties imposed by regulatory agencies in China tend to lag behind these increasingly sophisticated schemes. It is therefore of great significance to construct a financial fraud identification model that evaluates the financial fraud risk of listed companies comprehensively, accurately, and efficiently. Most existing studies build such models on feature variables selected in advance: the input features are chosen according to experts' prior knowledge of which accounting items in the financial statements are likely to be involved in fraud. When a fraud scheme falls outside that knowledge, identification accuracy may decline. In addition, most previous studies of financial fraud at listed companies, both domestic and foreign, follow an empirical design that matches fraud samples with non-fraud samples. This design ignores the scarcity of fraud samples in practice, and the matching or sampling step is subjective to some degree, so the prediction results may be affected by the prior probability implied by the sample. Therefore, this article reviews the existing research literature on financial fraud, draws on the input indicators of the financial fraud identification models proposed in domestic and foreign research, and proposes
to use all accounting items from the three financial statements of domestic listed companies to construct a financial fraud identification index system for predicting and analyzing the financial fraud risk of listed companies, so as to make maximal use of the accounting information of listed companies in China when identifying financial fraud and to reduce the sensitivity of the identification model to experts' prior analysis. In addition, given the scarcity of financial fraud samples and the internal correlation among financial statement indicators, this paper selects an ensemble learning model that can handle class imbalance and perform feature selection at the same time to identify financial fraud at listed companies in China. Combined with an intertemporal identification design and an analysis of feature importance, this yields a financial fraud identification model suited to analyzing the linkages among the financial indicators in the financial statements of listed companies in China. Based on the theoretical and empirical results, the main conclusions of this article are as follows: (1) Using all accounting items from the three financial statements as input indicators significantly improves the performance of the financial fraud identification model and strengthens its ability to distinguish fraud samples from non-fraud samples. This improvement is consistent and stable across the classifier models tested and does not depend on the choice of classifier. (2) Compared with the classic binary classification models Logit and SVM, the ensemble learning models perform better in financial fraud identification. Among the four ensemble models used in this article, the XGBoost model has the best performance in identifying financial fraud
on the full set of financial statement indicators. (3) Compared with income statement and cash flow statement accounts, balance sheet accounts contribute more to the accuracy of financial fraud identification for listed companies. Among the specific items of the three statements, five indicators are more important in the financial fraud identification model and differ with statistical significance between fraud and non-fraud samples: basic earnings per share; prepayments; total assets; dividends, profits, or interest paid by subsidiaries to minority shareholders; and employee compensation payable.
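
The intertemporal design and the treatment of scarce fraud samples summarized above can be illustrated with a minimal sketch. It is not the article's implementation: the record layout, the function names `intertemporal_split` and `positive_class_weight`, the cutoff year, and the toy firm-year data are all assumptions made for illustration. The idea is to train on earlier fiscal years and evaluate on later ones, and to keep every non-fraud sample while weighting the rare fraud class, rather than matching fraud and non-fraud samples one to one.

```python
from collections import Counter


def intertemporal_split(records, cutoff_year):
    """Train on fiscal years up to cutoff_year, evaluate on later years."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test = [r for r in records if r["year"] > cutoff_year]
    return train, test


def positive_class_weight(records):
    """Ratio of non-fraud to fraud samples, usable as a weight for the rare class."""
    counts = Counter(r["fraud"] for r in records)
    if counts[1] == 0:
        raise ValueError("no fraud samples in the training window")
    return counts[0] / counts[1]


# Hypothetical firm-year records; labels are invented for illustration only.
records = [
    {"firm": "A", "year": 2015, "fraud": 0},
    {"firm": "B", "year": 2015, "fraud": 0},
    {"firm": "C", "year": 2016, "fraud": 0},
    {"firm": "D", "year": 2016, "fraud": 1},
    {"firm": "A", "year": 2016, "fraud": 0},
    {"firm": "B", "year": 2017, "fraud": 0},
    {"firm": "C", "year": 2017, "fraud": 0},
    {"firm": "E", "year": 2017, "fraud": 0},
    {"firm": "A", "year": 2018, "fraud": 0},
    {"firm": "D", "year": 2018, "fraud": 1},
    {"firm": "B", "year": 2019, "fraud": 0},
    {"firm": "E", "year": 2019, "fraud": 0},
]

train, test = intertemporal_split(records, cutoff_year=2017)
weight = positive_class_weight(train)
print(len(train), len(test), weight)
```

A ratio of this kind is one common choice for a rare-class weight, for example as the value of XGBoost's `scale_pos_weight` parameter; whether the article sets it this way is not stated in the abstract.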