
Research On The Application Of Data Mining Technology In Financial Fraud Audit

Posted on: 2021-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: X P Li
Full Text: PDF
GTID: 2518306461973839
Subject: Business Statistics
Abstract/Summary:
With the advent of the big data era, the daily operations and management of enterprises are increasingly integrated with computer systems. The information environment of the risk-based audit mode has changed greatly, and the enterprise information that auditors face differs from the past in both volume and dimension. Identifying, evaluating and responding to enterprise fraud risk demands ever greater professional competence from auditors, and the risk of audit failure is rising. However, the big data era also brings new technologies and methods to the audit of financial fraud. To improve auditors' ability to identify fraud risk in financial statement audits, to strengthen trust among enterprises, regulators and capital market participants, and to give full play to the core role of the capital market in the effective allocation of resources, the main work of this paper consists of the following three parts.

First, the financial fraud pattern, financial fraud identification and data mining technology are introduced. The financial fraud pattern is a qualitative study of the definition, motivation and characteristics of financial fraud. Financial fraud identification draws on statistics, machine learning, data mining and related theories to identify financial fraud in enterprises. Data mining is the process of finding useful patterns and trends in large data sets; many scholars have applied it to financial fraud identification with good results.

Second, the empirical data needed to build the financial fraud identification model are prepared, including the collection of fraudulent and non-fraudulent enterprise samples and the construction of the fraud identification feature set. The fraudulent sample consists of 158 listed companies that committed one of three kinds of violations in 2000-2018: fictitious profits, false assets and improper general accounting treatment. The non-fraudulent sample consists of 158 listed companies that are in the same industry as the fraudulent enterprises and have the most similar total assets in the two years before and after the fraud year. The fraud identification feature set builds on the research on financial fraud patterns and financial fraud identification in the first part. The primary feature set is first selected according to financial fraud motivation theory, and the original feature set is then obtained by the Mann-Whitney test. The final fraud identification features are selected from the original feature set using the Relief and Boruta algorithms.

Finally, financial statement fraud identification models for listed companies are built on decision tree, logistic regression, support vector machine and random forest. The fraud and non-fraud samples are fed into each of the four models using, in turn, the original feature set, the feature set selected by the Boruta algorithm and the feature set selected by the Relief algorithm. The results show that the combination of the Relief-selected feature set and the random forest model identifies fraud best, with overall evaluation indexes G and F of 75.88% and 78.25% respectively.

From the fraud identification feature set selected by the Relief algorithm, the following conclusions can be drawn:
(1) Fraudulent enterprises have weak solvency, high debt risk and a strong willingness to raise financing, and their cash flow from operating activities is lower than that of normal enterprises in the same industry. Financial indicators such as the cash flow ratio, equity multiplier and net cash flow from operating activities per share are good fraud identification features.
(2) Fraudulent enterprises have poor asset quality and slow asset turnover, and the profitability and growth capacity of their assets are lower than those of normal enterprises in the same industry. Financial indicators such as the inventory turnover rate, accounts receivable turnover rate, return on net assets, growth rate of return on net assets and sustainable growth rate are good fraud identification features.
(3) The costs and expenses of fraudulent enterprises grow faster than those of normal enterprises in the same industry, and their comprehensive tax burden is lower than that of other normal enterprises. Financial indicators such as the cost profit rate, asset impairment loss rate, sales expense growth rate and comprehensive tax rate are good fraud identification features.
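
The abstract names the Relief algorithm and the G and F evaluation indexes without giving their formulas. The following is a minimal Python sketch of what these steps might look like, not the thesis's own implementation. It assumes a basic two-class Relief variant with range-scaled absolute differences and one nearest hit and one nearest miss per sample, that G is the geometric mean of sensitivity and specificity, and that F is the harmonic mean of precision and recall with fraud (label 1) as the positive class. The function names and the toy data are hypothetical.

```python
import math


def relief_weights(X, y):
    """Basic two-class Relief (assumed variant): a feature gains weight when
    it differs on the nearest other-class sample (miss) and loses weight when
    it differs on the nearest same-class sample (hit)."""
    n, d = len(X), len(X[0])
    # Per-feature range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):  # visit every sample rather than a random subset
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return w


def g_and_f(y_true, y_pred):
    """G-mean and F-measure under the assumed definitions (fraud = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g = math.sqrt(recall * specificity)
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return g, f


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X_toy = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [1.0, 2.0]]
y_toy = [1, 1, 0, 0]
weights = relief_weights(X_toy, y_toy)

# Hypothetical predictions for eight companies (1 = fraud, 0 = non-fraud).
g, f = g_and_f([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1])
```

On the toy data, the separating feature receives a larger Relief weight than the noise feature, which is the property a feature selection step would rely on when ranking financial indicators. Because the fraud and non-fraud samples are matched one to one (158 each), G and F under these assumed definitions are not distorted by class imbalance.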
Keywords/Search Tags:Data Mining, Financial Fraud Identification, Feature Selection, Random Forest