Research On Identification Of Financial Fraud By Integrating Latent Semantic Features Of Annual Report Text With Accounting Indicators

Posted on:2024-09-22

Degree:Master

Type:Thesis

Country:China

Candidate:C C Cheng

Full Text:PDF

GTID:2569307052471584

Subject:Management Science and Engineering

Abstract/Summary:

In recent years, there have been numerous cases of financial fraud by listed companies in China and abroad, such as Xin Jiang Ready, Kingenta’s "¥20 billion in inflated revenue", and Colin’s "fancy shell", as well as US companies such as World Telecom and Wells Fargo, which were involved in fraud and on the verge of bankruptcy. Fraud by listed companies not only undermines the credibility of information disclosed to the market, but also seriously damages the legitimate rights and interests of investors, causes wealth losses for society as a whole, and hinders the sustainable and healthy development of the capital market. How to identify and prevent corporate financial fraud is therefore a pressing concern for both industry and academia. As listed companies diversify their businesses, increasingly complex accounting has made fraud more concealed and manipulation methods more sophisticated, so fraud is difficult to detect and prevent comprehensively using quantitative financial information alone.

This paper first takes the annual reports issued by listed companies from 2001 to 2020 as its textual data source, extracts the latent semantics contained in the annual report text through LDA topic modelling, and constructs an econometric model to verify the relationship between fraud manipulation and the latent semantic features of annual reports. It then integrates accounting indicators with the latent semantic features and the textual linguistic features of annual reports to form a new feature set. Finally, a Stacking ensemble learning model, which combines linear and tree-based models as well as single and combined classifiers, is constructed and compared with other models to analyse the recognition performance of each. The accuracy of the Stacking-based classification model is higher than that of the other models.

The results of the study show that: (1) When companies engage in financial fraud, they strategically manipulate the text: the overall volume of risk-related description remains roughly unchanged, but descriptions of idiosyncratic risks decrease and descriptions of non-idiosyncratic risks increase in relative terms. (2) The latent semantics of the annual report provide more information than the linguistic features of the text when identifying financial fraud. (3) The latent semantic features embedded in annual report text provide more incremental information than MD&A text. (4) The Stacking ensemble learning model is significantly more effective than other classifiers.
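The pipeline described above (LDA topic features fused with accounting indicators, then a Stacking classifier over a linear and a tree-based learner) can be sketched roughly as follows. This is only a minimal illustration, not the thesis's implementation: the use of scikit-learn, the choice of logistic regression and random forest as base learners, the number of topics, and the toy texts, labels, and accounting values are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression


def topic_features(texts, n_topics=3, seed=0):
    """Return one row of LDA topic proportions per document."""
    counts = CountVectorizer().fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    return lda.fit_transform(counts)


def build_stacking_model(seed=0):
    """Stack a linear and a tree-based learner under a logistic meta-learner."""
    base_learners = [
        ("linear", LogisticRegression(max_iter=1000)),
        ("forest", RandomForestClassifier(n_estimators=50, random_state=seed)),
    ]
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(),
        cv=3,  # out-of-fold base predictions feed the meta-learner
    )


# Toy, made-up data standing in for annual report text (assumption).
specific = "our plant faces supplier concentration risk and customer litigation risk"
generic = "macroeconomic conditions and industry competition and policy changes may affect results"
texts = [specific if i % 2 == 0 else generic for i in range(12)]
labels = np.array([i % 2 for i in range(12)])  # 1 = fraud (hypothetical labels)

# Hypothetical accounting indicators, e.g. two financial ratios per firm-year.
rng = np.random.default_rng(0)
accounting = rng.normal(size=(12, 2))

# Feature fusion: latent semantic (topic) features next to accounting indicators.
topics = topic_features(texts)
features = np.hstack([topics, accounting])

model = build_stacking_model()
model.fit(features, labels)
preds = model.predict(features)
```

In a real study the model would be evaluated on held-out firm-years rather than on its own training data, and the topic count and base learners would be tuned; the sketch only shows how the pieces fit together.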

Keywords/Search Tags:

Financial fraud, Text mining, Topic extraction, LDA, Stacking

Related items

1	Research On Financial Fraud Recognition Of Listed Companies Introducing MD&A Text Information
2	The Topic Mining And Its Application Of Stock Investment Based On Guba Text
3	Research On Financial Report Fraud Identification And Prediction Model Of Listed Companies Based On Stacking
4	Online Recruitment Information Analysis Based On Text Mining
5	Research On Breaking Topic Detection Technology For Food Safety Topics
6	Research On Extraction And Evolution Of Hot Topics In Scientific Literature
7	Research On Information Extraction Technology And Its Application In The Financial Field
8	Research On The Comprehensive Evaluation System Of Popular Scenic Spots In Shanxi Province Based On LDA Topic Model
9	Research On User Reviews Of TWS Headset Based On Text Mining
10	Research On The Differences Of Topics Of Foreign Consumers’ Online Reviews Based On Text Mining In The Context Of Cross-border E-commerce