Design And Development Of A Text Classification System For Electric Power Audit Issues Based On Semantic Matching

Posted on:2024-02-07

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Feng

GTID:2542307106989999

Subject:Computer technology

Abstract/Summary:

Auditing is an independent evaluation of a company's financial condition and operational activities. Through auditing, problems and risks can be identified and improvements suggested, which helps companies manage risk and increase business efficiency and effectiveness, thereby enhancing their competitiveness and sustainability. The internal audit targets of power grid enterprises include financial statements, internal control, business operations, and risk management, among others. Auditors must examine these areas manually to identify and record problems and anomalies. They then summarize and classify the problems by nature, scope, cause, and remedy, identify the common causes behind each class of problem, and provide targeted recommendations and improvement measures, which improves the quality and efficiency of audit work.

The traditional method of classifying power audit problems relies mainly on the personal experience and ability of auditors. Differences in individual judgment often lead to irregular classification and inconsistent results, which reduces the efficiency and accuracy of audit work. Solving these problems requires a standardized, unified tag library of audit problem classifications that can serve as a standard reference. Based on this tag library, text classification technology can be used to characterize and classify the problems found in power audits effectively and uniformly. Both traditional machine learning text classification algorithms and current deep learning or pre-trained language model approaches require a certain number of training samples. However, power audit problem texts are specialized and complex, so they must be annotated by professional auditors, and labeling a large volume of data is costly. Moreover, because audit problem texts contain sensitive company information, many companies and organizations are unwilling to share this data, making it difficult to obtain comprehensive and sufficient audit problem text data.

To address the shortage of training samples, this thesis designs a power audit problem text classification model based on semantic matching and the strong semantic representation ability of pre-trained language models, and builds a classification system on top of that model. On the one hand, the classification system helps auditors standardize and organize historical power audit problem texts and build a comprehensive, rich dataset of such texts, laying the foundation for training high-precision classification models. On the other hand, it standardizes the classification of newly added power audit problem texts, reducing the subjectivity and inconsistency of power audit problem classification.

First, to address the shortage of training samples and to improve the accuracy of semantic matching, we design a weighted cross-matching model and select the ROM Chinese semantic relevance model. The two models are used to organize historical power audit problem texts and to classify newly added power audit problem texts. The weighted cross-matching model cross-matches the hierarchical levels of the subjective qualitative tags that auditors assigned to historical data against the hierarchical levels of the standard classification tags in the library, and assigns higher weights to matches at deeper levels, thus reducing the impact of semantic inconsistency between the two types of tags. The ROM model semantically matches short classification tags against long audit problem texts; its training on the Baidu search collection and its masking strategy, which takes word weights into account, are used to reduce the impact of length and semantic differences on semantic matching.

Second, based on the above models, this thesis uses the Vue and Spring Boot frameworks to design and implement a text classification system for power audit problems. Auditors can upload and manage power audit problem texts in the system and query the standard classification labels that correspond to a problem text. When confirming a classification label, auditors can view the power audit problem texts already filed under that standard label to assist their judgment. When the system classifies a text incorrectly, auditors can submit feedback on the problem text, and an administrator then confirms the classification label proposed in the feedback. The system also computes and visualizes several statistical indicators, giving a clearer view of the size and category distribution of the power audit problem text dataset and of the system's classification performance.

Finally, we tested the classification models designed in this thesis on the audit problem summary table of a power company under the State Grid. The experiments verified the effectiveness of the two models and their good accuracy. Functional testing of the entire power audit problem text classification system showed that the system meets the requirements for practical use.
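
The weighted cross-matching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract gives no formula, so the scoring rule (best cross-level match per standard level, weighted by that level's depth), the function names, and the use of `difflib.SequenceMatcher` as a stand-in for a semantic similarity model are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on character overlap.

    Assumption: a real system would use a semantic model here instead.
    """
    return SequenceMatcher(None, a, b).ratio()


def weighted_cross_match(auditor_path, standard_path, sim=similarity):
    """Score an auditor's hierarchical tag against a standard hierarchical tag.

    Each level of the standard tag is compared with every level of the
    auditor's tag (the cross match) and keeps its best similarity. Levels
    are weighted by their 1-based depth, so deeper matches count more.
    The linear depth weights are a hypothetical choice.
    """
    if not auditor_path or not standard_path:
        return 0.0
    total = 0.0
    weight_sum = 0.0
    for depth, standard_tag in enumerate(standard_path, start=1):
        best = max(sim(auditor_tag, standard_tag) for auditor_tag in auditor_path)
        total += depth * best
        weight_sum += depth
    return total / weight_sum


def best_standard_label(auditor_path, tag_library, sim=similarity):
    """Return the (standard_path, score) pair with the highest score."""
    scored = [(path, weighted_cross_match(auditor_path, path, sim))
              for path in tag_library]
    return max(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical tag library and auditor tag, invented for this example.
    library = [
        ("finance management", "travel expense reimbursement"),
        ("procurement", "bidding irregularities"),
    ]
    auditor_tag = ("finance", "travel expense overclaim")
    path, score = best_standard_label(auditor_tag, library)
    print(path, round(score, 3))
```

Taking the best match per standard level lets the two hierarchies differ in depth or ordering, which is one plausible way to tolerate auditors' tags that do not line up level by level with the standard library.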

Keywords/Search Tags:

power audit, audit problem text, text classification, semantic matching

Related items

1	Research On The Evaluation Of Electric Power Construction Project Based On Intelligent Audit
2	Defect Analysis Of Power Equipment Based On Text Mining Technology
3	Research On Intelligent Recognition And Application Of Alarm Information In Power Grid Based On Text Mining
4	Water Conservancy Text Classification Model Based On LSTM And K-means Clustering
5	A Case Study Of Sinovel Wind Power Audit Based On Risk-oriented Audit
6	The Research On The Integration Audit Of Da Hua Certified Public Accountants To JIANGXI LIANCHUANG Optoelectronic Technology Co., Ltd
7	Multi-label Classification Of Power Defect Text Based On Deep Learning And Normative Evaluation
8	A Case Study Of IPO Audit Failures Based On The Revised G&B Audit Conflict Model
9	Research On Application Of Digital Audit In H Power Company
10	A Classification Model Of Power Equipment Defect Record Texts Based On Multi-head Attention RCNN Network