Research On Text Classification Method Of Quality Risk For Customs Home Appliances Based On Deep Learning

Posted on:2024-09-22

Degree:Master

Type:Thesis

Country:China

Candidate:Z W Cai

Full Text:PDF

GTID:2542307127954639

Subject:Electronic information

Abstract/Summary:

With the growth of global trade in household appliances, product quality has received widespread attention. However, the constant stream of quality evaluation information about household appliances on the Internet makes it difficult for customs officials to grasp the categories and levels of quality risk in the household appliance market, and therefore to supervise the relevant manufacturers effectively. Automatically collecting household appliance quality risk information from the Internet and classifying these texts with the strong text-understanding capabilities of current large-scale language models is expected, on the one hand, to improve the efficiency and accuracy of customs officials’ review work and, on the other hand, to provide strong support for efficient customs supervision of household appliance quality and safety.

This thesis focuses on the difficulties and limitations of applying large-scale language models to customs household appliance quality assessment. Specifically, it addresses large model size, low operational efficiency, high cost, and limited robustness, and proposes algorithms for these practical issues. A customs household appliance quality risk assessment system is then designed so that household appliance quality risk information can be collected from the Internet and classified automatically. This is expected to improve the efficiency and accuracy of customs officials’ reviews, reduce manual review costs, and provide strong support for customs in formulating more efficient policies for household appliance quality and safety supervision. The main work of this study includes:

1. To address the poor robustness and vulnerability to attacks of large-scale language models in practical applications, a new model called AT-NEZHA, based on adversarial training and the NEZHA model, is proposed. The model applies an adversarial training strategy at the embedding layer of the language model so that training yields more robust parameters. To better handle the linguistic characteristics of Chinese text, such as word combinations and grammatical structures, the NEZHA model, which is pre-trained specifically for Chinese, is adopted. Experiments are conducted on both the enterprise security risk classification dataset and the customs household appliance quality risk dataset. The results show that the F1 score of the AT-NEZHA model reaches 96.45%, which is 2.19% higher than traditional text classification methods and the baseline model (BERT) for Chinese text classification, demonstrating the effectiveness of the AT-NEZHA model.

2. To address the long training time, increased complexity, and high operational costs of large pre-trained models, as well as the difficulties in practical applications caused by massive datasets, a new accelerated model called AT-KeeBERT is proposed, based on self-distillation, a dynamic early exit strategy, and adversarial training. The model is compressed by distilling its own network, and the dynamic early exit strategy allows simple samples to exit adaptively from shallow layers of the network, reducing computational overhead. An adversarial training algorithm is then incorporated to enhance the model’s generalization ability. The method is validated on three relevant datasets in the field as well as on the customs household appliance quality risk dataset. Compared with the baseline model BERT for text classification, AT-KeeBERT achieves an over 4-fold improvement in operational efficiency with minimal performance loss. The experimental results show that AT-KeeBERT preserves model performance to the greatest extent possible while accelerating the model.

3. To meet the application requirements of customs household appliance quality risk assessment, a simple customs household appliance quality risk assessment system was designed and implemented on top of the classification algorithm proposed in the preceding work, using Django, a Python web framework, as the underlying framework. The system consists of a frontend and a backend. The frontend mainly implements data collection and text annotation, while the backend implements text classification and data storage. In testing, the overall accuracy of the system reached 78.9%, with an average classification time of about 2.4 seconds per risk sample. Compared with manual risk classification, efficiency improved by about 6 times, which can speed up customs household appliance quality risk text classification and gives the system practical value.
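
The dynamic early exit idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exit criterion used by the thesis, so the entropy-based rule below is an assumption, as are all names (`classify_with_early_exit`, `normalized_entropy`, `threshold`) and the toy layers, which stand in for transformer layers with one small classifier head per layer.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs: Sequence[float]) -> float:
    """Entropy of a distribution scaled to [0, 1]; low means confident."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def classify_with_early_exit(
    features: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    exit_heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.3,  # hypothetical value, not taken from the thesis
) -> Tuple[int, int]:
    """Return (predicted_class, layers_used).

    After each layer, the matching exit head produces class logits. If the
    prediction is confident enough (assumed rule: normalized entropy below
    `threshold`), the remaining layers are skipped.
    """
    if not layers or len(layers) != len(exit_heads):
        raise ValueError("need one exit head per layer and at least one layer")
    hidden = features
    probs: Vector = []
    depth = 0
    for depth, (layer, head) in enumerate(zip(layers, exit_heads), start=1):
        hidden = layer(hidden)
        probs = softmax(head(hidden))
        if normalized_entropy(probs) < threshold:
            break  # confident: exit from this shallow layer
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, depth


# Toy stand-ins: each "layer" doubles the features, each "head" reads them
# directly as two-class logits.
toy_layers = [lambda h: [2.0 * x for x in h]] * 3
toy_heads = [lambda h: list(h)] * 3

easy = classify_with_early_exit([4.0, 0.0], toy_layers, toy_heads)
hard = classify_with_early_exit([0.1, 0.0], toy_layers, toy_heads)
```

In this toy setup the clearly separated sample exits after the first layer, while the ambiguous one passes through all three, which is the source of the compute savings on simple inputs. A real model would pair this with trained per-layer classifiers, for example ones obtained through self-distillation.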

Keywords/Search Tags:

deep learning, text mining, pre-trained model, adversarial training, model acceleration

Related items

1	Research On Analysis Method Of Unstructured Documents In Power Grid Based On Deep Learning
2	Text Mining Based Analysis For Construction Engineering Accident Reports
3	Research On Analysis And Mining Method Of Railway System Fault Text Data Based On Machine Learning
4	Research On Kansei Engineering For The Exterior Design Of New Energy Vehicle Based On User Comment Mining
5	Research On Transfer Fault Diagnosis Method Of Rolling Bearings Based On Deep Domain Adversarial Neural Network
6	Research On Patent Text Classification In Aerospace Field Based On Deep Learning
7	Study On Text Mining Based Fault Classification For Turnout
8	Research On User Comment Of Sweeping Robot Based On Text Mining
9	Research On Estimating Car-body Acceleration With Track Irregularity Based On Deep Learning
10	Research On Named Entity Recognition In Clock Domain Based On Deep Learning