
Research On Chinese Text Classification And Robustness Based On Attention Mechanism

Posted on: 2022-05-10  Degree: Master  Type: Thesis
Country: China  Candidate: C C Wang  Full Text: PDF
GTID: 2518306605465574  Subject: Computer Science and Technology
Abstract/Summary:
Text classification is one of the most fundamental prerequisite tasks in natural language processing: given pre-set classification criteria and categories, it produces classification predictions for the text fed into the model. Because the quality of text classification directly affects subsequent processing tasks, designing a highly reliable and robust text classification model is a meaningful research topic. The general pipeline first extracts features from the word embeddings obtained by mapping the text, then uses the resulting comprehensive semantic features as the text representation for classification. Traditional word embeddings such as Word2Vec cannot disambiguate words with multiple senses in the text representation, while feature extraction modules based on CNNs and LSTMs suffer from shortcomings such as ignoring sequence relations and losing information. In addition, existing text classification research mostly focuses on the accuracy of the classification model and pays little attention to its stability and robustness. This thesis therefore studies attention-based text classification and its robustness. The specific contributions are as follows:

(1) To improve classification accuracy, this thesis proposes ERNIE-DPCNN, a text classification algorithm that integrates the attention mechanism with a deep convolutional network. First, a feature extraction module based on multi-head self-attention obtains an enhanced vector representation of each word in the text in different semantic spaces and extracts information along different dimensions of the text. Then, a feature extraction module based on a deep convolutional network further distills the word-level semantics into the final representation vector of the text. To address the problem that a model trained with the cross-entropy loss overfits the text features when training samples are insufficient, which degrades generalization, a loss function based on label smoothing is introduced to optimize the model. Comparative experiments on the Toutiao and Sina News datasets show that the feature extraction module combining the attention mechanism and the deep convolutional network effectively improves classification accuracy, and that the label-smoothing loss effectively alleviates overfitting under small-sample training.

(2) To address the relatively low robustness of text classification models, this thesis proposes a scheme based on adversarial training. The scheme consists of three steps: model pre-training, adversarial attacking, and adversarial training. In the pre-training step, ERNIE-DPCNN, BERT-CNN, and BERT-LSTM are used. In the attacking step, the BERT Attack For Chinese (BAFC) algorithm is proposed to generate adversarial samples effectively: the algorithm first computes and ranks importance scores for the terms, then attacks the terms the model attends to most, and finally employs a Masked Language Model (MLM) to generate candidate replacements for the attacked terms. In the adversarial training step, the original samples and the adversarial samples are mixed for further adversarial training of the pre-trained model. Experiments on the Toutiao and Sina News datasets show that the BAFC algorithm effectively generates adversarial samples that attack the target model, and that adversarial training with these samples effectively improves the robustness of the text classification model.
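As a rough illustration of the architecture in (1), the sketch below feeds the hidden states of a pre-trained attention-based encoder into a DPCNN-style convolutional head in PyTorch. The class name AttnDPCNN, the encoder checkpoint, and all hyper-parameters are illustrative assumptions rather than the thesis's actual code; a full DPCNN uses separate weights per pyramid block, which this sketch shares for brevity.

import torch
import torch.nn as nn
from transformers import AutoModel

class AttnDPCNN(nn.Module):
    def __init__(self, encoder_name="nghuyong/ernie-3.0-base-zh",
                 num_classes=10, channels=250):
        super().__init__()
        # Multi-head self-attention encoder (ERNIE/BERT family).
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Project per-token representations into convolutional channels.
        self.region = nn.Conv1d(hidden, channels, kernel_size=3, padding=1)
        self.conv = nn.Sequential(
            nn.ReLU(), nn.Conv1d(channels, channels, 3, padding=1),
            nn.ReLU(), nn.Conv1d(channels, channels, 3, padding=1))
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq, hidden) -> (batch, hidden, seq) for Conv1d.
        h = self.encoder(input_ids, attention_mask=attention_mask
                         ).last_hidden_state.transpose(1, 2)
        x = self.region(h)
        x = x + self.conv(x)           # residual convolution block
        while x.size(-1) > 1:          # pyramid: halve length, then refine
            x = self.pool(x)
            x = x + self.conv(x)
        return self.fc(x.squeeze(-1))  # (batch, num_classes) logits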
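The label-smoothing loss mentioned in (1) replaces the one-hot target of cross entropy with a softened distribution, so the model is not pushed toward fully confident predictions on scarce training data. A minimal sketch, with an illustrative smoothing factor of 0.1:

import torch.nn.functional as F

def label_smoothing_loss(logits, target, eps=0.1):
    # Soft target: (1 - eps) on the true class, eps / K spread uniformly.
    log_probs = F.log_softmax(logits, dim=-1)
    loss = -(1 - eps) * log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    loss = loss - eps * log_probs.mean(dim=-1)  # eps/K * sum_k log p_k
    return loss.mean()

# Recent PyTorch exposes the same idea directly:
# torch.nn.CrossEntropyLoss(label_smoothing=0.1)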
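The BAFC steps in (2) can be sketched as follows, under stated assumptions: token importance is scored by the drop in the victim model's gold-label probability when the token is masked, and bert-base-chinese serves as the MLM proposing replacements. classify_prob is a hypothetical helper standing in for the victim classifier, and the confidence-below-0.5 stopping rule is a simplification; the thesis's exact scoring and stopping criteria may differ.

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-chinese")

def bafc_attack(tokens, label, classify_prob, top_k=5):
    base = classify_prob("".join(tokens), label)
    # Step 1: rank tokens by importance, i.e. the probability drop
    # on the gold label when that token is masked out.
    scores = []
    for i in range(len(tokens)):
        masked = "".join(tokens[:i]) + "[MASK]" + "".join(tokens[i+1:])
        scores.append((base - classify_prob(masked, label), i))
    # Step 2: attack the most important tokens first.
    for _, i in sorted(scores, reverse=True):
        masked = "".join(tokens[:i]) + "[MASK]" + "".join(tokens[i+1:])
        # Step 3: let the MLM propose candidate replacements.
        for cand in fill_mask(masked, top_k=top_k):
            trial = tokens[:i] + [cand["token_str"]] + tokens[i+1:]
            if classify_prob("".join(trial), label) < 0.5:
                return "".join(trial)  # successful adversarial sample
    return None  # attack failed on this input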
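Finally, the adversarial training step mixes the original and generated adversarial samples into one fine-tuning set. A minimal sketch, assuming a hypothetical encode() helper, a model and optimizer set up as in the classifier sketch, and batch size 1 for brevity:

import random
import torch

def adversarial_training(model, optimizer, clean_data, adv_data, encode,
                         epochs=3):
    loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
    mixed = clean_data + adv_data              # list of (text, label) pairs
    for _ in range(epochs):
        random.shuffle(mixed)                  # interleave clean/adversarial
        for text, label in mixed:
            input_ids, attention_mask = encode(text)
            logits = model(input_ids, attention_mask)
            loss = loss_fn(logits, torch.tensor([label]))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model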
Keywords/Search Tags: Text classification, Pre-Training, Attention Mechanism, Adversarial Training, Robustness