Text classification is the task of automatically categorizing and labeling texts according to certain rules and standards. In recent years, natural language processing has made remarkable progress with the support of deep learning technology. Text classification based on deep learning, on the one hand, continuously pursues improvements in metrics such as accuracy and recall; on the other hand, it raises higher requirements for model interpretability, robustness, parallelism, and training speed. Based on the attention mechanism, this thesis develops text classification models that use the Encoder as the basic framework. The main research contents and innovations of this thesis are as follows:

The text classification model Encoder_FC is proposed on the basis of self-attention. The RNN, which collects and propagates contextual information, is a classic model for natural language processing tasks; however, for text classification, and especially for long texts, its serial structure forces the model to run for a large number of time steps. This thesis introduces a self-attention algorithm to capture the contextual dependencies of the text. Encoder_FC reduces the number of required time steps through parallel computation and improves its compatibility with long texts. Further, this thesis introduces the idea of hierarchy, reducing the time and space complexity of the self-attention algorithm by cutting tensors and thereby improving training speed. Experimental results show that the above method is feasible on long-text classification tasks.

The Topic Attention mechanism and TAC (Topic Attention Classifier), a general text classification model with decision-interpretation ability, are proposed. Encoder_FC and B-En FC are based on the self-attention algorithm and fully connected layers, and self-attention is prone to degradation during training, so that a large number of parameters are implicitly discarded. Based on this finding, this thesis proposes
a more effective attention mechanism for classification, the Topic Attention mechanism. TAC establishes a separate switch for each minimal input unit to control the influence of different inputs on different categories, allowing the attention computation to focus on serving classification and to provide decision interpretation. This thesis proposes TA-Base, the basic model of TAC, which completes text classification with a small parameter space. When tested on the THUCNews Chinese text dataset, the model trains significantly faster than the comparison models, and its accuracy on the validation set is 83%. The TA-Base probability matrix can be visualized to observe the model's attention to each input during classification.

An Attention Talk Unit (ATU) is proposed to enhance the expressive power of Topic Attention. TA-Base focuses on discrete text content, which limits its performance in text classification. ATU helps the model collect more complete information by fitting multivariate distributions with multiple univariate distributions. This thesis designs three communication modes for ATU: adjacent communication, topic communication, and joint communication. Furthermore, this thesis proposes the TAC-V model based on adjacent communication and tests it on the THUCNews and IMDB datasets, where the model achieves good performance in accuracy, precision, recall, F1 score, training speed, and other indicators.
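The per-input "switch" idea behind Topic Attention can be illustrated with a minimal NumPy sketch. This is an illustrative assumption of how such gating might work, not the thesis's actual implementation; the weight names (`w_gate`, `w_class`) and shapes are hypothetical. Each token receives a gate in [0, 1] for every class, the gated per-token class evidence is pooled into class logits, and the gate matrix is the kind of probability matrix that can be visualized for decision interpretation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def topic_attention_logits(x, w_gate, w_class):
    """Hypothetical sketch of per-token, per-class gated classification.

    x:       (n_tokens, d)   token representations
    w_gate:  (d, n_classes)  produces one 'switch' per token and class
    w_class: (d, n_classes)  produces per-token class evidence
    """
    gates = sigmoid(x @ w_gate)        # (n, c): switches in [0, 1], visualizable
    contrib = (x @ w_class) * gates    # gated per-token class evidence
    logits = contrib.sum(axis=0)       # pool over tokens into class logits
    return logits, gates

rng = np.random.default_rng(1)
x = rng.normal(size=(6, 8))            # 6 tokens, 8-dim representations
logits, gates = topic_attention_logits(
    x, rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
)
print(logits.shape, gates.shape)       # (3,) (6, 3)
```

Inspecting `gates` row by row shows how strongly each input unit is allowed to influence each category, which is the interpretability property the abstract attributes to TAC.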