
History-based Attention In Seq2seq Model For Multi-label Text Classification

Posted on: 2021-09-19    Degree: Master    Type: Thesis
Country: China    Candidate: Y Li    Full Text: PDF
GTID: 2518306122464134    Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, a large amount of text data is generated online, and classifying these texts quickly has become an urgent problem. Traditional machine learning algorithms are limited in their ability to extract text features. In recent years, the rapid development of deep learning has made the extraction of text semantic information far more precise, laying a solid foundation for improving text classification performance. In multi-label text classification, a text can belong to several categories at once. At present, the Sequence-to-Sequence (Seq2Seq) model is commonly used for this task: an encoder module extracts text features, and a decoder then outputs the text's categories one by one. Compared with other deep neural network models, the Seq2Seq model's attention mechanism can highlight key information in the text and thereby improve classification performance. However, the current Seq2Seq model faces several problems in multi-label text classification. First, the classification results are prone to fall into the "label trap": the generated results are easily limited to a few mainstream labels while other labels are ignored. Second, the serialized output structure lets a wrong prediction propagate to subsequent classification results. Finally, the current Seq2Seq model is insufficient for mining associations between labels.

To address these defects, this paper explores a history-based attention mechanism to improve the performance of multi-label text classification. The system builds on the popular Seq2Seq model and improves its attention mechanism in the following ways:

1. Integrating the BERT model to handle polysemy. In the encoder of the Seq2Seq model, the word embedding layer is replaced with a pretrained BERT model, which resolves the polysemy problem that the traditional Word2vec model cannot handle and strengthens the model's understanding of semantic information.

2. Proposing a history-based context attention mechanism to overcome the shortcomings of the standard Seq2Seq attention mechanism. When assigning weights to words, the history-based context attention mechanism takes the weight change trend of each word into account and adjusts each word's weight according to this trend. This helps the model understand the semantic information of the text more accurately and keeps it from falling into the "label trap".

3. Learning the correlation between labels to guide classification. The decoder module of the Seq2Seq model is modified to introduce a history-based label attention mechanism that learns correlations between labels. At each decoding step, the history-based label attention mechanism dynamically extracts the label information produced so far and feeds it back into the model, mitigating "error propagation" and guiding the model toward more accurate predictions. A minimal sketch of both history-based mechanisms follows this list.
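The abstract does not give the exact formulas, so the following PyTorch sketch only illustrates one plausible reading of the two history-based mechanisms: attention scores over source words are adjusted by the recent change trend of their weights, and a second attention over already-emitted label embeddings feeds label history back into the decoder. The trend term, the parameter beta, and all shapes and names are illustrative assumptions, not the thesis' actual equations.

import torch
import torch.nn.functional as F

def history_context_attention(dec_state, enc_outputs, weight_history, beta=0.5):
    # Dot-product attention over source words whose scores are shifted by the
    # recent change trend of each word's weight (an assumed formulation).
    # dec_state:      (batch, hidden)          current decoder hidden state
    # enc_outputs:    (batch, src_len, hidden) encoder (e.g. BERT) outputs
    # weight_history: list of earlier attention distributions, each (batch, src_len)
    scores = torch.bmm(enc_outputs, dec_state.unsqueeze(2)).squeeze(2)
    if len(weight_history) >= 2:
        trend = weight_history[-1] - weight_history[-2]   # rising or falling word weights
        scores = scores + beta * trend
    weights = F.softmax(scores, dim=-1)
    context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
    return context, weights

def history_label_attention(dec_state, label_embeds, emitted_mask):
    # Attention over label embeddings restricted to labels emitted at earlier
    # steps, so previously predicted labels guide the current prediction.
    # label_embeds: (batch, n_labels, hidden); emitted_mask: (batch, n_labels) in {0, 1}
    scores = torch.bmm(label_embeds, dec_state.unsqueeze(2)).squeeze(2)
    scores = scores.masked_fill(emitted_mask == 0, float("-inf"))
    weights = torch.nan_to_num(F.softmax(scores, dim=-1), nan=0.0)  # no labels emitted yet -> zero context
    return torch.bmm(weights.unsqueeze(1), label_embeds).squeeze(1)

In a full decoder step, the two context vectors would typically be concatenated with the embedding of the previously predicted label and fed to the recurrent cell before projecting onto the label vocabulary.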
The experimental results show that the history-based attention mechanism brings a measurable improvement over the traditional attention mechanism: with unsorted labels, it raises the Micro-F1 score by 1.17% and lowers the Hamming loss by 5.95%. Furthermore, by integrating reinforcement learning and the pretrained BERT model, the proposed model also improves on current state-of-the-art methods, reaching a Micro-F1 score of 0.895 and reducing the Hamming loss to 0.68×10⁻².
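For reference, the two reported metrics can be computed with scikit-learn as below; the label matrices are illustrative placeholders, not data from the thesis.

import numpy as np
from sklearn.metrics import f1_score, hamming_loss

y_true = np.array([[1, 0, 1, 0],   # gold label sets, one multi-hot row per document
                   [0, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 0],   # predicted label sets
                   [0, 1, 0, 1]])

print("Micro-F1:", f1_score(y_true, y_pred, average="micro"))  # 2TP / (2TP + FP + FN) over all label bits
print("Hamming loss:", hamming_loss(y_true, y_pred))           # fraction of incorrectly predicted label bits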
Keywords/Search Tags: multi-label classification, deep neural network, Seq2Seq, attention mechanism