
Research on Text Classification Algorithms Fusing Label Information and Capsule Networks

Posted on: 2022-11-24    Degree: Master    Type: Thesis
Country: China    Candidate: T Q Lai    Full Text: PDF
GTID: 2518306782952269    Subject: Automation Technology
Abstract/Summary:
With the rapid development of Internet technology, network information is growing explosively. While users can obtain information easily, the value of that information cannot be guaranteed. Text classification provides a basis for extracting valuable information, so achieving efficient text classification is a meaningful task. Existing text classification methods are usually based on deep learning, but they often have shortcomings. When label information is not considered, the relationship between labels and text is ignored; when too much label information is introduced, excessive noise is brought into the model. In addition, current effective classification methods require a large amount of labeled data, and classification performance degrades when only a small number of labeled samples are available. In response to these problems, this thesis carries out the following work.

To exploit the influence of label information on text, this thesis proposes a label-aware multi-head attention network model (LMAN). The LMAN model incorporates label information into the text representation through label attention and a gating mechanism. Specifically, a Bi-LSTM and a multi-head attention mechanism are used to obtain local features of the text. An attention mechanism then computes the correlation between labels and text, producing a label-specific text representation for each label. Because irrelevant labels introduce noise during training, a relational gating mechanism is introduced to filter between the original text representation and the label-fused representation, yielding the final text representation used for classification. Experiments on the AAPD and Kan Shan datasets demonstrate that the LMAN model is well suited to multi-label text classification, with classification accuracy better than most baseline models.

To address the poor classification performance caused by having only a small amount of labeled data, this thesis proposes a few-shot text classification model fused with a capsule network (AFCN). Unlike traditional text representation methods, the AFCN model adopts a fine-grained representation, using word embeddings, part-of-speech embeddings and character embeddings to represent the text. A dual-layer attention mechanism is then added on top of a Bi-GRU network to capture rich feature information in the text. Finally, the dynamic routing module of the capsule network learns the mapping between text and classes, produces the output capsules and completes the classification. Experiments on the ARSC dataset show that the AFCN model effectively addresses few-shot classification and improves on the other baseline models compared.
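The label-attention and gating design described for LMAN can be illustrated with a small sketch. The following is a minimal, hypothetical PyTorch example of a label-aware attention layer with a relational gate; the class name LabelAttentionGate, the use of a mean-pooled label-agnostic representation and all dimensions are assumptions made for illustration, not the thesis implementation.

import torch
import torch.nn as nn

class LabelAttentionGate(nn.Module):
    """Illustrative sketch of a label-aware attention layer with a relational
    gate, loosely following the LMAN description in the abstract. Dimensions
    and layer choices are assumptions, not the thesis implementation."""

    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        # Trainable label embeddings, one vector per label.
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden_dim))
        # Relational gate deciding how much label-aware signal to keep.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim), e.g. Bi-LSTM outputs.
        # Label-to-token attention scores: (batch, num_labels, seq_len).
        scores = torch.einsum('ld,bsd->bls', self.label_emb, token_states)
        attn = torch.softmax(scores, dim=-1)
        # Label-specific text representations: (batch, num_labels, hidden_dim).
        label_aware = torch.einsum('bls,bsd->bld', attn, token_states)
        # Plain (label-agnostic) text representation, broadcast per label.
        plain = token_states.mean(dim=1, keepdim=True).expand_as(label_aware)
        # The gate interpolates between the two representations,
        # filtering noise introduced by irrelevant labels.
        g = torch.sigmoid(self.gate(torch.cat([plain, label_aware], dim=-1)))
        return g * label_aware + (1 - g) * plain

In this sketch the sigmoid gate mixes the label-aware and label-agnostic representations per label, which is one plausible reading of the "relational gating mechanism" described above.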
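The dynamic routing module mentioned for AFCN refers to routing-by-agreement between capsules. The sketch below follows the standard dynamic routing algorithm of Sabour et al. (2017) rather than the thesis code; the function names, tensor shapes and number of routing iterations are illustrative assumptions.

import torch
import torch.nn.functional as F

def squash(s: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Standard capsule squashing: keeps direction, shrinks norm into (0, 1)."""
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + 1e-8)

def dynamic_routing(u_hat: torch.Tensor, num_iters: int = 3) -> torch.Tensor:
    """Routing-by-agreement between input capsules and class capsules.

    u_hat: (batch, num_in_caps, num_classes, out_dim), the prediction vectors
    obtained by transforming the text feature capsules; shapes are illustrative.
    Returns class capsules of shape (batch, num_classes, out_dim)."""
    # Routing logits, initialised to zero.
    b = torch.zeros(u_hat.size(0), u_hat.size(1), u_hat.size(2),
                    device=u_hat.device)
    for _ in range(num_iters):
        c = F.softmax(b, dim=2)                        # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)       # weighted sum per class
        v = squash(s)                                  # class capsules
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)   # agreement update
    return v

The length (norm) of each output class capsule can then be treated as the score for that class, which is how a capsule layer of this kind completes the text classification step.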
Keywords/Search Tags:text classification, multi-label, label-aware, few-shot, capsule networks