With the development of the Internet, the amount of information of all kinds has grown exponentially and now pervades people's lives. Finding the desired information quickly in such vast data requires that the data be processed accurately. Text classification can organize large amounts of data effectively and greatly improve the efficiency of information retrieval. Real-world text is rich in content, and a single text may belong to one or more categories. Multi-label text classification determines all the categories to which each text instance belongs, which makes it better suited to practical applications.

This thesis analyzes the current state of research on multi-label text classification and attempts to address the shortcomings and open challenges of the field through attention mechanisms. Most existing multi-label text classification models are suitable only for scenarios with a small number of coarse-grained labels, and they ignore the dependency and mutual exclusion relationships between labels and text, which leads to an obvious lack of semantic information. In addition, most studies learn the same text representation for all labels, which makes it difficult to distinguish between similar labels. Obtaining label-specific semantic components and exploring the interactions between them remains an open problem.

Building on previous work, this thesis proposes two attention-based multi-label text classification models: a method based on entity recognition and a bidirectional attention mechanism, and a label-specific method based on graph convolutional networks. The main contributions of this thesis are as follows:

(1) An entity recognition method for multi-label text classification is designed to address the lack of entity information in previous models. In multi-label text classification, entity information such as proper nouns affects classification performance. This thesis uses bidirectional long short-term memory networks and conditional random fields for entity recognition, and uses a self-attention mechanism to extract the keywords of the entity information and generate its feature representation, yielding text representations that contain entity information. In addition, noun part-of-speech information is incorporated into the word embeddings to enrich the text representations. Experimental results show that entity information improves the accuracy of multi-label text classification.

(2) A bidirectional attention mechanism based on label embedding is introduced to address the interaction of information between labels and text. Considering the importance of label information in multi-label text classification, this thesis uses the R_Transformer model to capture global dependencies and local structure information in text sequences. With the help of label embeddings, it then weights the hidden text vectors using attention from labels to token-level text representations, and obtains text-aware label representations using attention from sequence-level text representations to labels. The mechanism was evaluated on the RCV1 and AAPD datasets, and the experimental results show that the bidirectional attention mechanism is effective for multi-label text classification.

(3) A new label-specific dynamic graph convolutional network is proposed to address the inability of existing models to distinguish similar labels. This thesis models text sequences using convolutional operations and bidirectional long short-term memory networks, and then obtains label-specific semantic representations using a label-attention mechanism that explicitly considers the part of the document that is semantically relevant to each label. Guided by global label co-occurrence information and local label reconstruction graphs, an improved dynamic graph convolutional network then explores the adaptive interactions between the label-specific semantic components. Extensive experiments were conducted on the RCV1, AAPD, and EUR-Lex datasets, where the proposed model achieved P@1 scores of 96.92%, 86.30%, and 81.42%, respectively, and showed significant advantages in handling tail labels.
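
For readers unfamiliar with the P@1 metric reported above, the following is a minimal sketch of how precision at k (P@k) is commonly computed in multi-label classification: for each document, take the k highest-scoring labels and measure the fraction that are true labels, then average over documents. The function name `precision_at_k` and the toy scores and labels are illustrative assumptions; they are not taken from the thesis and do not reproduce its evaluation code or data.

```python
def precision_at_k(scores, labels, k=1):
    """Mean precision at k over a set of documents.

    scores: list of per-document lists of predicted label scores.
    labels: list of per-document lists of 0/1 ground-truth indicators,
            aligned with `scores`.
    """
    total = 0.0
    for doc_scores, doc_labels in zip(scores, labels):
        # Rank label indices from highest to lowest predicted score.
        ranked = sorted(range(len(doc_scores)),
                        key=lambda j: doc_scores[j], reverse=True)
        # Count how many of the top-k labels are actually relevant.
        hits = sum(doc_labels[j] for j in ranked[:k])
        total += hits / k
    return total / len(scores)


# Hypothetical toy data: 3 documents, 3 candidate labels each.
toy_scores = [[0.9, 0.2, 0.4],
              [0.1, 0.8, 0.3],
              [0.3, 0.2, 0.6]]
toy_labels = [[1, 0, 1],
              [0, 0, 1],
              [0, 0, 1]]

print(precision_at_k(toy_scores, toy_labels, k=1))
```

In this toy example the top-ranked label is correct for the first and third documents but not for the second, so P@1 is 2/3. Because P@1 rewards only the single highest-ranked label per document, it is usually reported together with P@k for larger k or other ranking metrics.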