As an important subtask of natural language processing, text classification is widely used in fields such as sentiment classification, topic recognition, and question answering systems. In recent years, significant progress has been made in single-label text classification. However, traditional methods for multi-label text classification have not performed well, owing to factors such as complex label relationships and difficulties in modeling text and labels. In multi-label text classification, learning label correlations and the interaction between labels and text is the focus of research. This thesis studies multi-label text classification using attention mechanisms and graph convolutional networks combined with label embedding information:

(1) This thesis proposes a Multi-Scale Cross Attention (MSCA) method to address the problem that the text-label relationship matrix does not consider the interaction between text and labels at different scales. Starting from the relationship between labels and text, a relationship matrix of label and text features is constructed. Variable-size windows are then used to extract cross attention at different scales and mine the text-label features. On this basis, a Multi-Label Text Classification method based on Multi-Scale Cross Attention (MLTC-MSCA) is proposed, which uses both MSCA and self-attention to extract text features and then adaptively fuses the two kinds of features to obtain a final mixed semantic feature representation. Experimental results show that the proposed method outperforms other multi-label classification methods in Micro-F1 on the AAPD and RCV1-V2 datasets, with improvements of 1.1% and 0.6%, respectively, demonstrating its effectiveness.

(2) To address the issue that previous multi-label classification methods did not jointly consider the multiple connections between labels and text, this thesis proposes a Multi-Label Text Classification method based on Adaptive Heterogeneous Graph Convolution Networks (MLTC-AHGCN). The MLTC-AHGCN model constructs three layers of text representation using a hierarchical structure and learns the multiple relationships between labels and text through an improved adaptive heterogeneous graph convolution. By fusing the three layers of text representation, it jointly considers word relationships, text-label interactions, and label relevance. To verify the effectiveness of MLTC-AHGCN, experiments were conducted on the AAPD and RCV1-V2 datasets, and the method was compared with other multi-label text classification methods. Experimental results show that the proposed method improves Micro-F1 by 1.9% and 0.9% on the two datasets, respectively, compared with other methods, indicating its effectiveness.
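
Both sets of results above are reported as Micro-F1. For reference, the following is a minimal sketch of how Micro-F1 is commonly computed for multi-label predictions: true positives, false positives, and false negatives are pooled over all samples and labels before F1 is taken. The function name `micro_f1` and the toy label matrices are illustrative assumptions and are not taken from the thesis or its evaluation code.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over binary label-indicator rows.

    Each row is one document; each column is one label (1 = assigned).
    Counts are pooled over every (document, label) pair.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1          # label correctly predicted
            elif t == 0 and p == 1:
                fp += 1          # label predicted but not in gold set
            elif t == 1 and p == 0:
                fn += 1          # gold label missed
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when nothing is positive.
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 3 documents, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
score = micro_f1(y_true, y_pred)
print(f"Micro-F1 = {score:.4f}")
```

Because the counts are pooled before averaging, frequent labels contribute more to Micro-F1 than rare ones, which is worth keeping in mind when reading the reported gains on datasets with skewed label distributions.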