Multi-label text classification is an important task in natural language processing and is widely applied in areas such as tag recommendation, information retrieval, and user comment analysis. Building on deep learning methods, this thesis analyzes several difficulties and challenges in current multi-label text classification and, drawing on existing research, proposes two multi-label text classification models.

1. A multi-label text classification model based on a graph convolutional network and cross-attention. Traditional multi-label text classification models usually ignore the semantic information of labels and the relationships between them, which degrades classification performance. To address this, the model first constructs a label relationship graph from label co-occurrence statistics as prior knowledge, and implicitly learns inter-label relationships by training a graph convolutional network to optimize the label feature representations. Cross-attention is then used to capture the semantic relationship between labels and text, yielding text features enriched with label semantics. In parallel, a structured self-attention mechanism is introduced to improve the document representation during text feature modeling. Finally, a gate-based adaptive feature fusion strategy extracts the more informative parts of the label-aware text representation and the document representation and fuses them into a more comprehensive text feature. The model is compared with multiple baseline models on two public text datasets, and the results show that it achieves good classification performance.

2. A multi-label text classification model that integrates global and local information. In multi-label text classification tasks, traditional methods often extract text semantic features poorly, and
the labels typically follow a long-tail distribution. To improve the model's understanding and extraction of text information, a self-attention mechanism and a convolutional neural network are combined to perceive both the global and the local information of a text sequence. The model first extracts the global context of the text with multi-head self-attention, and then feeds it into a convolutional neural network to strengthen the modeling of local text information. Integrating the global information and local features of the text in this way improves the model's grasp of high-level text semantics. Meanwhile, the CorNet module is introduced to learn correlations between labels and refine the classification predictions. To address the long-tailed label distribution, the model uses an improved Focal Loss that increases the training loss weight of tail-label samples, strengthening classification when label samples are unevenly distributed. A series of comparative experiments against baseline models verifies that the model's classification performance improves to a certain extent.
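The abstract does not give implementation details for the first model, so the pipeline it describes — propagating label embeddings over a co-occurrence graph with a GCN layer, letting labels cross-attend to token features, and gating the result against a pooled document representation — can only be sketched. The following minimal numpy illustration uses hypothetical shapes, random weights, and a toy co-occurrence matrix; it is not the thesis's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# --- Label graph: co-occurrence counts -> row-normalized adjacency (prior) ---
num_labels, d = 4, 8
cooc = np.array([[5, 2, 0, 1],
                 [2, 7, 3, 0],
                 [0, 3, 6, 2],
                 [1, 0, 2, 4]], dtype=float)     # toy co-occurrence counts
A = cooc / cooc.sum(axis=1, keepdims=True)

# --- One GCN layer: propagate label embeddings over the label graph ---
L = rng.standard_normal((num_labels, d))          # initial label embeddings
W_g = rng.standard_normal((d, d))
L = np.maximum(A @ L @ W_g, 0.0)                  # ReLU(A L W): label features

# --- Cross-attention: each label attends over the encoded document tokens ---
seq_len = 6
H = rng.standard_normal((seq_len, d))             # token features (hypothetical encoder)
attn = softmax(L @ H.T / np.sqrt(d), axis=-1)     # (num_labels, seq_len)
U = attn @ H                                      # label-aware text features

# --- Gated adaptive fusion with a pooled document representation ---
V = np.tile(H.mean(axis=0), (num_labels, 1))      # document feature per label
W_u = rng.standard_normal((2 * d, d))
g = 1.0 / (1.0 + np.exp(-np.concatenate([U, V], axis=1) @ W_u))  # sigmoid gate
F = g * U + (1.0 - g) * V                         # fused text representation

print(F.shape)  # one fused feature vector per label
```

In a real model the weights would be learned end-to-end and the token features would come from a trained text encoder; the sketch only shows how the three stages compose.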
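For the second model, the abstract only states that an improved Focal Loss raises the loss weight of tail-label samples, without giving the exact form. One common way to realize this idea is the standard binary focal loss with a per-label weight set inversely proportional to label frequency; the sketch below follows that assumption (the frequencies, weighting scheme, and probabilities are illustrative, not the thesis's):

```python
import numpy as np

def focal_loss(p, y, alpha, gamma=2.0):
    """Binary focal loss per label.

    p:     predicted probabilities, shape (num_labels,)
    y:     binary targets, shape (num_labels,)
    alpha: per-label weights; rare (tail) labels get larger alpha
    gamma: focusing parameter; down-weights easy, well-classified samples
    """
    eps = 1e-9
    pt = np.where(y == 1, p, 1.0 - p)        # probability of the true class
    return -alpha * (1.0 - pt) ** gamma * np.log(pt + eps)

# Hypothetical label frequencies: one head, one mid, one tail label.
freq = np.array([900.0, 50.0, 10.0])
alpha = (1.0 / freq) / (1.0 / freq).sum()    # inverse-frequency weighting

p = np.array([0.9, 0.6, 0.6])                # model confidences
y = np.array([1, 1, 1])
loss = focal_loss(p, y, alpha)
# The tail label receives a larger loss than the mid label even though
# both were predicted with the same probability, so tail errors dominate
# the gradient under an uneven label distribution.
```

The `gamma` term additionally suppresses loss from confidently correct predictions, which is the usual motivation for focal loss over plain weighted cross-entropy.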