Research On Multi-Label Text Categorization Based On Label Embedding Information

Posted on:2024-02-08

Degree:Master

Type:Thesis

Country:China

Candidate:P Zhou

GTID:2568307058471854

Subject:Electronic information

Abstract/Summary:

As an important subtask of natural language processing, text classification is widely used in fields such as sentiment classification, topic recognition, and question answering systems. In recent years, significant achievements have been made in single-label text classification. However, traditional methods for multi-label text classification have not performed well, owing to factors such as complex label relationships and the difficulty of modeling text and labels together. In multi-label text classification, research focuses on learning label correlations and the interaction between labels and text. This thesis studies multi-label text classification using attention mechanisms and graph convolutional networks combined with label embedding information:

(1) This thesis proposes a Multi-Scale Cross Attention (MSCA) method to address the problem that the text-label relationship matrix does not consider the interaction between text and labels at different scales. Starting from the relationship between labels and text, a relationship matrix of label and text features is constructed. Variable windows are then used to extract cross attention at different scales in order to mine the text-label features. Based on this, a Multi-Label Text Classification method based on Multi-Scale Cross Attention (MLTC-MSCA) is proposed, which uses both MSCA and self-attention to extract text features and then adaptively fuses the two kinds of features into a final mixed semantic feature representation. Experimental results show that the proposed method outperforms the other multi-label classification methods compared in terms of Micro-F1 on the AAPD and RCV1-V2 datasets, with improvements of 1.1% and 0.6%, respectively, demonstrating the effectiveness of the proposed method.

(2) To address the issue that previous multi-label classification methods did not jointly consider the multiple connections between labels and text, this thesis proposes a Multi-Label Text Classification method based on Adaptive Heterogeneous Graph Convolution Networks (MLTC-AHGCN). The MLTC-AHGCN model constructs three layers of text representation using a hierarchical structure and learns the multiple relationships between labels and text through an improved adaptive heterogeneous graph convolution. By fusing the three layers of text representation, it jointly considers word relationships, text-label interactions, and label relevance. To verify the effectiveness of the MLTC-AHGCN method, experiments were conducted on the AAPD and RCV1-V2 datasets and compared with other multi-label text classification methods. Experimental results show that the proposed method improves Micro-F1 by 1.9% and 0.9% on the two datasets, respectively, compared with the other methods, indicating the effectiveness of the proposed method.
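Both sets of results above are reported as Micro-F1. As a reading aid, the following is a minimal sketch of how that metric is commonly computed for multi-label predictions: true positives, false positives, and false negatives are pooled over every (document, label) decision before precision and recall are combined. It is not taken from the thesis; the function name `micro_f1` and the example label strings are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def micro_f1(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 over all (document, label) decisions.

    gold[i] and pred[i] are the true and predicted label sets of document i.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of documents")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # labels predicted and correct
        fp += len(p - g)  # labels predicted but not in the gold set
        fn += len(g - p)  # gold labels that were missed

    # F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when nothing is
    # predicted and nothing is expected (an assumption of this sketch).
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy example: three documents with illustrative label names.
gold = [{"cs.CL", "cs.LG"}, {"stat.ML"}, {"cs.CV"}]
pred = [{"cs.CL"}, {"stat.ML", "cs.LG"}, {"cs.CV"}]
score = micro_f1(gold, pred)  # TP=3, FP=1, FN=1 -> 6 / 8 = 0.75
```

Because the counts are pooled before averaging, frequent labels weigh more heavily in Micro-F1 than rare ones, which is worth keeping in mind when reading improvements on label-imbalanced datasets.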

Keywords/Search Tags:

Multi-label text classification, Attention mechanism, Multi-scale, Label semantics, Graph convolution

Related items

1	Research On Multi-label Text Classification Based On Text And Label Representation Optimization
2	Research On Text Multi Label Classification Algorithm Based On Self-attention Mechanism And Graph Convolution Network
3	Multi-label Text Classification Based On BERT And Label Attention Mechanism
4	Research On Multi-Label Text Classification Based On Deep Learning
5	Research On Multi-label Text Classification Methods Based On Attention Mechanism
6	Research On Multi-Label Image Classification Algorithm Based On Graph Convolution Network
7	Research On Feature Extraction Of Multi-label Text Classification
8	Research And Application Of Multi-label Classification Algorithm Based On Deep Learning
9	Research On Multi-label Image Classification Based On Multiple Graph Convolution
10	Research On Multi-label Text Classification By Integrating Label Informatio