Multi-label Text Classification Model Based On Correlation-guided Representation

Posted on:2023-12-04

Degree:Master

Type:Thesis

Country:China

Candidate:Q M Zhang

Full Text:PDF

GTID:2568306914460744

Subject:Electronics and Communications Engineering

Abstract/Summary:

In multi-label text classification (MLTC), each document is associated with a set of correlated labels. In this paper, we study two important problems in the MLTC task: weak text representation and the modeling of correlation among classification tasks. We propose a joint embedding of text and labels, a text representation guided by text-label correlation, and label correlation learning through a multi-task framework.

To address the problem that text representations are not discriminative in the MLTC task, this paper experimentally compares models with and without label information and compares different types of embedding methods (joint and non-joint embedding). It then proposes a joint embedding strategy that implicitly captures text-label and label-label correlation while reducing the model's dependence on label description information.

Since capturing text-label correlation plays a key role in obtaining text representations, after obtaining the joint embedding of text and labels, this paper further proposes a two-stage attention mechanism (a self-attention network and text-label cross-attention) to obtain a correlation matrix that explicitly models the correlation between text and labels, and then to obtain a differentially weighted global text representation.

Models in MLTC tasks commonly predict low-frequency labels poorly. To address this problem, previous classifier chains and Seq2Seq models both transformed the task into a sequence prediction task and solved it by modeling label correlations. However, these models tend to suffer from label order dependency, overfitting to label combinations, and error propagation. To avoid these problems, this paper proposes two auxiliary label co-occurrence prediction tasks that strengthen the modeling of label correlation and further alleviate the long-tail problem. The model achieves better performance on public datasets, trains faster than other Seq2Seq-based models, and fits label combinations better.

Finally, we design and implement an MLTC system. Based on the MT-LACO model proposed in this paper, the system analyzes the input text to be classified, predicts the relevant labels, and outputs the classification results.
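
The abstract gives no formulas for the text-label cross-attention step, so the following is only a minimal sketch of one way a correlation matrix could be turned into a weighted global text representation. It assumes scaled dot-product scores between token and label embeddings, and it assumes that a token's weight comes from its strongest correlation with any label. The function names and the toy vectors are hypothetical and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def correlation_matrix(tokens, labels):
    """Scaled dot-product scores: one row per token, one column per label."""
    scale = math.sqrt(len(tokens[0]))
    return [[dot(t, l) / scale for l in labels] for t in tokens]


def weighted_text_representation(tokens, labels):
    """Return (correlation matrix, token weights, global text vector)."""
    corr = correlation_matrix(tokens, labels)
    # Assumption: a token matters as much as its best-matching label says.
    token_scores = [max(row) for row in corr]
    weights = softmax(token_scores)
    dim = len(tokens[0])
    # Weighted sum of token embeddings gives one vector for the document.
    rep = [sum(w * t[k] for w, t in zip(weights, tokens)) for k in range(dim)]
    return corr, weights, rep


# Toy embeddings (hypothetical): three tokens and two labels in 2 dimensions.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]
labels = [[2.0, 0.0], [0.0, 2.0]]
corr, weights, rep = weighted_text_representation(tokens, labels)
```

In this toy input the third token correlates weakly with both labels, so it receives a smaller weight than the first two and contributes less to the global vector. A trained model would learn the embeddings and would likely use learned projections rather than raw dot products.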

Keywords/Search Tags:

multi-label text classification, text representation, label correlation, attention network, multi-task learning

Related items

1	Research On Multi-Label Text Classification Based On Deep Learning
2	Research On Multi-label Text Classification Based On Text And Label Representation Optimization
3	Research On Multi-Label Text Classification Based On Deep Learning
4	Research And Implementation On Text Classification In Vertical Domain
5	Research On The Improvement Of Multi-label Text Classification Algorithm For Offensive Language In Social Media
6	Research On Multi-Label Text Categorization Based On Label Embedding Information
7	Research On Feature Extraction Of Multi-label Text Classification
8	Research On Text Multi-label Classification Algorithm Based On Label Correlation
9	Research On Text Multi-Label Classification Based On Heterogeneous Graph Attention Network
10	Research On Multi-label Text Classification Methods Based On Attention Mechanism