Research On Multi-Label Text Classification With Label-Dependency Information

Posted on: 2021-02-18, Degree: Master, Type: Thesis
Country: China, Candidate: Y L Xu
GTID: 2428330647451062, Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet and the advent of the big-data era, people are flooded with information. Text is the most common and most complicated type of this information. In many scenarios, text objects such as news, comments, and blogs are ambiguous. Multi-label text classification refers to assigning the most relevant subset of labels to a document, which helps people organize and archive documents quickly. This thesis follows up on cutting-edge work in multi-label text classification and tries to solve some of the problems and challenges that remain in the field by mining label-dependency information. On the one hand, existing models may ignore the correlation between labels, consider only low-order relationships, or model high-order relationships with approaches that lack rationality and feasibility. On the other hand, existing multi-label attention mechanisms rely excessively on single-word representations when learning word importance, which can cause problems such as mismatches between words and labels. This thesis carries out research on these two aspects. The main work is as follows:
(1) To address the problem that existing models do not reasonably model the relationships between labels, the original loss function is improved according to the label co-occurrence matrix, and a regularizer is designed to mine the dependency between labels through the loss function. Experimental results show that the method exceeds existing models in micro-F1 and other main metrics, and further analysis shows that it exploits label dependency to regularize the proposed model, thereby improving its generalization ability.
(2) For deeper label-relationship mining and label representation learning in label-dependency modeling, graph convolution and a label graph are used to update the output-layer weights. Experimental results show that this method further improves micro-F1, and further analysis shows that it greatly improves prediction on rare labels without sacrificing prediction accuracy on frequent labels.
(3) To address the problem that the traditional multi-label attention mechanism relies too much on single-word representations, a hybrid attention mechanism is designed that uses global semantics to guide the learning of word weights. Experimental results show that this method further improves micro-F1, and further analysis shows that it takes more detailed dependencies into account and achieves a significant performance improvement on frequent labels.
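The abstract does not give the exact form of the co-occurrence-based regularizer in item (1). As a rough illustration only, the sketch below builds a label co-occurrence matrix from binary label vectors and adds a simple smoothness penalty to binary cross-entropy. The penalty form, the function names, and the weight `lam` are all assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def cooccurrence_matrix(Y):
    """Count how often each pair of labels appears on the same document."""
    Y = np.asarray(Y, dtype=float)
    return Y.T @ Y  # entry (i, j) = number of documents tagged with both i and j

def conditional_cooccurrence(C):
    """Row-normalise the counts so entry (i, j) estimates P(label j | label i)."""
    diag = np.diag(C).copy()
    diag[diag == 0] = 1.0  # avoid division by zero for labels that never occur
    A = C / diag[:, None]
    np.fill_diagonal(A, 0.0)  # ignore self-dependency
    return A

def bce_loss(probs, Y, eps=1e-12):
    """Standard binary cross-entropy averaged over documents and labels."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    Y = np.asarray(Y, dtype=float)
    return float(-np.mean(Y * np.log(probs) + (1 - Y) * np.log(1 - probs)))

def cooccurrence_penalty(probs, A):
    """Assumed penalty: mean over documents of sum_ij A[i, j] * (p_i - p_j)^2."""
    probs = np.asarray(probs, dtype=float)
    diff = probs[:, :, None] - probs[:, None, :]  # shape (n_docs, L, L)
    return float(np.mean(np.sum(A[None] * diff ** 2, axis=(1, 2))))

def regularized_loss(probs, Y, A, lam=0.1):
    """Base loss plus the assumed co-occurrence penalty, weighted by lam."""
    return bce_loss(probs, Y) + lam * cooccurrence_penalty(probs, A)
```

Under this assumed penalty, labels that frequently co-occur in the training set are pushed toward similar predicted probabilities, which is one simple way a loss term can carry label-dependency information.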
Keywords/Search Tags: Multi-label Learning, Text Classification, Label-Dependency, Deep Neural Network