
Research On Label-aware Text Classification Methods

Posted on: 2022-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Huang
Full Text: PDF
GTID: 2518306563980219
Subject: Computer Science and Technology
Abstract/Summary:
Text classification is a classical problem in natural language processing (NLP) that aims to assign labels or tags to textual units such as sentences, queries, and documents. The task can be divided into multi-class text classification and multi-label text classification, where the latter allows more than one label to co-exist for a single document. Text classification has a wide range of applications, including question answering, spam detection, sentiment analysis, and news categorization.

The heart of the matter lies in the quality of the representation learned from the input document. A good representation should capture global contextual information as well as local discriminative features: the former provides general information for coarse-grained matching, while the latter offers specific clues for fine-grained recognition, and both are important for classification. However, most existing methods learn a single representation for each input document, which can hardly preserve the essential content sufficiently or benefit the subsequent learning task. In addition, the multiple labels of a document are usually semantically correlated, and exploiting the correlation among labels benefits the multi-label learning process.

In this thesis, we propose two methods that improve text classification by exploiting both document content and label correlation. The main work and contributions are as follows:

Firstly, we propose an explicit label-aware representation for each document with a hybrid attention deep neural network model (LAHA). LAHA consists of three parts. The first part adopts a multi-label self-attention mechanism to detect the contribution of each word to the labels. The second part exploits the label structure and the document content to determine the semantic connection between words and labels in the same latent space. The third part designs an adaptive fusion strategy to obtain the final label-aware document representation, so that the outputs of the previous two parts are sufficiently integrated. Extensive experiments on six benchmark datasets, compared against state-of-the-art methods, show the superiority of the proposed LAHA method.

Secondly, we propose a label-aware comprehensive representation learning method (LaCRL). As argued above, both coarse-grained global information and fine-grained local features are important, yet most existing methods determine only a single representation per document. LaCRL aims to capture coarse-grained and fine-grained information simultaneously with a well-designed joint optimization strategy: the global and local representation learning are jointly optimized so that each can benefit the other. Extensive experimental results on well-known benchmark datasets, compared against state-of-the-art methods, show the efficacy of our approach.
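The following is a minimal PyTorch-style sketch of the label-aware hybrid attention idea described for LAHA, assuming a BiLSTM encoder, learnable label embeddings shared by the two attention branches, and a gated fusion layer. The module names, dimensions, and the fusion formula are illustrative assumptions, not the exact architecture of the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAwareEncoder(nn.Module):
    """Hypothetical sketch of a label-aware hybrid attention encoder."""

    def __init__(self, vocab_size, num_labels, embed_dim=300, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Learnable label embeddings used by both attention branches
        self.label_embed = nn.Parameter(torch.randn(num_labels, 2 * hidden_dim))
        # Gate that adaptively fuses the two label-aware views (assumption)
        self.gate = nn.Linear(4 * hidden_dim, 1)
        # One score per label from its label-aware document representation
        self.classifier = nn.Linear(2 * hidden_dim, 1)

    def forward(self, tokens):
        h, _ = self.encoder(self.embed(tokens))                     # (B, T, 2H)
        # Branch 1: self-attention scoring each word's contribution to each label
        attn1 = torch.softmax(h @ self.label_embed.T, dim=1)        # (B, T, L)
        view1 = attn1.transpose(1, 2) @ h                           # (B, L, 2H)
        # Branch 2: word-label interaction in a shared latent space
        sim = F.normalize(h, dim=-1) @ F.normalize(self.label_embed, dim=-1).T
        attn2 = torch.softmax(sim, dim=1)                           # (B, T, L)
        view2 = attn2.transpose(1, 2) @ h                           # (B, L, 2H)
        # Adaptive fusion: a learned gate mixes the two label-aware views
        alpha = torch.sigmoid(self.gate(torch.cat([view1, view2], dim=-1)))
        doc_repr = alpha * view1 + (1 - alpha) * view2              # (B, L, 2H)
        return self.classifier(doc_repr).squeeze(-1)                # (B, L) logits
```

Likewise, a tiny sketch of the joint optimization idea in LaCRL, assuming the global (coarse-grained) and local (fine-grained) branches each produce per-label logits that are trained with a weighted sum of two multi-label losses; the weighting scheme `lam` is an assumption for illustration.

```python
def joint_loss(global_logits, local_logits, targets, lam=0.5):
    # Jointly optimize the global and local branches so each can inform the other
    loss_global = F.binary_cross_entropy_with_logits(global_logits, targets)
    loss_local = F.binary_cross_entropy_with_logits(local_logits, targets)
    return lam * loss_global + (1.0 - lam) * loss_local
```

Because both losses share the same encoder parameters during backpropagation, the coarse-grained and fine-grained representations are learned together rather than in isolation, which is the intent of the joint optimization strategy described above.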
Keywords/Search Tags:Text Classification, Deep Neural Network, Attention Mechanism, Text Representation