The Application Of Label Embedding In Text Classification

Posted on: 2021-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: C Xiao
Full Text: PDF
GTID: 2428330602499092
Subject: Computer software and theory
Abstract/Summary:
With the advent of the information age, Internet platforms have produced a large volume of text resources. These resources hold great business value, but organizing, managing and mining them has long been an important problem in both industry and academia. Automatic Text Classification (ATC) is considered an effective means of managing these resources. Traditional text classification models mostly use one-hot coding to encode labels. The resulting label representations are merely symbols that carry no semantic information, so the information contained in labels is poorly exploited. To address this problem, researchers have tried to learn an embedding for each label in the corpus. This dissertation studies the learning and application of label embeddings in text classification tasks. First, we propose a novel approach for learning label embeddings by associating input texts with labels. Second, we adopt a Graph Convolutional Neural Network (GCN) to learn label embeddings and use these embeddings to facilitate the classification process.

Previous approaches to learning label embeddings mostly rely on side information, such as label description text and label attributes. However, such side information is usually expensive to acquire. In this dissertation, we instead use the input text as the context information of labels. We then propose the distributed hypothesis of labels to model the explicit and implicit correlations among labels. Finally, words and labels are embedded into a common vector space so that they can be correlated with each other. The experimental results suggest that using input text as label context helps the model learn high-quality label embeddings.

Graph Convolutional Neural Networks have attracted attention for their outstanding ability to capture global associations. We use a GCN to learn label embeddings and use them to improve classification performance. Previous GCN models build a huge graph that contains the training documents, the test documents and the vocabulary. This tightly coupled approach consumes a large amount of memory and is unfriendly to newly arriving text instances. To solve this problem, we first propose a Loosely Coupled Graph Convolutional Neural Network (LCGCN). The model decomposes the corpus into core and secondary parts, which makes the model loosely coupled. The model is then used to learn embedding vectors for vocabulary words and labels simultaneously, and the learned label embeddings are adopted to facilitate the classification process. Extensive experimental results on real-world data sets suggest that our loosely coupled model can effectively alleviate the problems of the tightly coupled model. These results also show that considering label embeddings can improve classification accuracy by about 1 percentage point.
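
As a rough illustration of the idea of placing words and labels in one vector space with a graph convolution, the following minimal sketch may help. It is not the LCGCN model, whose architecture is not described in this abstract. It assumes the standard GCN propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W), a hypothetical toy graph of four word nodes and two label nodes, and untrained random weights, so the resulting scores show only the data flow and not a meaningful prediction. All names (`normalize_adjacency`, `gcn_layer`, `score_document`) are invented for this example.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the normalization used by the standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ features @ weight, 0.0)


# Hypothetical toy vocabulary and label set (assumptions, not from the thesis).
words = ["goal", "match", "stock", "market"]
labels = ["sports", "finance"]
n_nodes = len(words) + len(labels)

# Word nodes are 0..3, label nodes are 4..5; edges are undirected.
adj = np.zeros((n_nodes, n_nodes))
for i, j in [(0, 4), (1, 4), (2, 5), (3, 5), (0, 1), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
features = np.eye(n_nodes)                 # one-hot input features per node
weight = rng.normal(size=(n_nodes, 8))     # untrained weights, for illustration only

a_norm = normalize_adjacency(adj)
embeddings = gcn_layer(a_norm, features, weight)
word_emb = embeddings[: len(words)]        # word vectors
label_emb = embeddings[len(words):]        # label vectors in the same space


def score_document(tokens):
    """Score a new document against every label without adding it to the graph."""
    idx = [words.index(t) for t in tokens if t in words]
    if not idx:
        return np.zeros(len(labels))
    doc_vec = word_emb[idx].mean(axis=0)   # average of known word embeddings
    return label_emb @ doc_vec             # one dot-product score per label


scores = score_document(["goal", "match"])
```

Because the graph in this sketch holds only words and labels, a new document is scored from existing word embeddings rather than being inserted as a node, which is one simple way to avoid rebuilding the graph for each incoming text.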
Keywords/Search Tags: Text Classification, Label Embedding, Graph Convolutional Neural Network