Extreme Short Text Classification Based On Knowledge Graph Features Extension

Posted on: 2024-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: X K Zhou
Full Text: PDF
GTID: 2568306914994409
Subject: Computer Science and Technology
Abstract/Summary:
With the swift advancement of web services, more and more extreme short texts, such as news headlines, appear on the Internet, and users face the problem of information overload. A short text classification system can effectively help users filter this data. Most existing short text classification methods fall roughly into two categories: those that use a sole source and those that use external knowledge. Of the two, methods based on external knowledge have achieved remarkable results. However, they have limitations when applied directly to extreme short text: they do not consider the characteristics of extreme short text, and they simply integrate all external knowledge into the model. In addition, they cannot achieve the expected effect when only a small number of training samples is available. To solve these problems, we propose three new classification methods based on external knowledge. The major contributions are as follows:

(1) We present an extreme short text classification method based on keyword screening and an attention mechanism, called KSAM. The model addresses the fact that the classification result for an extreme short text is often determined by one or two keywords. It introduces concepts related to those keywords from a knowledge graph to expand the features, which alleviates the feature sparsity of extreme short text. At the same time, the attention mechanism module makes the model pay more attention to the keywords and concepts that play an important role in classification, so that the data is used efficiently.

(2) We present a few-shot extreme short text classification method based on a knowledge graph and prompt learning, called PLST. Classification methods based on external knowledge need large amounts of training data, which is costly to collect. Methods based on prompt learning have achieved strong results when few samples are available. Considering the characteristics of extreme short texts, we extend the verbalizer in prompt learning through a knowledge graph to improve the classification effect. The experimental results on datasets show the effectiveness of this method.

(3) We propose a method based on prompt tuning with an updated verbalizer for short text streams, called PUV. In the previous work, the expansion of the verbalizer through the knowledge graph is static, which limits its use on short text data streams. To improve performance, we propose a method based on an updated verbalizer and, building on the previous work, extend the verbalizer by additional methods. The experimental results on datasets verify the effectiveness of the algorithm.

In summary, KSAM is proposed to address the problems that the classification result for an extreme short text is determined by one or two keywords and that the data is sparse. Because collecting a large amount of labeled training data is costly, PLST is then proposed for the few-shot setting. Finally, since PLST has limitations in the classification of extreme short text streams, PUV is proposed on the basis of PLST to solve the classification problem for extreme short text streams.
Keywords/Search Tags:Extreme short text classification, Feature extension, Attention mechanism, Prompt learning
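
The idea of extending a verbalizer through a knowledge graph, as described for PLST, can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy graph `TOY_KG`, the functions `expand_verbalizer` and `classify`, the prompt wording, and the probability values are all hypothetical, and averaging label-word probabilities is just one simple way to aggregate them.

```python
from typing import Dict, List

# Hypothetical toy "knowledge graph": label name -> related concepts.
TOY_KG: Dict[str, List[str]] = {
    "sports": ["football", "match", "league"],
    "finance": ["stock", "bank", "market"],
}


def expand_verbalizer(labels: List[str], kg: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each label to its own name plus its related concepts from the graph."""
    return {label: [label] + kg.get(label, []) for label in labels}


def classify(mask_probs: Dict[str, float], verbalizer: Dict[str, List[str]]) -> str:
    """Pick the class whose label words have the highest mean probability."""
    scores = {
        label: sum(mask_probs.get(word, 0.0) for word in words) / len(words)
        for label, words in verbalizer.items()
    }
    return max(scores, key=scores.get)


verbalizer = expand_verbalizer(["sports", "finance"], TOY_KG)

# Hypothetical probabilities that a masked language model might assign at the
# [MASK] position of a prompt such as "<headline>. This topic is about [MASK]."
mask_probs = {"match": 0.30, "league": 0.10, "stock": 0.05, "sports": 0.02}

prediction = classify(mask_probs, verbalizer)
print(prediction)
```

With a verbalizer that contains only the label names, the headline would be scored on the word "sports" alone; the expanded label words let related terms such as "match" contribute to the class score, which is the kind of effect the abstract attributes to verbalizer extension.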