
Research On Marching-on Short Text Classification Based On Knowledge Graph

Posted on: 2021-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Zhang
GTID: 2428330623468136
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, short text has become widely used in online news, instant messaging, and social media because it is brief, efficient, and spreads quickly, and massive amounts of short text data have been generated as a result. How to mine the valuable information hidden in this data has gradually become a hot research issue. However, short text on the network is short in length, carries little effective information, and is heavily colloquial, so traditional text classification methods cannot achieve the expected classification results. This thesis focuses on the problems of feature sparsity and irregular expression in short text classification. The main work and innovations are as follows:

(1) Summarized the methods and related work on short text classification. The thesis first introduces the research background and significance of short text classification and defines the feature sparsity and irregular expression problems. It then focuses on existing solutions to these problems, compares the advantages and disadvantages of each method, and summarizes related work on these methods in recent years at home and abroad.

(2) Proposed a short text feature expansion method based on knowledge graph. This method aims to solve the problem of feature sparsity by obtaining high-quality knowledge from a knowledge graph as feature expansion items that enrich the context features of short texts. First, the words with higher weight in the short text are extracted as keywords with the TextRank method. Then, the keywords are linked to entities in the knowledge graph, and the context similarity between each keyword and its candidate entities is used for entity disambiguation to obtain the target entity. Finally, the target entity and its summary information are used to expand the features of the short text.

(3) Proposed a short text classification model combining knowledge graph and deep semantics, named BERT-KG. This model aims to solve the problem of irregular expression. It improves the BERT pre-trained model so that the resulting model, BERT-KG, can fuse the background knowledge of short text. BERT-KG is used to obtain the deep semantics of short text together with its background knowledge, and it outputs the corresponding short text representation vector for short text classification tasks, improving the accuracy of the classification results.

(4) Designed and implemented a sensitive short text classification system based on a UGC platform. The method and model proposed in this thesis were applied to a real project to build a sensitive short text classification system based on the UGC platform. The system uses the original data provided by the project to generate a data set, trains a sensitive short text classification model on that data set, and includes a classification result visualization module. Finally, to make external calls and system integration easy, interfaces for short text representation and sensitive short text classification were designed and implemented.
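The three steps of the feature expansion method in (2) can be illustrated with a small sketch. It is not the thesis's implementation: the toy knowledge graph `TOY_KG`, the stopword list, the bag-of-words cosine used as "context similarity", and all function names are assumptions made here for illustration, and the TextRank shown is a simplified unweighted variant over a co-occurrence window.

```python
import math
import re
from collections import Counter, defaultdict

# Assumption: a tiny stopword list, only for this illustration.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "that", "the", "to"}

# Hypothetical toy knowledge graph: mention -> candidate entities with summaries.
TOY_KG = {
    "apple": [
        {"entity": "Apple Inc.",
         "summary": "technology company that makes the iphone and mac computers"},
        {"entity": "Apple (fruit)",
         "summary": "edible fruit that grows on trees in orchards"},
    ],
}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Step 1: rank words with a simplified, unweighted TextRank."""
    words = tokenize(text)
    neighbors = defaultdict(set)
    # Link each word to the words that follow it within the window.
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


def cosine(tokens_a, tokens_b):
    """Bag-of-words cosine similarity between two token lists."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def link_entity(keyword, text, kg):
    """Step 2: pick the candidate whose summary best matches the text's context."""
    candidates = kg.get(keyword, [])
    if not candidates:
        return None
    # Exclude the mention itself so it does not bias the comparison.
    context = [t for t in tokenize(text) if t != keyword]
    return max(candidates, key=lambda c: cosine(context, tokenize(c["summary"])))


def expand_short_text(text, kg, top_k=3):
    """Step 3: append each linked entity and its summary to the short text."""
    expansions = []
    for keyword in textrank_keywords(text, top_k=top_k):
        entity = link_entity(keyword, text, kg)
        if entity is not None:
            expansions.append(f"{entity['entity']}: {entity['summary']}")
    return " ".join([text] + expansions)


if __name__ == "__main__":
    print(expand_short_text("Apple unveils iphone, Apple shares rise", TOY_KG))
```

In this sketch, keywords with no candidates in the knowledge graph are simply skipped. A real system would likely replace the bag-of-words cosine with a stronger context representation and query an actual knowledge graph instead of a dictionary.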
Keywords/Search Tags: Short text classification, Knowledge graph, Feature expansion, Text representation