A Popular Science Text Classification Algorithm Based On Attention Mechanism And Knowledge Graph Enhancement

Posted on: 2023-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: W J Tang
GTID: 2557306848958229
Subject: Artificial intelligence
Abstract/Summary:
With the rapid development of Internet technology and online media and the growing public demand for science popularization, science popularization workers need to carry out their work in a demand-oriented way. The massive volume of popular science articles makes it very challenging for these workers to analyze what the public needs, so the informatization of popular science work is urgent. Text classification is an important research direction in information processing and data mining; it can help popular science workers organize and manage massive amounts of text and obtain the required information quickly and accurately. However, popular science articles are usually long, and commonly used text classification algorithms have problems with long texts: convolutional neural networks are better at deep feature mining and have difficulty capturing the global information of long texts; recurrent neural networks are prone to gradient explosion and vanishing gradients; and pre-trained models are limited by input length, which causes information loss, and lack vertical domain knowledge because their training corpora come from general domains. To address the poor performance of commonly used text classification algorithms on such texts, this thesis proposes a popular science text classification algorithm based on attention mechanism and knowledge graph enhancement. The main research work is as follows:

(1) Build a knowledge graph for the popular science domain. To introduce vertical domain knowledge and provide data support for the subsequent informatization of science popularization, we constructed the ontology of the popular science knowledge graph with a semi-automatic ontology construction method that combines unsupervised algorithm results with expert guidance. For knowledge extraction, supervised entity recognition is performed with a BERT + BiLSTM + CRF model, and semi-supervised relation extraction is performed with a bootstrapping method. External data is then used to supplement the knowledge graph, and knowledge triples are obtained through knowledge fusion. The resulting popular science knowledge graph contains 32,130 entities and 12,385 entity-relation triples. Finally, the graph database Neo4j is used for persistent storage of the knowledge, and a lightweight graph editing platform is built with front-end and back-end technologies.

(2) Propose a popular science text classification algorithm based on attention mechanism and knowledge graph enhancement. Because popular science articles have long sentences and are long overall, models struggle to focus on key information, which leads to poor classification performance for traditional models. We propose a popular science text classification algorithm that uses the knowledge graph for two-level screening: with the knowledge graph as supervision data, we train a sentence filter to remove irrelevant sentences, and then apply an attention mechanism to further screen the retained sentences and perform text classification. Experimental results on the popular science text classification dataset PSCD show that the knowledge-graph-enhanced text classification model achieves a higher F1-score. Compared with the TextCNN model and the BERT model, the F1-score increases by 2.88 and 1.88 percentage points, respectively, which verifies the effectiveness of using prior knowledge from the knowledge graph to filter long-text information.
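The two-level screening idea can be illustrated with a small sketch. This is not the thesis implementation: the trained sentence filter is replaced here by a simple rule that keeps sentences mentioning a knowledge graph entity, and the sentence vectors and attention query are toy stand-ins for learned representations. The names `KG_ENTITIES`, `VOCAB`, `filter_sentences`, `embed`, and `attention_pool` are all hypothetical and chosen only for illustration.

```python
import math
import re

# Hypothetical entity names standing in for entities stored in the knowledge graph.
KG_ENTITIES = {"black hole", "gravity", "vaccine", "photosynthesis"}

# Toy vocabulary for a bag-of-words sentence vector (assumption, not from the thesis).
VOCAB = ["gravity", "light", "vaccine", "immune"]


def split_sentences(text):
    """Naive sentence split on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def filter_sentences(sentences, entities, min_hits=1):
    """Level 1: keep sentences that mention at least min_hits entities.

    A rule-based stand-in for the trained sentence filter described above.
    """
    kept = []
    for sentence in sentences:
        lower = sentence.lower()
        hits = sum(1 for entity in entities if entity in lower)
        if hits >= min_hits:
            kept.append(sentence)
    return kept


def embed(sentence, vocab=VOCAB):
    """Toy sentence vector: occurrence count of each vocabulary word."""
    lower = sentence.lower()
    return [float(lower.count(word)) for word in vocab]


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Level 2: dot-product attention over the retained sentence vectors.

    Returns the attention-weighted document vector and the weights.
    In a real model the query would be learned; here it is supplied by hand.
    """
    if not vectors:
        raise ValueError("no sentences survived the filter")
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, vectors))
        for d in range(len(query))
    ]
    return pooled, weights


if __name__ == "__main__":
    text = (
        "Black holes bend light through gravity. "
        "Subscribe to our newsletter today. "
        "A vaccine trains the immune system."
    )
    kept = filter_sentences(split_sentences(text), KG_ENTITIES)
    doc_vector, weights = attention_pool([embed(s) for s in kept], [1.0, 1.0, 0.0, 0.0])
    print(kept)
    print(weights)
    print(doc_vector)
```

In a full model, the pooled document vector would feed a classification layer; the sketch stops at pooling because the classifier details are not given in the abstract.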
Keywords/Search Tags: Popular science text classification, Knowledge graph, Semi-automatic ontology construction, Knowledge extraction, Attention mechanism