
Design And Implementation Of Entity Linking Algorithm Based On Knowledge Graph

Posted on: 2022-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Liu
Full Text: PDF
GTID: 2518306572497264
Subject: Computer technology
Abstract/Summary:
With the development of Big Data and AI technology, Knowledge Graphs have been widely developed and applied. As an important task in the Knowledge Graph field, Entity Linking has attracted the attention of researchers in both academia and industry. Because natural language is ambiguous, accurately extracting and identifying the meaning a user intends is a major challenge, and Entity Linking plays an important role in addressing it. Building on research into Knowledge Graphs and Entity Linking, this thesis designs a practical Entity Linking system based on a Knowledge Graph.

For the data, this thesis uses PKU-PIE (Peking University Chinese Encyclopedia Knowledge Graph) as the Knowledge Graph, the data from CCKS 2019 Task 6 as the experimental data, and Neo4j as the database platform. The system follows the pipeline design pattern and consists of four modules: text preprocessing, entity mention extraction, attribute word extraction and entity linking. The text preprocessing module converts full-width characters to half-width, uppercase to lowercase and Traditional Chinese to Simplified Chinese, and filters punctuation. The entity mention extraction module extracts entity mentions from the preprocessed query and comprises an AC Tree submodule, a BiLSTM-CRF submodule and a conflict resolution submodule. The attribute word extraction module extracts specific types of attribute words from the input query: words between “ ” and ??, date words, pure number words and number + unit words. The entity linking module takes the output of the entity mention extraction module and returns the corresponding Top N candidate entities; it comprises a candidate entity generation submodule, an entity feature extraction submodule and an entity sorting submodule. The core techniques involved include text preprocessing, the AC tree search algorithm, the BiLSTM-CRF model, entity feature engineering and the GBDT algorithm.

Experimental results show that the Top-5 entity recall on the test set can reach 0.85. In terms of performance, the system sustains a speed of 600 ms per sentence, which meets the expected requirements of practical application scenarios.
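
As a rough illustration of the first and third pipeline stages described above, the following Python sketch normalizes a query and then pulls out date, number + unit and pure number words. It is not the thesis's implementation: the function names `preprocess` and `extract_attribute_words`, the punctuation set, the unit list and the regular expressions are all assumptions made for this example. Traditional-to-Simplified conversion is omitted because it needs an external conversion table.

```python
import re
import unicodedata

# Assumed punctuation set to filter; '.' and '-' are kept so that
# decimals and ISO-style dates survive preprocessing.
_PUNCT = re.compile(r"[!?,;:。、\"'()]")

def preprocess(query: str) -> str:
    """Full-width -> half-width, uppercase -> lowercase, punctuation filtering."""
    # NFKC folds full-width letters, digits and punctuation to half-width forms.
    text = unicodedata.normalize("NFKC", query)
    text = text.lower()
    return _PUNCT.sub("", text).strip()

# Hypothetical patterns, tried in this order so that a date or a
# number + unit word is not split into pure numbers.
_PATTERNS = [
    ("date", re.compile(r"\d{4}年(?:\d{1,2}月(?:\d{1,2}日)?)?|\d{4}-\d{1,2}-\d{1,2}")),
    ("number_unit", re.compile(r"\d+(?:\.\d+)?(?:公里|千克|米|岁|元|km|kg|cm)")),
    ("number", re.compile(r"\d+(?:\.\d+)?")),
]

def extract_attribute_words(text: str) -> list[tuple[str, str]]:
    """Return (label, word) pairs; matched spans are masked before the next pattern."""
    found: list[tuple[str, str]] = []
    remaining = text
    for label, pattern in _PATTERNS:
        def _take(match: re.Match, label: str = label) -> str:
            found.append((label, match.group(0)))
            # Replace the match with spaces so later patterns cannot reuse it.
            return " " * len(match.group(0))
        remaining = pattern.sub(_take, remaining)
    return found

if __name__ == "__main__":
    query = "姚明身高２.２６米，出生于1980年9月12日，编号11"
    print(extract_attribute_words(preprocess(query)))
```

Masking each match before trying the next pattern is one simple way to resolve overlaps between the attribute word types; the thesis may handle this differently.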
Keywords/Search Tags:Entity Linking, Knowledge Graph, BiLSTM-CRF model, Knowledge-Embedded, GBDT classifier