Research On Entity Linking Algorithm Based On Weakly Supervised Learning And Entity Network

Posted on: 2021-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Wu
GTID: 2518306569497604
Subject: Computer technology
Abstract/Summary:
With the rapid development of the information age, a large amount of text needs to be collected and analyzed. However, traditional entity linking methods usually require extensive manual annotation, and the accuracy and efficiency of entity linking algorithms still need to be improved. Research on entity linking algorithms for knowledge graph construction is therefore of great scientific and commercial value. The entity linking task refers to mining the potential entity relationships that appear in natural-language text and, based on those relationships, linking mentions to entities in the knowledge graph. The task addresses the problems of ambiguity (one word with multiple meanings) and diversity (multiple words with one meaning) among entities. Existing research methods include dictionary matching and machine learning. Dictionary matching requires revision by a large number of experts; although its results are accurate, the labor cost is high and the generalization ability is poor. Machine learning methods rely on manual feature construction and selection, and constructing features often consumes a lot of manpower. In addition, because the datasets are small, the learning ability of a single model is limited. This thesis therefore carries out in-depth research from two aspects: an entity linking algorithm based on weakly supervised learning and an entity linking algorithm based on an entity network.

To address the cost of large-scale manual annotation, an entity linking algorithm based on weakly supervised learning is studied. In the candidate entity generation stage, a filtering method based on the Wikipedia link graph is used to obtain a candidate entity set with a higher recall rate. In the candidate entity disambiguation stage, the candidate entity set serves as a weak supervision constraint. Considering both the relationship between an entity and its local context and the coherence among entities within a document, the thesis uses a neural network to perform candidate entity disambiguation. Experiments on six datasets, including AIDA-A, MSNBC, AQUAINT, ACE2004, and WNED-CWEB, show that this method performs on par with fully supervised learning, and on the AQUAINT dataset its F1 score exceeds that of the fully supervised model by 2%.

To improve the accuracy and the feature construction of the entity linking model, an entity linking algorithm based on an entity network is studied. In the candidate entity generation stage, named entity recognition (NER) is introduced first, entities are then extracted with BERT-BiLSTM+CRF, and finally a larger candidate entity set is obtained through substring matching. In the candidate entity disambiguation stage, a word vector is first constructed from the semantic information of the candidate entity, the semantic information of the candidate entity's sentence, and location information; these features are then concatenated and fed into a fully connected network, and the entity linking result is finally obtained through binary classification. Experimental results show an F1 score of 91.32% on the CCKS dataset, which verifies the effectiveness of the entity-network-based entity linking algorithm proposed in this thesis.
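
The second pipeline described in the abstract (candidate generation by substring matching, then binary classification over concatenated features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy ALIAS_TABLE, the function names generate_candidates and link_probability, and the single linear layer with a sigmoid that stands in for the fully connected network. The BERT-BiLSTM+CRF mention extractor is not reproduced; mentions are assumed to be given.

```python
import math

# Hypothetical alias table: lowercase surface form -> knowledge-base entity IDs.
# A real system would derive this from the knowledge graph.
ALIAS_TABLE = {
    "apple": ["Apple_Inc.", "Apple_(fruit)"],
    "apple inc": ["Apple_Inc."],
    "big apple": ["New_York_City"],
    "jobs": ["Steve_Jobs"],
    "steve jobs": ["Steve_Jobs"],
}


def generate_candidates(mention, alias_table=ALIAS_TABLE):
    """Collect candidate entities whose alias contains, or is contained in, the mention."""
    key = mention.lower().strip()
    if not key:
        return []  # an empty string is a substring of everything, so bail out early
    candidates = []
    for alias, entities in alias_table.items():
        if key in alias or alias in key:
            for entity in entities:
                if entity not in candidates:  # keep first-seen order, drop duplicates
                    candidates.append(entity)
    return candidates


def link_probability(entity_vec, sentence_vec, position_vec, weights, bias):
    """Concatenate the three feature vectors and score them with one linear layer + sigmoid.

    This single layer is an assumed stand-in for the fully connected network;
    the weights would normally be learned from labeled (mention, entity) pairs.
    """
    features = list(entity_vec) + list(sentence_vec) + list(position_vec)
    if len(weights) != len(features):
        raise ValueError("weights must match the concatenated feature length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print(generate_candidates("Apple"))
    print(link_probability([0.2], [0.5], [0.1], [1.0, 1.0, 1.0], 0.0))
```

In this sketch a candidate would be accepted as the link when its probability exceeds a chosen threshold such as 0.5; the threshold is likewise an assumption rather than a value reported in the abstract.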
Keywords/Search Tags: entity linking, weakly supervised learning, Wikipedia link graph, entity network