
Research On Chinese Person Name Disambiguation Based On Knowledge Graph Embedding

Posted on: 2021-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Li
GTID: 2518306290994589
Subject: Cyberspace security
Abstract/Summary:
Person name ambiguity is widespread, which makes it difficult to determine which individual a name in a text refers to. The task of person name disambiguation aims to resolve this ambiguity so that each name is mapped to the correct person. Clustering methods are suitable for large document collections; however, in the era of big data, re-clustering and re-analyzing the results as new information keeps arriving is costly. Entity-linking-oriented name disambiguation instead matches the name directly to a given entity in a knowledge base, using the text context together with the person's basic attributes and social network. Most existing methods build the basic-attribute and social-network features manually. In addition, owing to the complexity of the Chinese language and of Chinese person names, disambiguating Chinese person names is more challenging than disambiguating English ones. In this paper, we propose a method that models person information with a knowledge graph embedding (KGE) model based on Chinese open knowledge graphs (KGs). Chinese open KGs contain high-quality structured knowledge, including a large amount of attribute and social-network information about persons. We extract sub-graphs directly from them and pretrain the KGE model on these sub-graphs, instead of extracting features manually. We also use a triple extraction model to extract structured semantic information from free text, which is likewise used to pretrain the KGE models. On this basis, the paper proposes a Chinese person name disambiguation method based on KGE models that combines a CNN with Ranking SVM. The CNN extracts a combined representation of the person name and its corresponding text from the outputs of the two pretrained KGE models, and Ranking SVM then ranks the candidate person names. The method thus combines the strengths of CNNs in feature extraction and of Ranking SVM in ranking problems. We conduct three types of experiments on the CLP Chinese name disambiguation task dataset, and the results demonstrate the effectiveness of introducing Chinese open knowledge graphs and KGE models to improve the performance of Chinese name disambiguation.
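
The ranking step described above can be illustrated with a small sketch. The abstract does not give the feature layout or the training procedure, so everything here is an assumption: the feature vectors stand in for the CNN's combined representation, the model is a plain linear Ranking SVM trained with the standard pairwise transform and a hinge loss via stochastic gradient descent, and the names `train_ranking_svm`, `rank_candidates`, and the toy entities are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_ranking_svm(groups, dim, epochs=50, lr=0.1, reg=0.01, seed=0):
    """Train a linear ranker with the pairwise hinge loss (hypothetical helper).

    groups: list of (correct_features, [wrong_features, ...]), one entry per
    ambiguous mention. The features are assumed to come from an upstream
    encoder (a CNN in the abstract); here they are plain lists of floats.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    # Pairwise transform: every (correct, wrong) pair becomes one difference
    # vector d, and we want dot(w, d) >= 1 for all of them.
    pairs = [
        [p - n for p, n in zip(pos, neg)]
        for pos, negs in groups
        for neg in negs
    ]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = dot(w, d)
            # L2 regularisation applied as multiplicative shrinkage.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge-loss subgradient step only when the margin is violated.
            if margin < 1.0:
                w = [wi + lr * di for wi, di in zip(w, d)]
    return w


def rank_candidates(w, candidates):
    """Return candidate names sorted from best to worst score."""
    return sorted(candidates, key=lambda name: dot(w, candidates[name]), reverse=True)


if __name__ == "__main__":
    # Toy training data (made up): 2-dimensional features per candidate.
    toy_groups = [
        ([1.0, 0.2], [[0.1, 0.3], [0.2, 0.9]]),
        ([0.9, 0.5], [[0.3, 0.4]]),
    ]
    weights = train_ranking_svm(toy_groups, dim=2)
    toy_candidates = {
        "entity_A": [0.95, 0.1],
        "entity_B": [0.1, 0.8],
        "Other": [0.4, 0.4],
    }
    print(rank_candidates(weights, toy_candidates))
```

In an entity-linking setting, the top-ranked candidate would be taken as the person the name refers to. A production system would more likely use an established SVM solver than this hand-written update loop.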
Keywords/Search Tags: Chinese Person Name Disambiguation, Knowledge Graph Embedding, Entity Linking, Entity Disambiguation, Knowledge Graph