| With the rapid development of the Internet in recent years, the volume of text information has grown quickly and its forms have become increasingly diverse. Abbreviations, non-standard expressions and nicknames in online text cause the entity diversity problem, while the inherent ambiguity of natural language causes the entity ambiguity problem; both hinder people's analysis and understanding of information. Entity ambiguity means that the same string can denote different entities depending on the context, and entity diversity means that several different strings can denote the same entity. Resolving both problems therefore helps people understand text. Entity discovery and linking is the task of identifying mentions in text and linking each one to its corresponding entity in a knowledge base according to the mention's context, which addresses both entity diversity and ambiguity. It is of great significance to search engines, information understanding and question answering. Current entity linking methods fall into two groups: single entity linking and collaborative entity linking. Single entity linking links one mention at a time, focusing on the mention context and the entity's description text, but it ignores the relationships between mentions in the same text. Collaborative entity linking links all mentions in a text jointly, focusing on the relationships between mentions and between entities in the knowledge base, but it ignores the description text of the entities and the mention text. To make up for the shortcomings of both, we propose an entity linking approach based on a CNN and Random Walk with Restart, and implement an entity linking system based on
this approach. The approach first recognizes mentions in the text, then generates a candidate entity set for each mention, and then selects among the candidates using the CNN and Random Walk with Restart. Finally, mentions that have no corresponding entity in the knowledge base are clustered. On the English dataset of the KBP2016 Entity Discovery and Linking task, our approach achieves an FCEAFM of 0.669, which is 0.015 lower than the first-place English team and 0.019 higher than the second-place English team. The generality of the method is verified on the Chinese, English and Spanish evaluation datasets of the same task, where its FCEAFM is 0.652, compared with 0.643 for the top-ranked system across the three languages. The results show that our entity discovery and linking method is effective. The main contributions of this paper are as follows: 1) We propose an entity discovery and linking approach based on a CNN and Random Walk with Restart. It uses a convolutional neural network to capture local information about entities and mentions, and the Random Walk with Restart algorithm to capture global information about them. 2) For entity linking, we use Random Walk with Restart to obtain semantic features of mentions and entities, that is, their global information. 3) For entity linking, we use a convolutional neural network to obtain text features from the mention context and from the entity's description text in the knowledge base, that is, the local information of mentions and entities. 4) We build a knowledge base analysis index. Traditional entity retrieval methods rely on string matching, so retrieval efficiency is very low. We therefore analyze the knowledge base and use Elasticsearch to build the analysis index, and then design and implement a more reasonable entity retrieval strategy. |
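
The abstract names Random Walk with Restart as the source of global information but gives no graph construction or parameters. The following is only a minimal sketch of the general technique on a hypothetical toy graph: the node names, the undirected mention/entity edges, the restart probability of 0.15 and the function name `random_walk_with_restart` are all illustrative assumptions, not the system described above.

```python
def random_walk_with_restart(adjacency, seed, restart_prob=0.15,
                             tol=1e-10, max_iter=1000):
    """Return the stationary visiting probability of every node for a
    walker that jumps back to `seed` with probability `restart_prob`."""
    nodes = list(adjacency)
    # Start with all probability mass on the seed node.
    scores = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        # Restart step: a fixed share of the mass returns to the seed.
        updated = {n: (restart_prob if n == seed else 0.0) for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Dangling node: send its remaining mass back to the seed.
                updated[seed] += (1.0 - restart_prob) * scores[u]
                continue
            # Walk step: split the remaining mass evenly over neighbours.
            share = (1.0 - restart_prob) * scores[u] / len(neighbours)
            for v in neighbours:
                updated[v] += share
        delta = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical toy graph: two mentions and their candidate entities.
graph = {
    "mention:Jordan": ["entity:Michael_Jordan", "entity:Jordan_(country)"],
    "mention:Bulls": ["entity:Chicago_Bulls"],
    "entity:Michael_Jordan": ["mention:Jordan", "entity:Chicago_Bulls"],
    "entity:Jordan_(country)": ["mention:Jordan"],
    "entity:Chicago_Bulls": ["mention:Bulls", "entity:Michael_Jordan"],
}

# Walk from the co-occurring mention and rank the candidates of "Jordan".
scores = random_walk_with_restart(graph, seed="mention:Bulls")
candidates = graph["mention:Jordan"]
best = max(candidates, key=lambda entity: scores[entity])
print(best)
```

In this toy setting the candidate that is connected to the entity of the co-occurring mention receives more probability mass than the unrelated one, which is the kind of global signal a collaborative method relies on. How such a score would be combined with the CNN's local text features is not specified in the abstract.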