Chinese Entity Disambiguation Based On Convolutional Neural Networks

Posted on: 2018-08-22  Degree: Master  Type: Thesis
Country: China  Candidate: Y Gao  Full Text: PDF
GTID: 2348330512997197  Subject: Computer technology
Abstract/Summary:
With the spread of the mobile Internet, platforms such as Weibo, blogs, Tieba, BBS forums, news sites, and government service sites have made people's lives much more convenient. These platforms generate large volumes of data around the clock, and that data holds great value. Most of it, however, is unstructured or semi-structured, so it contains a great deal of ambiguity, which makes it harder for natural language processing (NLP) techniques to use. Chinese word sense disambiguation and entity disambiguation emerged in response. Most mainstream entity disambiguation algorithms today are based on the bag-of-words model, whose inherent limitations prevent them from making full use of the semantic information in the context. This thesis proposes a Chinese entity disambiguation algorithm based on a convolutional neural network (CNN) to overcome the bag-of-words model's difficulty in capturing contextual semantics. The work consists of the following parts:

(1) Because the bag-of-words model struggles to describe the semantics of the context, we design a CNN-based algorithm to obtain the semantic information of an entity's context. The word-vector matrix of the nouns in the entity's context is used as the input to the network, and a convolution operation then generates a semantic feature vector for that context.

(2) Based on these semantic feature vectors, in the training stage the model parameters are adjusted with the objective of maximizing the similarity between the entity being disambiguated and its true target entity relative to its similarity with a randomly selected candidate entity. In the prediction stage, the knowledge-base candidate with the highest similarity is taken as the final target entity.

(3) Experiments were carried out on the dataset of the "Chinese name disambiguation" task published by the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP-2012). The results show that the proposed method of Chinese entity disambiguation is feasible and effective.
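
The pipeline described in the abstract (convolution over the context's noun word vectors, similarity-based training, highest-similarity prediction) can be illustrated with a minimal pure-Python sketch. The abstract does not give the architecture or the loss, so everything specific here is an assumption made for illustration: the `tanh` activation, max pooling over positions, cosine similarity, the hinge-style margin loss, and all function names (`conv_max_pool`, `cosine`, `ranking_loss`, `predict`) are hypothetical and not taken from the thesis.

```python
import math


def conv_max_pool(word_vectors, filters, window):
    """Map a list of word vectors to one feature vector.

    Each filter is a flat weight list of length window * dim. It is applied
    to every run of `window` consecutive word vectors, passed through tanh,
    and the maximum response over all positions is kept (max pooling).
    Activation and pooling are assumptions; the abstract names neither.
    """
    if len(word_vectors) < window:
        raise ValueError("context has fewer words than the filter window")
    features = []
    for filt in filters:
        best = -math.inf
        for start in range(len(word_vectors) - window + 1):
            # Flatten the window of word vectors into one long vector.
            flat = [x for vec in word_vectors[start:start + window] for x in vec]
            activation = math.tanh(sum(w * x for w, x in zip(filt, flat)))
            best = max(best, activation)
        features.append(best)
    return features


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ranking_loss(mention, target, negative, margin=0.5):
    """Hinge loss (assumed form): zero once the true target beats the
    randomly sampled negative candidate by at least `margin`."""
    return max(0.0, margin - cosine(mention, target) + cosine(mention, negative))


def predict(mention, candidates):
    """Return the name of the knowledge-base candidate most similar to the mention."""
    return max(candidates, key=lambda name: cosine(mention, candidates[name]))


# Toy usage: two hand-written filters over 2-dimensional "word vectors".
toy_filters = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
mention_context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mention_features = conv_max_pool(mention_context, toy_filters, window=2)
```

In a real system the filters would be learned by minimizing a loss such as `ranking_loss` with gradient descent, and the candidate feature vectors would come from running the same network over each knowledge-base entry's description; neither step is shown here.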
Keywords/Search Tags: Entity Disambiguation, CNN, Word Embedding, Semantic Representation