
A Study On Efficient Named Entity Recognition Approach Based On DBpedia Spotlight

Posted on: 2017-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Fu
Full Text: PDF
GTID: 2348330512480395
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of Linked Data, abundant knowledge bases from various fields have been published on the Web in the form of RDF. As a sub-task of Information Extraction, named entity recognition builds a bridge between knowledge bases and natural language and supports many tasks such as keyword extraction, machine translation, and topic detection and tracking. How to improve the performance of named entity recognition has therefore become the focus of much research. This thesis proposes an optimization framework for named entity recognition based on DBpedia Spotlight. First, we design a framework for editing the model to improve the flexibility of the system, put forward methods that use training data and candidate data to expand the model, and verify them on artificial data. Second, we put forward the rate of Pointwise Mutual Information and use it for feature selection on the context model; the model space is reduced on a large scale while both precision and recall improve. Finally, we use the hyperlinks between Wikipedia articles to construct a topic vector, then calculate the similarity between the text and the entities in the candidate set to perform a second disambiguation, which further improves the annotation results of the system.

In addition, Chinese has the most users in the world, so it is necessary to implement a Chinese named entity recognition system. We again use DBpedia Spotlight as our baseline, consider the specific characteristics of the Chinese language and the challenges that come with them, build a language model from Chinese Wikipedia, and design and implement a Chinese named entity recognition system that provides a REST service and a Web interface for users, filling a gap in Chinese named entity recognition work.

In summary, we propose an optimization framework for named entity recognition. Experimental results show that it increases the flexibility of the system, reduces the space of the model, and improves the precision of annotations. Moreover, we overcome the challenges of the Chinese language and design and implement a Chinese named entity recognition system. This work contributes to advancing research on named entity recognition.
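
The abstract does not define its "rate of Pointwise Mutual Information" or the exact form of the topic-vector similarity, so the following is only a minimal sketch under stated assumptions: it uses the standard PMI definition, log2(p(e, t) / (p(e) p(t))), to prune low-scoring context tokens, and plain cosine similarity over sparse vectors for the second disambiguation step. The function names, the threshold, and the toy data are hypothetical and are not taken from the thesis or from DBpedia Spotlight.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Standard PMI for each observed (entity, token) co-occurrence.

    `pairs` is an iterable of (entity, token) tuples, one per observed
    co-occurrence. This is textbook PMI, assumed here as a stand-in for
    the thesis's PMI-based score.
    """
    pairs = list(pairs)
    total = len(pairs)
    joint = Counter(pairs)
    entity_counts = Counter(e for e, _ in pairs)
    token_counts = Counter(t for _, t in pairs)
    return {
        (e, t): math.log2(
            (c / total) / ((entity_counts[e] / total) * (token_counts[t] / total))
        )
        for (e, t), c in joint.items()
    }


def select_features(pairs, threshold):
    """Keep only (entity, token) features whose PMI reaches the threshold."""
    return {k for k, v in pmi_scores(pairs).items() if v >= threshold}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: context tokens observed near two candidate entities.
toy_pairs = (
    [("Apple_Inc.", "iphone")] * 3
    + [("Apple_Inc.", "the")]
    + [("Apple", "fruit")] * 3
    + [("Apple", "the")]
)
kept = select_features(toy_pairs, threshold=0.5)
```

In this toy data the uninformative token "the" co-occurs equally with both entities, so its PMI is zero and it is pruned, while "iphone" and "fruit" are kept. A candidate entity could then be re-ranked by calling `cosine` on a topic vector built from the input text and one built for the entity.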
Keywords/Search Tags: Named Entity Recognition, Linked Data, DBpedia, Pointwise Mutual Information