
Relation Extraction Among Named Entities By Detecting Communities In Network Structure

Posted on: 2007-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
GTID: 2178360182988953
Subject: Computer software and theory
Abstract/Summary:
Relation extraction among named entities (NR) is a major task in information extraction (IE) research. The goal of NR is to find relationships between pairs of named entities in text. It has recently attracted growing attention in many areas, e.g., information extraction, ontology construction, question answering systems, and bioinformatics.

Since the concept of relation extraction was introduced in MUC-6 (Defense Advanced Research Projects Agency, 1995), there has been considerable work on supervised learning of relation patterns, using corpora that have been annotated to indicate the information to be extracted. However, manually tagging large amounts of training data is very time-consuming; furthermore, it is difficult to port an extraction system from one domain to another. Because of these limitations, unsupervised approaches were put forward. We noticed three problems with the unsupervised approach: (1) named entity pairs are always described by their context, but there is no proper method for choosing the size of the context window; (2) it is unclear how to obtain a good clustering result when noise is present; (3) it is unclear how to name the relations among named entities hierarchically.

In this thesis, we propose an unsupervised method for relation extraction. The method draws on recent advances in the field of networked data mining. It adopts three key technologies: (1) network representation of named entity pairs; (2) clustering of entity pairs by discovering communities; (3) hierarchy-based relation description. In particular, the first technology solves the problem of choosing the size of the context window, and the second solves the problem of discovering communities in a weighted network.

Our experiments on half a year of newspaper text show not only that the relations among named entities can be detected with high precision, but also that appropriate labels can be provided automatically for the relations.
Keywords/Search Tags: information extraction, networked data mining, named entity pair, relation, clustering, betweenness
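
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the thesis: each entity pair is a node described by a bag of context words, edge weights are cosine similarities, and communities are found by Girvan-Newman-style removal of the edge with the highest weighted betweenness (the keyword "betweenness" suggests something along these lines). The toy data, the threshold, the use of 1/similarity as edge length, the most-frequent-word labeling, and all function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_graph(contexts, threshold=0.1):
    """Nodes are entity pairs; an edge carries the context similarity (assumed > 0)."""
    bags = {pair: Counter(words) for pair, words in contexts.items()}
    graph = {pair: {} for pair in contexts}
    for u, v in combinations(contexts, 2):
        sim = cosine(bags[u], bags[v])
        if sim > 0 and sim >= threshold:
            graph[u][v] = graph[v][u] = sim
    return graph


def edge_betweenness(graph):
    """Brandes-style edge betweenness; edge length is 1 / similarity (an assumption)."""
    score = {frozenset((u, v)): 0.0 for u in graph for v in graph[u]}
    for s in graph:
        dist, sigma, preds = {s: 0.0}, {s: 1.0}, {s: []}
        order, done, heap = [], set(), [(0.0, s)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue  # stale heap entry
            done.add(v)
            order.append(v)
            for w, sim in graph[v].items():
                nd = d + 1.0 / sim
                if w not in dist or nd < dist[w] - 1e-12:
                    dist[w], sigma[w], preds[w] = nd, sigma[v], [v]
                    heapq.heappush(heap, (nd, w))
                elif abs(nd - dist[w]) <= 1e-12 and w not in done:
                    sigma[w] += sigma[v]  # another shortest path of equal length
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # accumulate dependencies back toward the source
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score


def components(graph):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(graph[n])
        seen |= comp
        comps.append(comp)
    return comps


def detect_communities(graph, k):
    """Remove highest-betweenness edges until the graph splits into k components."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}
    while len(components(g)) < k:
        score = edge_betweenness(g)
        if not score:
            break  # no edges left to remove
        u, v = max(score, key=score.get)
        del g[u][v], g[v][u]
    return components(g)


def label(community, contexts):
    """Name a community by its most frequent context word (ties broken alphabetically)."""
    counts = Counter(w for pair in community for w in contexts[pair])
    return min(counts, key=lambda w: (-counts[w], w))


# Hypothetical toy data: entity pair -> words observed between the two entities.
contexts = {
    ("Acme", "Bolt"): ["acquire", "buy", "merger"],
    ("Cora", "Dyne"): ["acquire", "merger", "deal"],
    ("Eton", "Fisk"): ["buy", "deal", "acquire"],
    ("Gray", "Acme"): ["president", "chief", "executive"],
    ("Hale", "Cora"): ["chief", "executive", "chairman"],
    ("Ivey", "Eton"): ["president", "chairman", "merger"],
}
communities = detect_communities(build_graph(contexts), k=2)
for community in communities:
    print(label(community, contexts), sorted(community))
```

In this toy setting the weak links created by the shared word "merger" act as bridges between the acquisition-like pairs and the executive-like pairs, so they attract the highest betweenness and are removed first. The target number of communities k is fixed by hand here; how the thesis chooses a stopping point, handles noise, or builds a label hierarchy is not stated in the abstract.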