
Identification of entity mentions in text and their coreference resolution

Posted on: 2007-01-29
Degree: M.S.
Type: Thesis
University: The University of Texas at Dallas
Candidate: Nicolae, Cristina
GTID: 2448390005461580
Subject: Computer Science
Abstract/Summary:
Detecting the entities in a text is an essential part of understanding it. Entities represent the main concepts of the discourse, what the document "is all about". Without them, the text is just a succession of words. Entity detection has applications in many Natural Language Processing domains, such as Machine Translation, Summarization, Information Retrieval, and Question Answering, in all of which a thorough understanding of the conceptual structure of discourse is vital. This thesis proposes a method for detecting entities and their mentions in natural language text. The work is divided into two successive steps: detecting all the mentions in a text, and grouping these mentions into classes that refer to the same entity. The novelties introduced in this thesis are the use of the semantic hierarchies of the WordNet lexical database to detect the entity types of nominal mentions, and a top-down, graph-based approach to clustering mentions that refer to the same entity. It is shown that the second step benefits from the results of the first, and that the entire system is competitive in performance with the best-ranked systems in the scientific community.
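The two steps described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract gives no algorithmic detail, so everything here is an assumption made for illustration. `TOY_HYPERNYMS` is a hypothetical hand-written table standing in for WordNet's noun hierarchy, the edge weights are invented coreference scores, and the splitting rule (delete the weakest edge until every remaining edge meets a threshold) is just one simple way to realise a top-down, graph-based partition.

```python
# Hypothetical toy hypernym table standing in for WordNet's noun hierarchy.
TOY_HYPERNYMS = {
    "senator": "politician",
    "politician": "person",
    "company": "organization",
    "city": "location",
}
ENTITY_TYPES = {"person", "organization", "location"}


def entity_type(noun):
    """Walk up the toy hypernym chain until an entity type is reached."""
    seen = set()
    while noun is not None and noun not in seen:
        if noun in ENTITY_TYPES:
            return noun
        seen.add(noun)
        noun = TOY_HYPERNYMS.get(noun)
    return None


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    remaining, parts = set(nodes), []
    while remaining:
        stack = [remaining.pop()]
        part = set(stack)
        while stack:
            for neighbour in adjacency[stack.pop()]:
                if neighbour not in part:
                    part.add(neighbour)
                    stack.append(neighbour)
        remaining -= part
        parts.append(part)
    return parts


def split_top_down(nodes, edges, threshold):
    """Start from one graph of all mentions and split it recursively.

    edges maps (mention_a, mention_b) to an assumed coreference score.
    The weakest edge is deleted until the graph falls apart or every
    remaining edge scores at least `threshold`.
    """
    nodes = set(nodes)
    inner = {e: w for e, w in edges.items() if e[0] in nodes and e[1] in nodes}
    parts = connected_components(nodes, inner)
    if len(parts) > 1:
        result = []
        for part in parts:
            result.extend(split_top_down(part, inner, threshold))
        return result
    if not inner or min(inner.values()) >= threshold:
        return [nodes]
    weakest = min(inner, key=inner.get)
    del inner[weakest]
    return split_top_down(nodes, inner, threshold)


# Invented mentions and scores, for illustration only.
mentions = ["Smith", "the senator", "he", "Dallas", "the city"]
scores = {
    ("Smith", "the senator"): 0.9,
    ("the senator", "he"): 0.8,
    ("he", "the city"): 0.1,
    ("Dallas", "the city"): 0.85,
}
clusters = split_top_down(mentions, scores, threshold=0.5)
```

In this toy run the only weak link is the one between "he" and "the city", so deleting it separates the person mentions from the location mentions. A real system would derive entity types from WordNet itself and the edge scores from a trained coreference model.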
Keywords/Search Tags: Text, Mentions, Entity, Entities