Text mining for semantic relations

Posted on: 2003-10-02
Degree: Ph.D.
Type: Thesis
University: The University of Texas at Dallas
Candidate: Girju, Corina Roxana
Full Text: PDF
GTID: 2468390011483111
Subject: Computer Science
Abstract/Summary:
Text Mining is a rapidly emerging field concerned with the extraction of concepts, relations, and implicit knowledge from texts. The current state of the art in Text Mining is based on shallow representations of text documents coupled with statistical data mining techniques. This approach is limited by the highly ambiguous nature of natural language.

This thesis proposes a new approach to Text Mining that emphasizes the use of rich syntactic and semantic features to discover useful and implicit relations from text, and that is based on the acquisition of some of the most frequently used semantic relations. Using a general algorithm, the system automatically discovers lexico-syntactic patterns for each semantic relation considered. The patterns are evaluated and then accepted or rejected based on semantic constraints specifically tailored to each semantic relation. These semantic constraints are rooted in the WordNet lexical database.

We have focused on two widely used semantic relations: CAUSALITY and PART-WHOLE. A text knowledge acquisition (KAT) system was developed to extract lexico-syntactic patterns that refer to these semantic relations.

The knowledge discovered, namely concepts and semantic relations, is organized into hierarchies for the purpose of developing ontologies. These ontologies are built using a knowledge classification approach based on subsumption.

In this thesis we also demonstrate the usefulness of Text Mining for advanced Natural Language applications, such as Question Answering. On-line ontology development helps in understanding complex questions and provides the means for Answer Fusion.
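
The abstract's idea of pairing a lexico-syntactic pattern with a semantic constraint can be illustrated with a minimal sketch. This is not the thesis's KAT system or its pattern-discovery algorithm: the single hand-written pattern, the toy hypernym table standing in for WordNet, the "both nouns must be artifacts" constraint, and all function names are assumptions made only for this example.

```python
import re

# Toy stand-in for WordNet hypernym chains (hypothetical data, not from the thesis).
HYPERNYMS = {
    "engine": ["device", "artifact", "entity"],
    "wheel": ["device", "artifact", "entity"],
    "car": ["vehicle", "artifact", "entity"],
    "idea": ["concept", "abstraction", "entity"],
}

# One hand-written lexico-syntactic pattern: "<part> of the <whole>".
PART_OF = re.compile(r"\b(\w+) of the (\w+)\b")


def is_a(word, ancestor):
    """Return True if `ancestor` appears in the toy hypernym chain of `word`."""
    return ancestor in HYPERNYMS.get(word, [])


def extract_part_whole(sentence):
    """Match the pattern, then keep only pairs that pass the semantic constraint.

    The constraint used here (both nouns are artifacts) is an assumed example.
    """
    pairs = []
    for part, whole in PART_OF.findall(sentence.lower()):
        if is_a(part, "artifact") and is_a(whole, "artifact"):
            pairs.append((part, whole))
    return pairs
```

In this sketch, "The engine of the car failed." yields the candidate pair (engine, car), while "The idea of the car was new." matches the surface pattern but is rejected because "idea" fails the constraint. That is the kind of ambiguity a purely pattern-based approach cannot resolve on its own.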
Keywords/Search Tags: Text mining, Relations