The Research And Realization Of Text Information Extraction Based On Ontology Applied In Intelligent Information Retrieval

Posted on: 2010-12-07  Degree: Master  Type: Thesis
Country: China  Candidate: L Yuan  Full Text: PDF
GTID: 2178360272499443  Subject: Computer application technology
Abstract/Summary:
Information Extraction is a technology that grew out of the rapid growth of the Internet; it aims to extract factual information and so help people cope with information overload. Ontology originates from philosophy and was later introduced into Artificial Intelligence. A well-designed domain ontology can serve as a foundation for the knowledge representation of a domain.

Drawing on a study of related work on Ontology and Information Extraction, this paper combines the two technologies and proposes an information extraction algorithm that uses an OWL-based domain ontology. In the algorithm, the ontology is viewed as the knowledge frame of a domain, and the texts to be processed are viewed as unstructured instances of that frame. The aim of the algorithm is to extract structured instances of the frame, composed of the semantic elements of an OWL ontology such as classes, properties and individuals.

The algorithm comprises five sub-algorithms: the "Visible Class and Individual Extracting Algorithm", the "Invisible Class Extracting Algorithm", the "ObjectProperty Extracting Algorithm", the "DataType Property Extracting Algorithm" and the "Field Instance Division Algorithm". The core ideas of the five sub-algorithms derive from a close study of the components of OWL ontologies and an analysis of their structure.

Following the presentation of the algorithm, an information extraction system named TIEBOO is designed and built. It consists of five parts: a knowledge base, an ontology parsing module, a text pre-processing module, a semantic extraction module and a result storage module. Experiments show that the results are accurate when the five modules work together.

Finally, the paper analyzes the differences and connections between information extraction and information retrieval, presents a system architecture for applying TIEBOO to an information retrieval system, and discusses the advantages of TIEBOO in improving the performance of such a system. The paper thus approaches the improvement of information retrieval performance from a new angle and provides a reference for building intelligent information retrieval systems.
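
As a rough illustration of the frame idea described in the abstract, the following minimal sketch treats a toy ontology as a frame and a sentence as an unstructured instance of it. It is not the thesis's algorithm: the abstract does not specify how the sub-algorithms work, so the names `ToyOntology`, `extract_visible` and `_mentions` are hypothetical, and the reading of "visible" as "the label occurs literally in the text" is an assumption made only for this example.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ToyOntology:
    # Hypothetical stand-in for an OWL ontology:
    # class name -> labels of the individuals that belong to it.
    classes: Dict[str, Set[str]] = field(default_factory=dict)


def _mentions(text: str, label: str) -> bool:
    # Whole-word, case-insensitive match of a label in the text.
    pattern = r"\b" + re.escape(label) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def extract_visible(text: str, ontology: ToyOntology) -> Tuple[Set[str], Dict[str, str]]:
    # Assumption: a class or individual is "visible" when its label
    # appears literally in the text. Returns the matched class names
    # and a mapping from each matched individual to its class.
    found_classes: Set[str] = set()
    found_individuals: Dict[str, str] = {}
    for cls, individuals in ontology.classes.items():
        if _mentions(text, cls):
            found_classes.add(cls)
        for individual in individuals:
            if _mentions(text, individual):
                found_individuals[individual] = cls
    return found_classes, found_individuals


ontology = ToyOntology({"University": {"Tsinghua"}, "City": {"Beijing"}})
text = "Tsinghua is a university located in Beijing."
classes, individuals = extract_visible(text, ontology)
print(classes)
print(individuals)
```

In this sketch the class "City" is never named in the sentence, yet its individual "Beijing" is found; recovering such unnamed classes is presumably the kind of task the "Invisible Class Extracting Algorithm" addresses, though the abstract does not say how.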
Keywords/Search Tags:Ontology, OWL, Information Extraction, Information Retrieval