Research And Implementation Of Ontology-Based Web Information Extraction System

Posted on:2008-05-01

Degree:Master

Type:Thesis

Country:China

Candidate:W Zhao

Full Text:PDF

GTID:2178360248952206

Subject:Computer application technology

Abstract/Summary:

With the development of the Internet, the Web has become one of the most important knowledge repositories, and efficient information extraction is highly desirable. How to deliver information from the Internet to users automatically and efficiently has become an important research issue. The information extracted by Information Extraction (IE) systems can be provided directly to end users, and it is also the first step toward building intelligent query systems and data mining systems. IE systems have good prospects, and research on IE techniques has become a focus of Natural Language Processing internationally.

This paper first introduces Information Extraction technology, its background, and its history. It analyzes the system architecture, the taxonomy, the key technologies, and the evaluation measures of Information Extraction. It also introduces the basic concepts of ontology.

On this basis, this paper presents a new approach to extracting information from ordinary documents, based on an application ontology that describes a domain of interest. The approach combines Information Extraction with ontology. It first uses the concepts, relations, and keywords of the domain ontology to generate Information Extraction rules automatically, and then performs grammar parsing on the document. It then applies the parsing result and the generated rules to extract information from the document, and finally outputs the result as a list of records.

Following this approach and practical engineering conditions, an Ontology-based Web Information Extraction System was designed and implemented. The paper describes the overall framework and the design of the main modules in detail. Because the system uses an ontology to extract information, the paper focuses on how to parse an OWL ontology with DOM, and it designs a new ontology storage schema based on the characteristics of the Classes and Properties of OWL ontologies. The paper also describes the implementation of the system, including its data structures and flow charts. Finally, it presents the results obtained by running the system on a set of test documents and analyzes the extraction results.
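
The steps named in the abstract (parse an OWL ontology with DOM, generate extraction rules from its keywords, apply the rules to a document, output a list of records) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The toy ontology, the use of `rdfs:label` values as keywords, the function names `parse_ontology`, `build_rules`, and `extract_records`, and the regular-expression rule format are all assumptions made for illustration, and a simple sentence split stands in for the grammar parsing step.

```python
import re
from xml.dom import minidom

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical toy ontology; rdfs:label values are assumed to be the keywords.
OWL_DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Car"/>
  <owl:DatatypeProperty rdf:ID="price">
    <rdfs:domain rdf:resource="#Car"/>
    <rdfs:label>price</rdfs:label>
    <rdfs:label>cost</rdfs:label>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="year">
    <rdfs:domain rdf:resource="#Car"/>
    <rdfs:label>year</rdfs:label>
  </owl:DatatypeProperty>
</rdf:RDF>"""


def parse_ontology(owl_text):
    """Walk the DOM and return {class_name: {property_name: [keywords]}}."""
    dom = minidom.parseString(owl_text)
    schema = {}
    for cls in dom.getElementsByTagNameNS(OWL_NS, "Class"):
        schema[cls.getAttributeNS(RDF_NS, "ID")] = {}
    for prop in dom.getElementsByTagNameNS(OWL_NS, "DatatypeProperty"):
        name = prop.getAttributeNS(RDF_NS, "ID")
        labels = [node.firstChild.data.strip()
                  for node in prop.getElementsByTagNameNS(RDFS_NS, "label")]
        for domain in prop.getElementsByTagNameNS(RDFS_NS, "domain"):
            owner = domain.getAttributeNS(RDF_NS, "resource").lstrip("#")
            schema.setdefault(owner, {})[name] = labels or [name]
    return schema


def build_rules(properties):
    """Turn each property's keywords into one 'keyword, then value' regex."""
    rules = {}
    for prop, keywords in properties.items():
        alternatives = "|".join(re.escape(k) for k in keywords)
        rules[prop] = re.compile(
            r"\b(?:%s)\b\s*(?:is|:|of)?\s*\$?([\w.,]+)" % alternatives,
            re.IGNORECASE,
        )
    return rules


def extract_records(text, rules):
    """Apply every rule to every sentence; one record per matching sentence."""
    records = []
    # Sentence splitting is a stand-in for real grammar parsing.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        record = {}
        for prop, pattern in rules.items():
            match = pattern.search(sentence)
            if match:
                record[prop] = match.group(1).rstrip(".,")
        if record:
            records.append(record)
    return records


if __name__ == "__main__":
    schema = parse_ontology(OWL_DOC)
    rules = build_rules(schema["Car"])
    sample = "The price: 5000 and the year is 2004. Another one has cost 7200."
    for record in extract_records(sample, rules):
        print(record)
```

In this sketch the ontology is the only domain-specific input: replacing `OWL_DOC` with a different ontology changes the generated rules without touching the extraction code, which reflects the ontology-driven design the abstract describes.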

Keywords/Search Tags:

Information Extraction, ontology, OWL

Related items

1	Ontology-Based Structured Information Extraction From Web Pages
2	Research On Ontology Evolution In Open Environments And Its Application In Information Extraction
3	Research On Ontology-Based Web Information Extraction Technology
4	Research On Semi-automatic Construction Of Application Ontology Based On Chinese UGC Information Source
5	Research On Ontology-based Product Information Extraction System
6	Research On Web Product Indicator Extraction Based On Ontology
7	Extraction Technology Research, Based On Ontology Can Be Customized Web Information Intelligence
8	Research Of Ontology-Based Information Extraction
9	Based On Ontology Stock Information Extraction System
10	Research On Web Information Extraction Technology Based On Ontology Of Petroleum Domain