Research On Web Information Extraction Based On Domain Ontology

Posted on:2010-10-10

Degree:Master

Type:Thesis

Country:China

Candidate:D R Kong

Full Text:PDF

GTID:2178360275989376

Subject:Computer application technology

Abstract/Summary:

With the development of the Internet, the volume of information on the network keeps growing, and the web has become an important source of information. However, it is difficult to find exactly the information one wants on the web, because web pages are unstructured, their content is diverse, and they change dynamically. Web information extraction technology provides a powerful tool for acquiring information: it transforms information expressed in various formats into a uniform representation, which addresses these problems.

This thesis first introduces the background, technical meaning, and basic applications of information extraction, and analyzes its architecture, key technologies, and evaluation metrics. It then introduces the fundamentals of ontologies, including ontology construction and parsing. On this basis, we propose a web information extraction method based on a domain ontology. On the one hand, the method automatically generates a matching model from the concepts, attributes, and hierarchical relations of the domain ontology. On the other hand, it performs syntactic analysis on the text obtained by preprocessing the web page, applies the extraction rules to that text, and finally writes the extraction results to a database as records so that they can be queried.

The main advantage of information extraction based on a domain ontology is that it does not depend on the structure of the web page. In addition, describing and expressing the extraction knowledge base with an ontology increases the semantic expressiveness of the extraction model, and storing the knowledge that matters for extraction in a specific domain greatly improves extraction accuracy.

Following this method and combining it with practice, we design and implement an information extraction system for a computer job-hunting letter ontology. The overall framework and main modules are described in detail. The concepts, attributes, and hierarchical relations obtained by parsing the ontology are used to construct an ontology model tree, and information is extracted from the unstructured text produced by preprocessing, for each object awaiting extraction, according to the structure of that tree. Finally, the experiment is presented and analyzed.
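
The pipeline described in the abstract (an ontology model tree built from concepts, attributes, and hierarchical relations, matched against preprocessed text, with results stored as database records) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `OntologyNode` class, the regular-expression form of the matching patterns, the toy job ontology, and the `extracted` table are all assumptions made for illustration only.

```python
import re
import sqlite3
from dataclasses import dataclass, field


@dataclass
class OntologyNode:
    """One concept in a hypothetical ontology model tree."""
    name: str
    # attribute name -> regex with one capture group (assumed pattern form)
    attributes: dict = field(default_factory=dict)
    # sub-concepts, representing the hierarchical relation
    children: list = field(default_factory=list)


def collect_patterns(node):
    """Walk the tree depth-first and build the matching model:
    'Concept.attribute' -> compiled regex."""
    patterns = {}
    for attr, regex in node.attributes.items():
        patterns[f"{node.name}.{attr}"] = re.compile(regex, re.IGNORECASE)
    for child in node.children:
        patterns.update(collect_patterns(child))
    return patterns


def extract_record(text, root):
    """Apply every pattern to the preprocessed text; keep the first match."""
    record = {}
    for key, pattern in collect_patterns(root).items():
        match = pattern.search(text)
        if match:
            record[key] = match.group(1).strip()
    return record


def save_record(conn, record):
    """Store the extracted fields as rows so they can be queried later."""
    conn.execute("CREATE TABLE IF NOT EXISTS extracted (field TEXT, value TEXT)")
    conn.executemany("INSERT INTO extracted VALUES (?, ?)", record.items())
    conn.commit()


# Toy ontology (assumed): a Job concept with a Requirement sub-concept.
job = OntologyNode(
    "Job",
    attributes={"title": r"position:\s*([^;\n]+)"},
    children=[
        OntologyNode(
            "Requirement",
            attributes={
                "degree": r"degree:\s*([^;\n]+)",
                "experience": r"experience:\s*(\d+)\s*years?",
            },
        )
    ],
)

# Stand-in for text obtained by web page preprocessing.
text = "Position: Java Developer; Degree: Bachelor; Experience: 3 years"
record = extract_record(text, job)
```

Because the patterns are derived from the ontology rather than from page markup, the same tree can be applied to text from differently structured pages, which is the independence from page structure that the abstract emphasizes. A real system would replace the regular expressions with the syntactic analysis and extraction rules the thesis describes.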

Keywords/Search Tags:

Domain Ontology, Information Extraction, Matching

Related items

1	Adaptive Web Information Extraction Method Research Based On Ontology
2	Ontology-Based Structured Information Extraction From Web Pages
3	Domain Ontology-based Web Information Extraction Technology
4	A Research On Chinese Information Extraction Based On Construction Of Domain Ontology
5	Research On Web Information Extraction Based On Domain Knowledge
6	Research On Related Technologies Of Domain Information Extraction
7	An Ontology-based Domain Information Collection And Its Application
8	Construction And Implementation Of Domain Ontology Based On Plain Text
9	Research On Key Technologies Of Ontology Construction Based On WordNet And Its Application In Security Domain
10	Domain Ontology Construction And Applied Research In The Web Information Extraction