Domain Ontology Construction And Applied Research In The Web Information Extraction

Posted on:2011-10-08

Degree:Master

Type:Thesis

Country:China

Candidate:C Huang

Full Text:PDF

GTID:2208360302970050

Subject:Computer application technology

Abstract/Summary:

Information extraction is an important research direction in natural language processing. Its purpose is to gather useful information and present it in a uniform form, so that people can access it more easily. As a natural language processing system, an information extraction system needs the support of a strong knowledge base. Because the structure and content of the knowledge base differ from one information extraction system to another, information extraction technology faces a knowledge bottleneck. As the shared knowledge of a specific domain, an ontology can provide the information needed for semantic annotation. Introducing an ontology into an information extraction system helps the system interpret the domain's concepts and the relationships between them in a unified way, and so provide more valuable information to users. This paper takes the domain ontology as its object of study and investigates the construction of a domain ontology and its application in an information extraction system, as follows.

Firstly, this paper surveys the current state of domain ontology applications in information extraction systems and of research on ontology construction methods, and sets the research goal of constructing a domain ontology and applying it in an information extraction system so as to exploit the semantic strengths of ontologies.

Secondly, this paper proposes a method for constructing a domain ontology that comprises determining the domain, extracting domain-specific concepts and the relationships among them, and editing and storing the ontology. To acquire concepts semi-automatically, we first mine domain texts for keywords, then apply an improved TF-IDF formula to select domain-specific terms from the keyword set, and obtain the ontology concepts by manually revising those terms. Relations between concepts are extracted using a WordNet-based approach together with a pattern learning method. Finally, we edit the domain ontology and obtain its final form with the Protégé tool.

Thirdly, we construct a small domain ontology for the mobile phone field, based on the information extraction platform, then combine ontology and information extraction technology and propose a text information extraction algorithm based on an OWL ontology. The algorithm consults the ontology as the knowledge frame of the domain. Its aim is to extract structured instances of that frame, composed of OWL semantic elements such as classes, properties, and individuals, to describe the extracted text information. Finally, this paper presents and analyzes the results of applying the algorithm to a sample of web pages from the mobile phone domain.
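
The abstract does not give the improved TF-IDF formula, so the concept-acquisition step can only be illustrated with a stand-in. The sketch below uses plain TF-IDF with a smoothed IDF computed over a separate background corpus, which is an assumption made here and not the thesis's method. The function names (tokenize, domain_term_scores, top_terms) and the idea of a background corpus are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (a crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def domain_term_scores(domain_docs, background_docs):
    """Score every term in the domain documents with plain TF-IDF.

    Assumption (not from the thesis): TF is the term's relative frequency
    over all domain documents, and IDF is computed over the background
    documents only, so words common in general text score low.
    """
    domain_tokens = [tok for doc in domain_docs for tok in tokenize(doc)]
    term_counts = Counter(domain_tokens)
    total_tokens = len(domain_tokens)

    background_vocab = [set(tokenize(doc)) for doc in background_docs]
    n_background = len(background_vocab)

    scores = {}
    for term, count in term_counts.items():
        # Number of background documents that contain the term.
        doc_freq = sum(1 for vocab in background_vocab if term in vocab)
        # Smoothed IDF: zero when the term occurs in every background document.
        idf = math.log((1 + n_background) / (1 + doc_freq))
        scores[term] = (count / total_tokens) * idf
    return scores


def top_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]
```

Calling domain_term_scores on a handful of mobile phone texts and a handful of general texts, then passing the result to top_terms, yields a ranked candidate list. In the workflow the abstract describes, such candidates would still be revised by hand before becoming ontology concepts.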

Keywords/Search Tags:

Information extraction, Domain ontology, Ontology construction, Mobile phone

Related items

1	Construction And Implementation Of Domain Ontology Based On Plain Text
2	Methodothology And Empirical Research On Domain Ontology
3	A Research On Chinese Information Extraction Based On Construction Of Domain Ontology
4	Research On Key Technologies Of Ontology Construction Based On WordNet And Its Application In Security Domain
5	Research On The Ontology Construction And Its Application In Pension Insurance Domain
6	Research On Method Of Data Sources Selection And Constructing Domain Ontology
7	Ontology-Based Structured Information Extraction From Web Pages
8	Research On Knowledgechains-based Ontology Construction Methodology
9	Adaptive Web Information Extraction Method Research Based On Ontology
10	Research On Semi-automatic Construction Of Application Ontology Based On Chinese UGC Information Source