Construction And Implementation Of Domain Ontology Based On Plain Text

Posted on:2017-05-04

Degree:Master

Type:Thesis

Country:China

Candidate:R Guo

Full Text:PDF

GTID:2348330512455428

Subject:Computer technology

Abstract/Summary:

With the advent of the big data era, Web pages contain huge amounts of important information. Domain ontologies, which can be extracted from the Web, are an indispensable part of the Semantic Web and can also be used for the intelligent management of vast bodies of knowledge. The extraction of a domain ontology can be divided into the following sub-tasks: 1) extraction of domain-specific terms; 2) extraction of domain concepts; 3) extraction of taxonomic and non-taxonomic relations.

In the past, the construction of domain ontologies depended on domain experts, so the process was time-consuming. In particular, with the development of the Internet, traditional methods cannot manage knowledge effectively. To reduce these costs, methods from natural language processing (NLP) and machine learning (ML) are often used to make the process more automatic.

This paper proposes and implements a new method for building Chinese domain ontologies. First, automatic crawler technology is used to collect Web news pages. Then domain terms, concepts, and taxonomic relationships are extracted. The main work is as follows:

1) A set of rules based on Chinese lexical and syntactic features is created to extract nouns and noun phrases as candidate domain terms. The TF-IDF (term frequency–inverse document frequency) and DR&DC (domain relevance and domain consensus) algorithms are then each used to extract terms.

2) Domain concepts are extracted from the domain terms using the log-likelihood ratio and information entropy algorithms. Terms that are very similar to domain concepts are found with the Word2Vec algorithm and used to expand the set of domain concepts. The accuracy of the final results is improved by using the definitions in online encyclopedias (Baidu Encyclopedia and Wikipedia), which describe the connotation and extension of each concept.

3) Taxonomic relationships are extracted using rule-based and statistics-based methods. First, some taxonomic relationships are extracted using lexico-syntactic patterns and a suffix matching algorithm. Second, more taxonomic relations are extracted using similarity algorithms based on the vector space, Word2Vec, and the degree of refinement.
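
Two of the steps named in the abstract, TF-IDF term scoring and suffix matching for taxonomic relations, can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which the abstract does not specify. The function names `tf_idf` and `suffix_hypernyms` are hypothetical, the IDF formula is the plain unsmoothed log(N/df) variant, and the suffix matcher assumes that a known term which is a proper character-level suffix of a longer Chinese term is a candidate hypernym of it.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score candidate terms per document (hypothetical helper).

    docs: list of documents, each a list of already-extracted candidate
    terms (e.g. nouns and noun phrases). Returns one dict per document
    mapping term -> TF-IDF score, using unsmoothed idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def suffix_hypernyms(terms):
    """Propose (hyponym, hypernym) pairs by suffix matching.

    Assumption: if a known term is a proper suffix of a longer term,
    the shorter term is a candidate hypernym. The longest matching
    suffix is chosen for each term.
    """
    term_set = set(terms)
    pairs = []
    for term in terms:
        # Try proper suffixes from longest to shortest.
        for start in range(1, len(term)):
            suffix = term[start:]
            if suffix in term_set:
                pairs.append((term, suffix))
                break
    return pairs


if __name__ == "__main__":
    docs = [["ontology", "ontology", "web"], ["web", "crawler"]]
    print(tf_idf(docs))
    # 病毒 = virus, 计算机病毒 = computer virus,
    # 木马病毒 = Trojan virus, 防火墙 = firewall
    print(suffix_hypernyms(["病毒", "计算机病毒", "木马病毒", "防火墙"]))
```

In this toy input, a term that occurs in every document ("web") gets a score of zero, which is why TF-IDF alone tends to favor terms concentrated in few documents. The suffix matcher only proposes candidates; pairs found this way would still need filtering, for example by the statistical similarity measures the abstract mentions.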

Keywords/Search Tags:

Domain ontology, Ontology learning, Term extraction, Concept extraction, Taxonomic relationship extraction

Related items

1	The Application Research Of A Non-Taxonomic Relation Extraction Method Of Ontology
2	Discipline Ontology Learning And Semantic Annotation For Scientific Resources
3	Automatic Extraction Of Conceptual Relations For Constructing Domain-Specific Ontology
4	Research On Domain Ontology Concept Extraction And Relation Extraction
5	Research On Ontology Learning Methods For Text
6	Research On Concept And Relation Extraction Of Chinese Domain Ontology
7	Automatic Extraction Of Uyghur Ontology Concept Classification Relationship Based On Seed Bootstrap
8	Research On Key Technologies Of Ontology Construction Based On WordNet And Its Application In Security Domain
9	Research On Domain Ontology Learning Based On Chinese Texts
10	Research On Non-Taxonomic Relationships Learning Based On Domain Concept Knowledge