Research On The Key Technology Of Domain Ontology Coverage Evaluation

Posted on: 2013-09-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L B Ou
Full Text: PDF
GTID: 1268330401979183
Subject: Computer application technology
Abstract/Summary:
As a conceptual model for describing information at the semantic level, domain ontology is becoming increasingly important in intelligent information retrieval, knowledge acquisition, natural language understanding, and Web information processing. A large number of domain ontologies have been constructed, but their quality is uneven because of uncertain construction principles, inconsistent building methods, diverse construction tools, and the differing levels of domain knowledge among ontology engineers. At the same time, as new domain knowledge and new applications emerge, domain ontologies must evolve continually so that they cover the new knowledge in a timely manner. How to evaluate the quality of a domain ontology is therefore very important to application systems.

Domain ontology coverage is one of the important evaluation indexes of domain ontology content. It comprises concept coverage and relation coverage, which reflect how completely the ontology contains the concepts and relations of the domain. It also determines how relevant the ontology is to the domain. Measurements of concept coverage and relation coverage provide a reliable basis for domain ontology learning and evolution, and point to specific improvements for further refining ontology content. Coverage evaluation against a gold standard is an effective approach, but no absolute gold standard exists. Extracting the set of domain concepts and the set of domain relations from a large-scale corpus to serve as a relative gold standard is a realistic alternative. Following this idea, this dissertation studies domain ontology coverage evaluation.
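To make the notion of coverage concrete, the following is a minimal sketch and not the dissertation's algorithm. It assumes coverage is the fraction of items in the relative gold standard that also appear in the ontology, with concepts compared as exact strings and relations as exact triples. The function name `coverage` and all toy data are hypothetical.

```python
def coverage(ontology_items, gold_items):
    """Fraction of gold-standard items that also appear in the ontology.

    Assumption: items are compared by exact equality (no synonym merging
    or fuzzy matching, which a real evaluation would likely need).
    """
    gold = set(gold_items)
    if not gold:
        raise ValueError("gold standard must not be empty")
    return len(gold & set(ontology_items)) / len(gold)


# Hypothetical toy data for a software engineering domain.
gold_concepts = {"requirement", "design", "test case", "software process"}
onto_concepts = {"requirement", "design", "compiler"}

gold_relations = {
    ("test case", "verifies", "requirement"),
    ("design", "realizes", "requirement"),
}
onto_relations = {("design", "realizes", "requirement")}

concept_cov = coverage(onto_concepts, gold_concepts)     # 2 of 4 gold concepts
relation_cov = coverage(onto_relations, gold_relations)  # 1 of 2 gold relations
print(concept_cov, relation_cov)  # prints: 0.5 0.5
```

Under this definition, concepts that appear only in the ontology (such as "compiler" above) do not lower the score; whether such extras should be penalized is a separate design choice.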
The main work is as follows:
(1) The evaluation indexes and measurement methods for domain ontologies are analysed, and ontology content evaluation indexes are classified and integrated from four perspectives (Breadth, Depth, Horizon, and Longitude) to build a framework for domain ontology content evaluation named BDHL. An extensible evaluation index tree that users can customize is designed. The analysis shows that coverage evaluation is the basis of the other evaluation indexes. A process model for domain ontology content evaluation is then given.
(2) Existing methods are not accurate enough at extracting domain concepts from large-scale domain corpora; in particular, they cannot identify compound concepts effectively. This dissertation proposes a compound concept extraction method based on a hybrid model. First, the corpus texts are segmented and each term is given an entry label; noise words are removed and synonyms are merged in the entry set. Then the weighted term frequency, the location affinity degree, and the location matching degree are computed, and a stepwise estimation identifies compound concepts composed of atomic terms. Finally, multi-level compound concepts are extracted by setting different compound depths. Based on this extraction method, documents related to software engineering were obtained from HowNet, and compound concept extraction experiments were carried out on three different corpora.
The results indicate that the method achieves high recall and precision.
(3) When extracting relations between concepts from domain text, statistics-based approaches can only determine that some unnamed relation exists between two concepts; they cannot determine the specific relation name. Given an annotated domain corpus and a completed domain concept set, this dissertation proposes a domain concept relation extraction model (DCREM) that can effectively determine the relations between domain concepts and obtain the specific relation names. First, location affinity, support, and confidence are used to determine whether a relation exists between domain concepts, and a statistical decision tree model is used to determine the predicate center word of the sentence. Next, the sentence is parsed according to a dependency rule library to obtain its dependency relation tree, and the model judges whether the domain concepts are supported by the predicate center word. Finally, based on the dependencies of the domain concepts, the domain concepts and predicate center words that fit the <subject, predicate, object> structure are extracted to form relational triples of domain concepts. With the domain corpus and domain concepts of software engineering as experimental subjects, the experimental results show that this relation extraction method achieves good recall and accuracy on simple sentences.
(4) Building on the above work, the relative gold standard was obtained by extracting the set of domain concepts and the set of domain relations from a large-scale software engineering corpus, and the sets of concepts and relations were obtained from several software engineering ontologies. Algorithms for evaluating the concept coverage and relation coverage of a domain ontology were then designed.
The experimental results show that the degree of ontology coverage reflects the domain relevance of an ontology. How to select the domain corpus and how to extract domain relations in complex contexts remain topics for further study. Moreover, building on ontology coverage evaluation, how to rank ontologies by relevance analysis and how to evaluate the cohesion and coupling of domain ontologies based on the intersection of domains are interesting directions for future work.
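As a rough illustration of the support and confidence test mentioned in item (3), the sketch below counts sentence-level co-occurrence of concept pairs. It is an assumption-laden simplification: it uses the standard association-rule definitions of support and confidence, matches concepts by plain substring search, and omits the location affinity measure and the dependency parsing that DCREM also relies on. The function name, thresholds, and sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(sentences, concepts):
    """Map each ordered concept pair (a, b) to (support, confidence of a -> b).

    support    = sentences containing both a and b / all sentences
    confidence = sentences containing both a and b / sentences containing a
    """
    concepts = sorted(set(concepts))
    total = len(sentences)
    single = Counter()  # sentences mentioning each concept
    pair = Counter()    # sentences mentioning both concepts of a pair
    for sentence in sentences:
        text = sentence.lower()
        present = [c for c in concepts if c in text]  # naive substring match
        single.update(present)
        pair.update(combinations(present, 2))
    stats = {}
    for (a, b), both in pair.items():
        stats[(a, b)] = (both / total, both / single[a])
        stats[(b, a)] = (both / total, both / single[b])
    return stats


# Hypothetical toy corpus and concept set.
sentences = [
    "A test case verifies a requirement.",
    "Each requirement is traced to a design element.",
    "A test case has an expected result.",
    "The design satisfies the requirement.",
]
stats = cooccurrence_stats(sentences, ["test case", "requirement", "design"])

# Hypothetical thresholds: keep pairs that are both frequent and reliable.
candidates = [p for p, (s, c) in stats.items() if s >= 0.5 and c >= 0.9]
print(candidates)  # prints: [('design', 'requirement')]
```

A pair that passes such a test is only a candidate for an unnamed relation; naming the relation is the part the abstract assigns to the predicate center word and the dependency analysis.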
Keywords/Search Tags: domain ontology, ontology evaluation, coverage, concept extraction, relation extraction, domain relevance