
The Evaluation Of RDF Is-a Relationship Enrichment Method

Posted on: 2017-01-01  Degree: Master  Type: Thesis
Country: China  Candidate: B B He  Full Text: PDF
GTID: 2308330488973451  Subject: Computer technology
Abstract/Summary:
The is-a relationship is the basic unit of knowledge in RDF datasets and plays a key role in knowledge-based applications and systems. However, is-a relationships are almost always incomplete, whether the knowledge base is constructed manually or automatically. To address this incompleteness, researchers have proposed a variety of statistical approaches for enriching is-a relationships. Because the underlying statistical models have different characteristics, these methods perform differently on different datasets. For a given RDF dataset, the question of how to choose an approach for completing is-a relationships is therefore worth studying.

Existing work can be roughly divided into two categories:
(1) Manual selection: each method and the given dataset are analyzed by hand, and the most suitable method is chosen for the dataset. However, the performance of each method cannot be fully quantified, and the evaluation is susceptible to subjective factors.
(2) Automatic selection: each method is evaluated on the complete dataset and quantified by three indexes: precision, recall, and F1-score. The drawback of automatic selection is its low efficiency, especially when the dataset is extremely large.

To solve these problems, we propose a novel approach based on matrix decomposition for evaluating the performance of different is-a relationship completion methods on large-scale RDF datasets. The proposed approach gives knowledge base engineers an efficient way to choose a suitable is-a relationship completion method. Specifically, this thesis makes the following contributions:
(1) We propose three metrics, weighted precision, weighted recall, and weighted F-value, to evaluate the performance of is-a relationship completion methods. These metrics describe, respectively, the precision, the completeness, and the harmonic mean of the two when a method predicts positive cases over the whole ontology, which helps users and knowledge base builders choose a suitable approach for the given data.
(2) We propose an efficient approach for evaluating the performance of multiple is-a relationship completion methods with the above metrics. The approach first evaluates each method on a small sample of the whole dataset, and then predicts the performance of these methods on the entire dataset by matrix decomposition. This avoids the inefficiency caused by evaluating on a large-scale dataset.
(3) We conduct experiments to evaluate the performance of six is-a relationship completion methods on different RDF datasets. The results demonstrate that the proposed approach can evaluate is-a relationship completion methods efficiently, accurately, and objectively. The experiments also show that the higher the matrix density, the smaller the prediction error.
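The sample-then-predict idea in contribution (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not define the weighted metrics or the layout of the matrix, so the code below assumes the standard unweighted precision, recall, and F1, and assumes a hypothetical score matrix whose rows are completion methods and whose columns are parts of the dataset. Entries measured on a sample are marked as observed, and the remaining entries are predicted by a low-rank factorization fitted with plain gradient descent. The function names, the rank, the learning rate, and the toy data are all illustrative assumptions.

```python
import numpy as np


def precision_recall_f1(predicted, actual):
    """Standard (unweighted) precision, recall, and F1 over sets of is-a pairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def complete_scores(scores, observed, rank=1, steps=5000, lr=0.05, reg=0.01, seed=0):
    """Fit scores ~ P @ Q.T on observed entries only, then return the full product.

    scores   : (methods x parts) array; values where observed is False are ignored
    observed : boolean array of the same shape, True where a score was measured
    """
    rng = np.random.default_rng(seed)
    n_methods, n_parts = scores.shape
    P = rng.normal(scale=0.1, size=(n_methods, rank))
    Q = rng.normal(scale=0.1, size=(n_parts, rank))
    for _ in range(steps):
        # Residual on measured entries; unmeasured entries contribute nothing.
        residual = np.where(observed, scores - P @ Q.T, 0.0)
        grad_P = residual @ Q - reg * P
        grad_Q = residual.T @ P - reg * Q
        P += lr * grad_P
        Q += lr * grad_Q
    return P @ Q.T


# Toy data (assumed): 3 methods, 4 dataset parts, scores with rank-1 structure.
method_quality = np.array([0.9, 0.6, 0.3])
part_ease = np.array([1.0, 0.8, 0.5, 0.9])
true_scores = np.outer(method_quality, part_ease)

# Pretend three of the twelve scores were never measured.
observed = np.ones_like(true_scores, dtype=bool)
observed[0, 3] = observed[1, 1] = observed[2, 0] = False
sampled = np.where(observed, true_scores, np.nan)

predicted = complete_scores(sampled, observed)
print(np.round(predicted, 2))
```

In this sketch, hiding more entries lowers the density of the observed matrix and leaves the factorization with less to fit, which is in line with the reported trend that prediction error shrinks as matrix density grows.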
Keywords/Search Tags: RDF, Is-a relationship, Matrix factorization