Research And Implementation Of Large Scale Rule-Based Backward Chaining Reasoning For The Semantic Web

Posted on:2017-07-15

Degree:Master

Type:Thesis

Country:China

Candidate:S Y Wang

Full Text:PDF

GTID:2428330485960809

Subject:Computer technology

Abstract/Summary:

With the rapid growth of semantic data in recent years, forward chaining reasoning, which works well on static semantic data, has gradually exposed its flaws. To keep the inferred results complete, a forward chaining reasoner must re-run reasoning every time the data is updated, which makes it inefficient. Backward chaining reasoning, which is insensitive to data updates, has therefore become a new research direction. Backward chaining is a goal-driven method: when a query arrives, it infers the results according to the given rule set. It is more complex than forward chaining, and because it runs at query time, its time overhead is larger than that of a plain query. This is the greatest obstacle to making it practical. Today, most existing backward chaining reasoning systems are subordinate components of an RDF storage and query system, and their reasoning ability is relatively weak. Because of its complex reasoning procedure, the deep search space of rule expansions, and the difficulty of parallelizing it, backward chaining reasoning has so far failed to achieve efficient reasoning performance or high scalability. Building on existing backward chaining techniques and an in-depth analysis of the semantic rule sets, we propose efficient and scalable large-scale parallel backward chaining reasoning methods for both RDFS and OWL rule sets on top of Spark. The main research work of this thesis falls into three parts.

First, we analysed in depth the procedure of backward chaining reasoning and its dependency on semantic data at each stage, and then designed a strategy that computes the terminological closure before real-time reasoning. Semantic data differs from general web data in that it comes with a domain-related ontology, which describes the relationships among concepts in a specific domain. Ontology data is usually small, and the rapid growth of semantic data increases the size of the instance data but not of the ontology data. In a rule set, a useful rule has at least one ontology triple among its antecedents. The expanded patterns may therefore contain many duplicate terminological patterns, and recomputing these duplicates consumes a lot of time. We pre-compute the terminological closure and reuse it in the real-time reasoning stage. This removes many duplicate patterns and, as a result, reduces the reasoning time considerably.

Second, we designed separate optimizations for the reverse reasoning procedure, the querying procedure and the forward reasoning procedure of a backward chaining reasoning process. All of them contribute to the performance improvement of backward chaining reasoning. In the reverse reasoning procedure, we designed a strategy that prunes useless reasoning branches according to the data dependencies between reasoning levels, and an optimized pattern selection function that determines the best execution order. In the querying procedure, we designed an RDD-based storage model and a strategy that uses a pre-shuffle technique to skip unrelated data during global scans. In the forward reasoning procedure, we designed a binding propagation method and a free variable method to optimize full query patterns, an optimization against redundant reasoning patterns that reduces recomputation and duplicate data, and an optimization of join operations that reduces I/O and network overhead.

Finally, we designed and implemented our backward chaining reasoning algorithm on top of Spark. Spark is one of the most popular big data processing frameworks because of its good fault tolerance, high scalability and simple deployment, so a method built on Spark is widely applicable. Experimental results on both synthetic and real-world datasets show that our method achieves reasoning times from several seconds to tens of seconds on large-scale datasets of hundreds of millions of triples, as well as high data scalability and node scalability.
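
The idea of pre-computing the terminological closure and reusing it at query time can be illustrated with a minimal sketch. It is not the thesis's algorithm: it runs in plain Python rather than on Spark, covers only the RDFS rules rdfs11 (transitivity of `rdfs:subClassOf`) and rdfs9 (type inheritance), and the function names and the `ex:` triples are hypothetical.

```python
from collections import defaultdict

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def terminological_closure(schema_triples):
    """Transitive closure of rdfs:subClassOf (rule rdfs11), computed once up front."""
    supers = defaultdict(set)
    for s, p, o in schema_triples:
        if p == SUBCLASS:
            supers[s].add(o)
    changed = True
    while changed:  # iterate until no class gains a new superclass
        changed = False
        for cls in list(supers):
            for parent in list(supers[cls]):
                new = supers.get(parent, set()) - supers[cls]
                if new:
                    supers[cls] |= new
                    changed = True
    return supers


def instances_of(cls, instance_triples, closure):
    """Answer the goal (?x rdf:type cls) by backward chaining over rule rdfs9.

    The goal expands to (?x rdf:type ?c) with ?c a subclass of cls. The
    subclasses come from the precomputed closure, so only a single scan of
    the instance data happens at query time.
    """
    targets = {c for c, parents in closure.items() if cls in parents} | {cls}
    return {s for s, p, o in instance_triples if p == TYPE and o in targets}


# Hypothetical example data: a small ontology and a few instance triples.
schema = [
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
]
instances = [
    ("ex:alice", TYPE, "ex:Student"),
    ("ex:bob", TYPE, "ex:Person"),
    ("ex:acme", TYPE, "ex:Organization"),
]

closure = terminological_closure(schema)
agents = instances_of("ex:Agent", instances, closure)
```

Because the ontology is small and changes rarely, the closure can be kept across queries, while updates to the instance triples need no re-reasoning: the next query simply scans the current data.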

Keywords/Search Tags:

Semantic Web, Backward Chaining, Rule-Based Reasoning, RDFS, OWL, Spark

Related items

1	Research And Implementation Of Large Scale Rule-Based Reasoning For The Semantic Web
2	A Spark Based Semantic Reasoning Engine And Its Application
3	Research On RDFS Ontology Debugging Using Distributed Computing Technologies
4	The Research Of Using RIF To Describe The Semantic Web Rule And Reasoning RIF
5	Research And Application Of Elastic Semantic Reasoning For Large-scale Knowledge Graph
6	Efficient Real-time Semantic Data Stream Processing Based On Forward And Backward Chain Reasoning
7	Research On Semantic Search Based On Ontology Repository Reasoning
8	Reasoning Service For Rule-based Tourism Ontology
9	Enhancing Rule-based Ontology Reasoning On Spark
10	Research On Context-Aware Computing Technology And Its Application In Semantic Web Services