Research Of Uncertainty RDF Keyword Search Based On R-clique

Posted on:2015-07-31

Degree:Master

Type:Thesis

Country:China

Candidate:S Zhang

Full Text:PDF

GTID:2308330482954488

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

RDF which is short for Resource Description Framework is the basic markup language of the semantic Web net. RDF is widely used in many fields. Because the method of extracting ontology, the method of marking resource and the technology of measurement usually has errors and noises, uncertainty RDF data are widespread. In recent years, the research on querying uncertainty RDF data has gradually become a hot spot. Since uncertainty RDF data can be modeled as uncertainty RDF graphs, the research on keyword querying over uncertainty RDF data is actually the research on keyword querying over uncertainty graphs. On the basis of the existing research, this thesis proposes two algorithms based on r-clique for keyword querying on uncertainty RDF graphs:KSABR and HABR.The word "clique" whose meaning is "maximal group, sub-group" stands for the sub-graph of an uncertainty graph.The letter "r" is a variablestanding for the distance threshold value. So "r-clique" stands for the sub-graph which contains all the query keyword nodes and the distance between any two nodes in it is not greater than the given value r. In order to improve the query speed, this thesis proposes an approximation algorithm for building r-cliques which produces r-cliques with 2-approximation in polynomial delay.KSABR (Keyword Search Algorithm Based on r-clique) maps the research on keyword querying over uncertainty RDF graphs to the research on finding r-clique over uncertainty graphs. In order to improve the quality of query results, this thesis proposes an algorithm with the higher precision on the basis of KSABR:HABR (High-precision Algorithm Based on r-clique). HABR utilizes the scoring function to sort results, and then return the top-k results to users.In order to accelerate the speed, two index structures are being designed:KI (Keyword Inverted Index) and PI (Probabilistic Inverted Index). KI is used to store the mapping relationship between keywords and keyword nodes, and implement structural pruning and probabilistic pruning. PI is used to store the mapping relationship between keywords and r-cliques, and help generate the function score ().The experiment proves that the algorithm KSABR proposed by this thesis has a good performance on response time and the algorithm HABR has a good performance both on the response time and the quality of results.

Keywords/Search Tags:

uncertainty RDF graphs, keyword search, r-clique, pruning technique, top-k query

PDF Full Text Request

Related items

1	Caching And Query Pruning Strategy For Efficient Search Engine Query Execution
2	Research On Aggregate Keyword Routing Query In Spatial Databases
3	Research And Implementation Of Keyword Query Technique In A Seamless Intergrated Relation-XML Database System
4	Query By Example On Multiple Knowledge Graphs
5	Research On Query Expansion In Database Keyword Search
6	Efficient TOP-K Keyword Search On XML Sreams
7	Query Suggestion Techniques For Keyword Search
8	Approximate Query Method Based On Relational Database Keyword Semantic Research
9	Keyword Query Over The Database Based On User Feedback
10	Research On Probabilistic XML Keyword Search Processing Method