The Research Of Distributed Indexing Scheme For Large-scale Semantic Data Based On Linked Data

Posted on: 2013-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2268330392970603
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Linked Data project, an enormous amount of RDF data has been published on the Web. A scalable system is required to provide efficient retrieval over large-scale RDF data.

This paper presents a distributed inverted indexing scheme for large-scale RDF data and adds a semantic factor to the traditional information retrieval ranking model, in order to provide users with a keyword search service over RDF data. A scalable inverted index is built on the underlying data structure of Cassandra, a distributed key-value storage system. We optimize the indexing scheme using the characteristics of the RDF data model to support fast keyword search effectively. The loading, encoding, and indexing procedures for RDF data are implemented to run simultaneously using the MapReduce framework. A query mode with secondary keywords enables the system to identify the user's query intent intelligently. Classes in OWL ontologies are encoded using ORDPATHs, so that the inheritance relationship between classes is reflected directly at the encoding level. A distributed inverted index is created for the TBox, which makes it possible to rank classes according to the secondary keyword, and the definition and formula of the TreeRank semantic ranking algorithm are given.

In summary, the retrieval scheme can create indexes efficiently for RDF data and sort query results using a semantic ranking algorithm, providing users with semantic retrieval services for large-scale RDF data. This work offers some guidance for research on the Semantic Web.
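To illustrate how a hierarchical code can expose inheritance between classes at the encoding level, the following is a minimal sketch, not the thesis's implementation. It assumes a single-inheritance class tree and uses a simplified ORDPATH-style label (a tuple of odd ordinals, one per tree level); the class names, the `assign_ordpaths` and `is_subclass` helpers, and the sample hierarchy are all hypothetical. Full ORDPATH also reserves even ordinals for later insertions and uses a compressed binary form, both of which are omitted here.

```python
from collections import defaultdict


def assign_ordpaths(subclass_of, root):
    """Label each class with a simplified ORDPATH-style tuple.

    subclass_of is a list of (child, parent) pairs; a single-inheritance
    tree is assumed (an illustrative simplification).
    """
    children = defaultdict(list)
    for child, parent in subclass_of:
        children[parent].append(child)

    labels = {root: (1,)}
    stack = [root]
    while stack:
        node = stack.pop()
        # Children get odd ordinals 1, 3, 5, ... appended to the parent label.
        for i, child in enumerate(sorted(children[node])):
            labels[child] = labels[node] + (2 * i + 1,)
            stack.append(child)
    return labels


def is_subclass(labels, cls, ancestor):
    """A class is a descendant iff the ancestor's label is a proper prefix."""
    a, c = labels[ancestor], labels[cls]
    return len(c) > len(a) and c[:len(a)] == a


def to_string(label):
    """Render a label in the usual dotted form, e.g. (1, 3, 3) -> '1.3.3'."""
    return ".".join(map(str, label))


# Hypothetical TBox fragment: (subclass, superclass) pairs.
HIERARCHY = [
    ("Person", "Agent"),
    ("Organization", "Agent"),
    ("Student", "Person"),
    ("Professor", "Person"),
]
LABELS = assign_ordpaths(HIERARCHY, "Agent")

if __name__ == "__main__":
    for name in sorted(LABELS, key=LABELS.get):
        print(to_string(LABELS[name]), name)
    print(is_subclass(LABELS, "Student", "Agent"))
```

With labels of this kind, a subclass test reduces to a prefix comparison on the codes, so no traversal of the ontology is needed at query time.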
Keywords/Search Tags: RDF, distributed indexing, MapReduce, semantic ranking