Research Of Keyword Search On Large Scale RDF Data

Posted on: 2014-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: L J Wang
Full Text: PDF
GTID: 2348330473953929
Subject: Computer application technology
Abstract/Summary:
RDF (Resource Description Framework) is the basic data model of the Semantic Web and is widely applied in knowledge organization and management and in social networks. The volume of RDF data grows rapidly with the growth of Semantic Web applications. RDF data has a typical graph model, a complex structure, and a large amount of textual information, so a number of studies focus on how to process keyword queries on RDF data efficiently. To improve on the query performance and result quality of existing related work, this thesis proposes RAGS, a novel keyword search approach for large-scale RDF data based on approximately solving the group Steiner tree problem.

In RAGS, a keyword search on RDF data is translated into a group Steiner tree problem, which is then solved by reducing it to a minimum Steiner tree problem. Because some traditional minimum Steiner tree approaches are not safe under this reduction, an improved approach is proposed, and its time complexity and approximation ratio are analyzed.

To make keyword search more user-friendly on large-scale RDF data, a shortest-path triple inverted index is designed. It improves the performance of real-time online keyword search by precomputing all-pairs shortest paths offline. Furthermore, a top-k search algorithm that responds to user queries rapidly and accurately is proposed, based on generating spanning trees in increasing order of cost.

The time cost of index construction is the primary bottleneck for large-scale RDF data. Therefore, a distributed all-pairs shortest path algorithm based on BSP (Bulk Synchronous Parallel) is proposed to speed up index construction.

The experimental results show that RAGS achieves better response time and result quality, and that the BSP-based distributed algorithm scales well.
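
The problem formulation in the abstract can be made concrete with a small sketch. Each keyword defines a group of matching nodes, and an answer is a low-cost connected subgraph that touches at least one node from every group. The code below is not RAGS and does not perform the reduction to a minimum Steiner tree. It swaps in a much simpler "best root" heuristic built on shortest paths, purely to illustrate the inputs and outputs. The sample triples, the unit edge weights, the substring matching rule, and all function names are assumptions made for this illustration.

```python
import heapq


def build_graph(triples):
    """Treat each (subject, predicate, object) triple as an undirected unit-weight edge."""
    adj = {}
    for s, _p, o in triples:
        adj.setdefault(s, {})[o] = 1
        adj.setdefault(o, {})[s] = 1
    return adj


def match(adj, keyword):
    """Hypothetical matching rule: a node matches if its label contains the keyword."""
    return {n for n in adj if keyword in n}


def multi_source_dijkstra(adj, sources):
    """Distance from every node to its nearest source, plus next-hop pointers."""
    dist = {s: 0 for s in sources}
    hop = {}  # hop[v] = neighbour of v that is one step closer to a source
    heap = [(0, s) for s in sorted(sources)]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                hop[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, hop


def keyword_answer(adj, groups):
    """Pick the root with the smallest total distance to all groups, then join the paths."""
    searches = [multi_source_dijkstra(adj, g) for g in groups]
    candidates = [n for n in adj if all(n in dist for dist, _ in searches)]
    if not candidates:
        return None  # some keyword has no match, or no node reaches every group
    root = min(candidates, key=lambda n: (sum(dist[n] for dist, _ in searches), n))
    edges = set()
    for _dist, hop in searches:
        node = root
        while node in hop:  # source nodes have no hop entry, so the walk stops there
            edges.add(frozenset((node, hop[node])))
            node = hop[node]
    cost = sum(adj[u][v] for u, v in map(tuple, edges))
    return root, edges, cost


triples = [
    ("paper1", "author", "alice"),
    ("paper1", "topic", "rdf"),
    ("paper2", "author", "alice"),
    ("paper2", "topic", "steiner"),
    ("paper2", "cites", "paper1"),
]
adj = build_graph(triples)
groups = [match(adj, kw) for kw in ("alice", "steiner")]
root, edges, cost = keyword_answer(adj, groups)
print(root, cost)  # prints: alice 2
```

The union of shortest paths returned here is connected and covers every keyword group, but it may contain redundant edges and carries no approximation guarantee of the kind the thesis analyzes. In an indexed setting, the per-query Dijkstra runs would presumably be replaced by lookups into precomputed shortest-path data.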
Keywords/Search Tags: RDF, keyword search, large scale, approximation, group Steiner tree