Research on a Distributed Storage and Indexing Scheme for Large-Scale RDF Knowledge Graphs

Posted on: 2020-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Xu
GTID: 2518306518966829
Subject: Computer technology
Abstract/Summary:
Knowledge graphs are well suited to describing the entities and relationships of the real world, and they are a key technology in artificial intelligence. Research interest in them continues to grow as artificial intelligence develops. The Resource Description Framework (RDF), a standard data format for describing large-scale knowledge graphs, has come into wide use across many fields along with the rise of knowledge graphs. For example, it helps search engines find answers that better match users' needs and refine the description of a specific thing. This extensive use has caused the volume of RDF data to grow steadily, so an efficient data management system is urgently needed. At present, however, most RDF data management systems, whether relational or native, either achieve fast data management only on a single machine or rely on excessive storage overhead, which is not enough to keep pace with the growth of the data. It is therefore necessary to design an optimized scheme for RDF data in a distributed environment.

This thesis proposes a distributed storage and indexing scheme for RDF knowledge graph data, named RDFSIS, and a corresponding query optimization strategy, named QOS. The goal is to manage very large RDF datasets efficiently on a distributed system with moderate storage overhead. The RDFSIS scheme combines three methods: extracting relationships to construct entity indexes, ontology partitioning, and connection classification processing. These methods extract the relationships between data entities to construct entity class indexes; process the data internal to each entity, construct predicate indexes, and strengthen the relevance of data within an entity; and use connections to reduce the complexity of data operations and store the data optimally. The QOS strategy includes a query predicate positioning algorithm and a query execution optimization algorithm. It uses the indexes to narrow the scope of data retrieval and rewrites the query to reduce connection (join) complexity, which speeds up data retrieval and thereby supports the overall performance of the RDF data management system.

Comparative experiments verify the feasibility of the scheme on both a synthetic dataset and a real dataset. Extensive experiments across datasets of different types and scales, and across different types of queries, show that the scheme has an optimizing effect. Its query efficiency is higher than that of the original system, which demonstrates the practical usability of the proposed scheme for RDF knowledge graph data on large-scale distributed systems.
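As a rough illustration of the general idea of entity class indexes and predicate indexes, the following Python sketch builds both over a handful of in-memory triples and uses them to narrow a lookup. It is not the RDFSIS scheme or the QOS algorithms themselves, whose details the abstract does not give: the function names `build_indexes` and `lookup`, the `ex:` identifiers, and the single-machine dictionary layout are all assumptions made for this example.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # predicate that assigns an entity to a class


def build_indexes(triples):
    """Build two hypothetical indexes from (subject, predicate, object) triples.

    class_index:     class -> set of entities declared with rdf:type
    predicate_index: predicate -> list of (subject, object) pairs
    """
    class_index = defaultdict(set)
    predicate_index = defaultdict(list)
    for s, p, o in triples:
        if p == RDF_TYPE:
            class_index[o].add(s)
        else:
            predicate_index[p].append((s, o))
    return class_index, predicate_index


def lookup(class_index, predicate_index, cls, predicate):
    """Return (subject, object) pairs for `predicate` whose subject is in `cls`.

    Only the pairs stored under the requested predicate are scanned,
    instead of every triple in the dataset.
    """
    members = class_index.get(cls, set())
    return [(s, o) for s, o in predicate_index.get(predicate, []) if s in members]


# Illustrative data only; these identifiers do not come from the thesis.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:acme", "rdf:type", "ex:Company"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:Tianjin"),
    ("ex:bob", "ex:knows", "ex:alice"),
]

class_index, predicate_index = build_indexes(triples)
result = lookup(class_index, predicate_index, "ex:Person", "ex:worksFor")
print(result)
```

In a distributed setting, indexes of this kind would presumably be partitioned across nodes. The sketch leaves that out and shows only how locating the predicate first shrinks the set of triples a query has to touch.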
Keywords/Search Tags: RDF knowledge graphs, Store and Index, Distributed System, Data Management