The Design And Application Of Storage And Query Of Big Data Based On Cloud Platform

Posted on:2018-12-04

Degree:Master

Type:Thesis

Country:China

Candidate:X L Luo

GTID:2348330518496340

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of the Internet, the volume of data is surging, and most of it is heterogeneous. The successful application of the Knowledge Graph in information search has promoted research on the fusion, storage and query of heterogeneous data. An ontology uses unique identifiers to mark resources on the Internet and can add properties to each resource or build relations between different resources, which makes it flexible and extensible. With the Semantic Web booming, ontology is recognized as an effective solution and is widely used to express heterogeneous data, and a growing body of computer science research addresses ontology-based data management and application. Traditional information storage methods classify heterogeneous data into different tables according to datatype, which leads to information loss. As network size and the number of data sources increase, traditional databases and stand-alone environments struggle to support the storage and query of big data. More and more research based on cloud platforms and distributed systems has therefore been proposed to solve this problem. Although research based on distributed systems is not yet mature, it has great research significance and development prospects.

Based on the cloud platform Hadoop and the non-relational database HBase, this paper studies the integration, storage and query of massive heterogeneous data. The main work is as follows.

1. As the basis of the subsequent distributed storage and query, the fusion of multi-source heterogeneous data is realized. Using the parallel computing framework MapReduce, this paper implements parallel ontology construction and fusion. In the construction phase, multi-source data are built into corresponding ontologies. In the fusion phase, the individual ontologies are fused into a single, semantically rich ontology.

2. With the explosive growth of data, the bottleneck of traditional storage methods is increasingly prominent in two areas: data import and memory requirements. Drawing on recently proposed distributed RDF data storage schemes, this paper proposes a storage model based on HBase that takes both occupied storage space and query response speed into account.

3. Based on the HBase storage model above, this paper designs query strategies for triple pattern queries, basic graph pattern queries and keyword queries. The triple pattern query is the foundation of the basic graph pattern query, and its response speed is determined by the storage schema and database performance. In addition, by analyzing the structure of complex basic graph pattern queries, an optimization method based on join operations is proposed. The keyword query is intended to improve the usability of the query engine and is built on the results of the basic graph pattern query work.

The effectiveness and efficiency of the proposed strategies are verified by experiments on LUBM datasets.
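
The relationship between triple pattern queries and basic graph pattern queries described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the thesis's HBase schema or join optimization, so in-memory dictionaries stand in for HBase index tables, the names `TripleStore`, `match` and `query_bgp` are hypothetical, and the "most constants first" ordering is only a simple generic heuristic, not the method proposed in the thesis.

```python
from collections import defaultdict


def is_var(term):
    """Terms starting with '?' are treated as query variables."""
    return term.startswith("?")


class TripleStore:
    """In-memory stand-in for index tables (hypothetical layout, not the thesis's schema)."""

    def __init__(self):
        self.triples = set()
        self.by_s = defaultdict(set)  # subject -> triples
        self.by_p = defaultdict(set)  # predicate -> triples
        self.by_o = defaultdict(set)  # object -> triples

    def add(self, s, p, o):
        t = (s, p, o)
        self.triples.add(t)
        self.by_s[s].add(t)
        self.by_p[p].add(t)
        self.by_o[o].add(t)

    def match(self, pattern):
        """Triple pattern query: intersect the index hits of every constant term."""
        candidates = None
        for term, index in zip(pattern, (self.by_s, self.by_p, self.by_o)):
            if not is_var(term):
                hits = index.get(term, set())
                candidates = hits if candidates is None else candidates & hits
        return self.triples if candidates is None else candidates


def bind(pattern, triple, binding):
    """Extend a variable binding so the pattern equals the triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def substitute(pattern, binding):
    """Replace already-bound variables in a pattern with their values."""
    return tuple(binding.get(t, t) for t in pattern)


def query_bgp(store, patterns):
    """Basic graph pattern query: join triple patterns on their shared variables."""
    # Assumed heuristic: evaluate patterns with fewer variables first.
    ordered = sorted(patterns, key=lambda p: sum(is_var(t) for t in p))
    bindings = [{}]
    for pattern in ordered:
        extended = []
        for b in bindings:
            for triple in store.match(substitute(pattern, b)):
                nb = bind(pattern, triple, b)
                if nb is not None:
                    extended.append(nb)
        bindings = extended
    return bindings


store = TripleStore()
store.add("alice", "rdf:type", "GraduateStudent")
store.add("alice", "takesCourse", "course1")
store.add("bob", "rdf:type", "GraduateStudent")
store.add("bob", "takesCourse", "course2")
store.add("carol", "teacherOf", "course1")

results = query_bgp(store, [
    ("?x", "rdf:type", "GraduateStudent"),
    ("?x", "takesCourse", "course1"),
])
print(results)  # [{'?x': 'alice'}]
```

Each triple pattern is answered by index lookups alone, and the basic graph pattern query is a sequence of such lookups joined on shared variables. This is why the storage schema largely determines triple pattern response time, while the order of the joins governs the cost of the more complex queries.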

Keywords/Search Tags:

RDF, storage, query, HBase, cloud platform, parallel computing

Related items

1	A Research Of Distributed Storage And Parallel Query Of Spatial Data Based On Hadoop Platform
2	Design And Implementation Ocean Information Query System Based On HBase
3	Design And Implication Of Mini-files Storage System Based On Hbase
4	Research On Fault-Tolerant Parallel Skyline Query Technology In Cloud Computing Environment
5	Research And Implementation Of Disaster Big Data Management Methods Based On Cloud Computing
6	Research And Application Of Big Data Retrieval Based On Cloud Computing
7	Research And Implementation Of Large Collections Of RDF Data Storage And Retrieval Technology On HBase
8	Research And Application Of The Storage Of Hbase
9	Research On RDF Data Storage And Query Based On HBase
10	Research And Implementation Of Marine Information Query System Based On HBase