Research On Data Placement Strategy And Skyline Query In Cloud Environment

Posted on:2015-11-20

Degree:Master

Type:Thesis

Country:China

Candidate:H S Jiang

Full Text:PDF

GTID:2298330467956856

Subject:Computer technology

Abstract/Summary:

As information technology spreads through scientific applications, social studies, commerce, and daily life, efficient storage and querying of massive data are becoming more and more important. In climate science, manned spaceflight, high-energy physics, the life sciences, and other areas of scientific research, as well as in business computing fields such as Web applications and social networks, the reasonable placement and real-time querying of massive data have become key problems to be solved.

The development of computer hardware is limited by the nature of the underlying materials, which creates a bottleneck for the computing and storage capacity of individual computers. Distributed and parallel computing has therefore become an important way to process huge amounts of data, and cloud computing, a new computing model, has emerged. Cloud computing has been hailed as a "revolutionary" computing model. Cloud computing centers distributed across the Internet offer high-speed, secure data transfer. With large-scale distributed clusters as its main body, cloud computing uses virtualization technology to combine storage and computing resources located in different geographical positions into a virtual resource pool, and provides the ability to store, analyse, and process huge amounts of data.

In a cloud computing environment, the massive data needed by all kinds of applications is stored in different data centers. For these applications, how to efficiently access and query data distributed across different data centers is a key problem for ensuring system performance. A reasonable placement strategy and an efficient query algorithm are therefore vital for reducing the movement of data sets across different data centers.

The Skyline query is an important type of query that is widely applied in multiple-criteria decision making, data visualization, navigation systems, geographic information systems, and so on. With the explosive growth of data, the cloud computing platform is becoming an effective way to process Skyline queries over large data. Because the Skyline result set is far smaller than the original input, effectively filtering the initial input data and reducing data transmission across different data centers affect not only the global execution speed but also the computation cost.

Addressing the two problems of data placement and Skyline query under the cloud computing model, this thesis mainly completes the following work:

(1) A two-phase data placement strategy for the cloud environment is proposed. Specifically, the existing notion of data dependency is extended by defining a dual dependency between data and applications. At the same time, the bandwidth of each data center and load balance are taken into account. Extensive experiments are conducted, and the experimental results demonstrate the effectiveness of the proposed methods.

(2) A grid-based Skyline query processing algorithm is proposed. Specifically, an algorithm based on MapReduce is first proposed, and then an optimized version of SQBDFG is further presented. Both use grid division and the relationships between grid cells to filter data quickly and thereby reduce transmission overhead. Experiments in a Hadoop environment verify that the proposed algorithm performs very well.
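The idea of using relationships between grid cells to filter input before computing the Skyline can be illustrated with a minimal single-machine sketch. This is not the thesis's MapReduce algorithm or SQBDFG. It assumes smaller values are better in every dimension, and the names dominates, skyline, grid_skyline, and cell_size are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Naive Skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def grid_skyline(points, cell_size):
    """Skyline with a grid pre-filter (illustrative only)."""
    # Assign each point to a grid cell identified by integer indices.
    cells = defaultdict(list)
    for p in points:
        cells[tuple(int(x // cell_size) for x in p)].append(p)

    # A cell c can be discarded if some non-empty cell d has a strictly
    # smaller index in every dimension: every point in d then dominates
    # every point in c, so c cannot contribute to the Skyline.
    keys = list(cells)
    surviving = [
        c for c in keys
        if not any(all(di < ci for di, ci in zip(d, c)) for d in keys)
    ]

    # Only points in surviving cells need the exact dominance test.
    candidates = [p for c in surviving for p in cells[c]]
    return skyline(candidates)
```

In a distributed setting, discarding whole cells in this way would shrink the data that has to be sent between nodes before the exact dominance tests run, which is the kind of transmission saving the abstract describes.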

Keywords/Search Tags:

cloud computing, Hadoop, data placement, Skyline query

Related items

1	Research On Fault-Tolerant Parallel Skyline Query Technology In Cloud Computing Environment
2	Skyline Query Research For Massive RDF Data Under Distributed Computing Environments
3	Research And Implementation Of Skyline Query Algorithm In LBSN Environment
4	Research On Fault-tolerant Parallel Skyline Query Technology Over Uncertain Data Streams In Cloud Computing Environment
5	Research On Skyline Query Processing Techniques
6	Research On Distributed Data Query Based On Hadoop
7	Research Of Skyline Query Processing Technology On Uncertain Data Stream
8	Research On Parallel Skyline Algorithms And Their Applications In Cloud Computing Environment
9	Research On Privacy Protection Of Skyline Query For Mobile Cloud Location Service
10	Research On Top-k Skyline Query In Multiple Environments