
Research And Implementation Of Load Balancing Algorithm For Offline Data Migration

Posted on: 2016-10-18  Degree: Master  Type: Thesis
Country: China  Candidate: Z Wang  Full Text: PDF
GTID: 2348330512470869  Subject: Software engineering
Abstract/Summary:
In recent years, with the development of cloud computing, the Internet of Things, and social networks, the data produced by people has been growing at an unprecedented speed, ushering in the big data era. In this era, a company's competitiveness depends largely on how efficiently it can mine knowledge from data. In the search business in particular, mining valuable rules from massive data can improve users' search experience, increase the traffic conversion rate, and even indicate the direction in which the business should develop. As the minable value of data keeps increasing, the search offline business faces great challenges. Massive data migration has become unavoidable in data-centric businesses such as search offline, and it largely determines the quality of that business. It is therefore necessary to design an efficient and scalable data migration system. This thesis studies a load-balancing approach to search offline data migration. First, it presents the data migration model and proposes optimization goals based on an analysis of the factors that affect migration performance. Then it describes the system design and, guided by the optimization goals, studies optimization at two levels. At the data source level, it proposes an approach called LBS (load balancing sharding) to reload the data onto a distributed system, which guarantees a balanced distribution of data and meets the scalability requirement. At the migration job level, it proposes an algorithm called Astraea to schedule jobs appropriately, avoiding data source hotspots and thus improving migration performance. Finally, the thesis verifies the effectiveness of the proposed optimizations through experiments. The experimental results show that LBS can effectively load the data onto a scalable distributed system, providing a basis for highly concurrent data migration while guaranteeing load balance, and that Astraea can schedule jobs appropriately to avoid query hotspots, thus improving migration performance.
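
The abstract does not give the internals of LBS or Astraea, so the following is only a minimal sketch of the two general ideas it names: spreading records evenly over shards, and scheduling migration jobs so that no single shard becomes a hotspot. The functions `shard_of` and `schedule_rounds`, the hash-based placement, and the `slots` and `per_shard_cap` parameters are all illustrative assumptions, not the thesis's actual method.

```python
import hashlib
from collections import Counter, deque


def shard_of(key: str, num_shards: int) -> int:
    """Map a record key to a shard index with a stable hash.

    Assumption: plain hash-modulo placement, used here only to
    illustrate balanced distribution. It is not the LBS approach.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def schedule_rounds(jobs, slots, per_shard_cap):
    """Group jobs into rounds that avoid shard hotspots.

    jobs: iterable of (job_id, shard_id) pairs in submission order.
    slots: maximum number of jobs that run concurrently in one round.
    per_shard_cap: maximum number of jobs reading one shard per round.

    Assumption: a simple greedy round-based scheduler, standing in
    for the idea of hotspot avoidance. It is not the Astraea algorithm.
    """
    if slots < 1 or per_shard_cap < 1:
        raise ValueError("slots and per_shard_cap must be at least 1")
    pending = deque(jobs)
    rounds = []
    while pending:
        load = Counter()            # jobs per shard in the current round
        current = []                # job ids admitted to this round
        deferred = deque()          # jobs pushed to a later round
        while pending:
            job_id, shard = pending.popleft()
            if len(current) < slots and load[shard] < per_shard_cap:
                load[shard] += 1
                current.append(job_id)
            else:
                deferred.append((job_id, shard))
        rounds.append(current)
        pending = deferred
    return rounds


if __name__ == "__main__":
    demo_jobs = [("j1", "s0"), ("j2", "s0"), ("j3", "s0"),
                 ("j4", "s1"), ("j5", "s2")]
    for i, batch in enumerate(schedule_rounds(demo_jobs, 3, 1), start=1):
        print(f"round {i}: {batch}")
```

In this sketch, three jobs that target the same shard are spread over three rounds, while jobs on other shards fill the remaining slots of the first round. A real scheduler would also have to account for job size, shard capacity, and cluster resource limits.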
Keywords/Search Tags: Load Balancing, Data Migration, Horizontal Scale, Hadoop YARN