
Research On Virtual Machine Migration Algorithm Based On Ant Colony Algorithm Under Data Skew Condition

Posted on: 2018-11-04  Degree: Master  Type: Thesis
Country: China  Candidate: L T Xiong
GTID: 2348330542959896  Subject: Computer technology
Abstract/Summary:
Cloud computing is a new computing paradigm. Services (such as computing services, storage services, and application services) can be acquired on demand, quickly, at any time and from anywhere over the Internet. At the same time, cloud computing allows tasks to be distributed across clusters made up of many servers. Cloud computing has already been applied in scientific computing, virtual environments, biological information, energy, and other areas. With the rapid development of the Internet, the volume of data transmitted over networks and held in data centers has grown geometrically. Conventional storage approaches suffer from high cost, poor scalability, and complicated design, and adding maintenance engineers cannot keep pace with the rapidly growing data sizes. MapReduce is a candidate solution for large-scale data processing: it uses the principle of locality to divide the problem first, and it has been applied in many fields. However, its disadvantages appear when the input data is strongly skewed. Skew leads to a non-uniform distribution of key values, which increases the network traffic between Map tasks and Reduce tasks and takes up a lot of bandwidth, which in turn prolongs execution time and degrades system performance. This thesis focuses on how to migrate virtual machines effectively so as to minimize the network transmission of intermediate data when skewed data exists. The main contributions are:

(1) An improved reservoir sampling algorithm is used to process the MapReduce intermediate data and estimate the frequency of occurrence of each key value. The sampling is carried out as an independent MapReduce application, and a theoretical justification is included in the discussion. The reservoir sampling algorithm can effectively estimate the key-value distribution of the original input data.

(2) Based on the statistical frequency distribution of all intermediate key values, an ant colony algorithm is used to place the virtual machines, which yields an optimized mapping between virtual machines and physical machines. As a result, large, long-distance data transfers between virtual machines can be brought onto a single physical machine through migration.

(3) OpenStack and Hadoop platforms were set up and tested. In the experiments, when the input data was large and highly skewed, the ant colony algorithm driven by the sampled data distribution achieved a clear improvement over both the random migration strategy and the non-migration strategy. That is, traffic exchange in the rack was smaller, the degree of data localization was greater, and execution time was better optimized.
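The sampling step in contribution (1) can be illustrated with a minimal sketch. It uses the classic, unmodified reservoir sampling scheme (Algorithm R), not the improved variant developed in the thesis, whose details are not given in this abstract. The function names `reservoir_sample` and `estimate_key_frequencies`, the reservoir size `k`, and the toy key stream are all assumptions made for illustration.

```python
import random
from collections import Counter


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    Classic Algorithm R: a single pass, O(k) memory, and the total stream
    length does not need to be known in advance.
    """
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def estimate_key_frequencies(sample):
    """Estimate the relative frequency of each key from a sample of keys."""
    total = len(sample)
    if total == 0:
        return {}
    counts = Counter(sample)
    return {key: count / total for key, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical skewed stream of intermediate keys: "hot" dominates.
    keys = ["hot"] * 9000 + ["warm"] * 900 + ["cold"] * 100
    sample = reservoir_sample(keys, k=500, rng=random.Random(42))
    for key, freq in sorted(estimate_key_frequencies(sample).items()):
        print(f"{key}: {freq:.3f}")
```

Because every key in the stream ends up in the reservoir with equal probability, the estimated frequencies approximate the true key distribution, and a heavily skewed key stands out in the sample. A placement step such as the ant colony algorithm in contribution (2) could then take these estimates as input.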
Keywords/Search Tags:Cloud computing, Hadoop, Data skew, Sampling, Ant colony algorithm, Virtual machine migration