
Research On Migration Methods Of Storage Virtualization For Industrial Big Data

Posted on: 2017-06-04  Degree: Master  Type: Thesis
Country: China  Candidate: J H Yuan
GTID: 2348330491457958  Subject: Computer application technology
Abstract/Summary:
For nearly half a century, under the influence of Moore's Law, information technology has enjoyed unprecedented prosperity, and Internet technology has seen great innovation. According to an IDC prediction, the amount of data generated per year currently reaches up to 8 ZB. By 2020, this figure will gradually increase to 40 ZB, which indicates that the era of big data has arrived. As big data breaks through the constraints facing industrial enterprises, it will create value several times higher than the current value within the industry chain and even across industry chains. McKinsey predicted, for seven major areas such as transportation and finance, that the economic value created by data is up to four trillion dollars per year. Some experts even predict that when data circulates openly and becomes the major factor of production, human society will truly enter the era of the industrial Internet.

Under normal conditions, an enterprise data information system contains a number of different business systems, and each business system includes its own online business systems and archiving and backup systems. To control enterprise cost, the data storage system migrates the data of the online business platform to the back-end big data platform. However, the data migration process is extremely complex, and many problems need to be addressed. This thesis studies two of them. First, the efficiency of data migration needs to be improved when migrating online data to the big data platform. Second, after the data has been migrated to the big data platform, the overhead of dynamic data migration between nodes needs to be reduced. For these two problems, we propose a data migration method based on a task scheduling mechanism and a migration-cost-sensitive data migration method. Details are as follows.

First, this thesis introduces several key technologies in detail, such as the distributed architecture of MapReduce, HBase, the HDFS distributed file system, and key/value storage systems. It then studies the basic principles of the PSO algorithm and the ABC algorithm. For the requirement that online data be migrated to the big data platform, this thesis proposes a data migration method based on a task scheduling mechanism. To verify and analyze this method, we used the Hadoop architecture and compared the results with Hadoop's default FIFO scheduling mechanism.

Second, for big data storage systems, data migration is the key technology for realizing dynamic expansion and elastic load balancing between nodes. Reducing the cost of migration is a major problem for providers to solve. Most existing data migration methods target non-virtualized environments and are often not applicable to big data storage systems. To solve this problem, we place the data migration issue in a load balancing scenario, adopt an area-based migration cost model, and propose a data migration method that reduces system overhead. Following the implementation process of the data migration strategies, we use the Hadoop platform to test the effectiveness of our data migration methods, and we evaluate and analyze them by comparison with other methods and systems.
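The abstract does not specify the migration cost model or the balancing procedure, so the following is only an illustrative sketch of what cost-sensitive migration for load balancing can look like, not the thesis's method. It assumes a hypothetical setup in which each data block has a load contribution and a size, uses block size as a stand-in for migration cost, and greedily moves the block with the lowest cost per unit of load from the most loaded node to the least loaded one. All names (`Block`, `Node`, `plan_migrations`, `tolerance`) are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    block_id: str
    load: float  # load this block places on its node (hypothetical unit)
    size: float  # data to transfer; assumed proxy for migration cost


@dataclass
class Node:
    name: str
    blocks: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return sum(b.load for b in self.blocks)


def plan_migrations(nodes, tolerance=0.1):
    """Greedy, cost-sensitive rebalancing (illustrative assumption only).

    Repeatedly takes the most loaded node and, while it exceeds the mean
    load by more than `tolerance`, moves the block with the smallest
    size-per-load ratio to the least loaded node.
    Returns a list of (block_id, source, target, cost) tuples.
    """
    plan = []
    mean = sum(n.load for n in nodes) / len(nodes)
    limit = mean * (1 + tolerance)
    while True:
        hot = max(nodes, key=lambda n: n.load)
        cold = min(nodes, key=lambda n: n.load)
        if hot.load <= limit:
            break  # balanced within tolerance
        gap = hot.load - cold.load
        # Only consider moves that strictly narrow the gap, so the
        # imbalance always decreases and the loop terminates.
        candidates = [b for b in hot.blocks if 0 < b.load < gap]
        if not candidates:
            break  # no move can improve the balance
        best = min(candidates, key=lambda b: b.size / b.load)
        hot.blocks.remove(best)
        cold.blocks.append(best)
        plan.append((best.block_id, hot.name, cold.name, best.size))
    return plan
```

Ranking candidates by size per unit of load is one simple way to make the choice cost-sensitive: it prefers blocks that relieve the most load for the least data transferred. A real system would also need to account for network bandwidth, replica placement, and the cost model the thesis actually defines.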
Keywords/Search Tags: Hadoop, Task schedule, Data migration, Load balancing, Migration cost