Design And Implementation Of Data Migration System Based On Hadoop Platform

Posted on: 2021-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Gao
Full Text: PDF
GTID: 2428330602980889
Subject: Software engineering
Abstract/Summary:
With the development of information technologies such as big data and cloud computing, the volume of data generated in the production processes of various industries has grown explosively. As data accumulates, an enterprise's existing business platform faces problems such as insufficient performance and excessive resource consumption, and can no longer meet the demand for high performance and high concurrency. A big data platform offers huge storage capacity, supports complex computation over large-scale data, and enables deeper value analysis of that data. It is therefore necessary to migrate valuable historical data to the big data platform, which both relieves the production pressure on the existing business platform and opens up new business directions.

Based on the actual needs of one of the world's top 500 communications enterprises, this thesis designs and implements a data migration system that migrates data from the Teradata database to the Hadoop platform and automatically stores and archives it. According to the characteristics of structured and unstructured data, the system designs and implements two migration schemes: structured data migration based on MapReduce and unstructured data migration based on FTP. Compared with existing migration tools, the system can fulfill some specific requirements, for example cleaning data according to business logic and returning some data to Teradata. In addition, the system automates a series of processes: data extraction, data cleaning, data type conversion, data verification, data loading, and data returning. Finally, according to business logic and scheduling cycle, the data is stored in different levels of the Hadoop platform.

Testing shows that all functions of the system meet the expected design objectives and that migration performance is good, which verifies the feasibility of the migration scheme. The system has also been deployed in a large-scale communications enterprise, where the actual migration results are relatively satisfactory: it greatly eases the production pressure on the original business system, enhances the ability to analyze and process large-scale data, and reflects the research value of the data migration system.
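
The abstract names a sequence of stages (extraction, cleaning, type conversion, verification, loading) without giving details. The following is a minimal Python sketch of how such stages might be chained for structured rows. It is not the thesis's implementation: the column names, the cleaning rule, the converters, and the functions `clean`, `convert`, `migrate`, and `verify` are all hypothetical, and an in-memory list stands in for both Teradata and Hadoop.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Callable, Dict, Iterable, Optional

# Hypothetical column converters: text values exported from the source
# database are mapped to typed values before loading.
CONVERTERS: Dict[str, Callable[[str], object]] = {
    "id": int,
    "amount": Decimal,
    "created": lambda s: datetime.strptime(s, "%Y-%m-%d").date(),
}


def clean(row: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Illustrative business rule: trim whitespace, reject rows with no id."""
    trimmed = {key: value.strip() for key, value in row.items()}
    return trimmed if trimmed.get("id") else None


def convert(row: Dict[str, str]) -> Dict[str, object]:
    """Apply the per-column converter; unknown columns stay as strings."""
    return {key: CONVERTERS.get(key, str)(value) for key, value in row.items()}


@dataclass
class MigrationReport:
    extracted: int  # rows read from the source
    loaded: int     # rows handed to the loader
    rejected: int   # rows dropped by the cleaning rule


def migrate(source_rows: Iterable[Dict[str, str]],
            load: Callable[[Dict[str, object]], None]) -> MigrationReport:
    """Run extract -> clean -> convert -> load over every source row."""
    extracted = loaded = rejected = 0
    for raw in source_rows:
        extracted += 1
        cleaned = clean(raw)
        if cleaned is None:
            rejected += 1
            continue
        load(convert(cleaned))
        loaded += 1
    return MigrationReport(extracted, loaded, rejected)


def verify(report: MigrationReport, target_row_count: int) -> bool:
    """Row-count check: the target must hold every row that was not rejected."""
    return report.extracted - report.rejected == target_row_count


if __name__ == "__main__":
    target = []  # stand-in for the destination table
    rows = [
        {"id": " 1 ", "amount": "9.50", "created": "2021-05-18"},
        {"id": "", "amount": "1.00", "created": "2021-01-01"},
    ]
    report = migrate(rows, target.append)
    print(report, verify(report, len(target)))
```

A row-count comparison is only one possible verification step; a real system would likely add per-column checksums or sampling, and the step that returns data to Teradata is omitted here.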
Keywords/Search Tags: Hadoop, Big Data Platform, MapReduce, Hive, Data Migration