MDE-Based Approach For MapReduce Big Data Transformation Software Development

Posted on:2019-04-20

Degree:Master

Type:Thesis

Country:China

Candidate:B J Liu

Full Text:PDF

GTID:2348330545977462

Subject:Computer technology

Abstract/Summary:

In recent years, research into and use of big data processing have become an important force for industrial upgrading and the rise of new industries. Network data has grown exponentially, and data processing at the GB, TB, or even PB level has become commonplace. These huge volumes of data are stored across various sectors of society and reside in different data environments and platforms, which results in a large amount of heterogeneous data. The widespread presence of heterogeneous data has severely hindered data exchange. At the same time, big data processing platforms represented by Hadoop and Spark have emerged. However, each of these platforms has its own programming model and platform-specific details, and using them can even require knowledge of specific programming languages. The learning threshold is high and the learning curve is steep.

Given this background, it is desirable to design a method that shields platform details and resolves data heterogeneity, so that users can develop big data processing programs without deep knowledge of the platforms and without having to deal with data heterogeneity. This thesis proposes a model-driven software development method for MapReduce big data transformation. The method abstracts the massive data into models, describes the data transformation process as a model transformation, and uses code generation technology to generate Hadoop and Spark data processing programs. With this method, users can develop big data programs and process massive heterogeneous data without understanding the big data platforms or considering data heterogeneity.

We chose Ecore, a classic model format, as the model representation of the source and target data, and the QVT-R standard published by the OMG as the description language for the model transformation. We then introduce a platform-independent imperative description language called Midcore as the intermediate language between QVT-R and the big data processing code. The Ecore and QVT-R descriptions are translated into corresponding Midcore descriptions, from which both Hadoop and Spark code can be generated. The method is extensible: Midcore can be mapped to code for other big data platforms.

Based on this methodology, we implemented the tool QE2HS, which automatically generates Hadoop and Spark code from Ecore and QVT-R descriptions. This thesis also carries out case studies of the method and the QE2HS tool. The case studies show that the method generates large-scale data processing code accurately, shields platform details and data heterogeneity, and reduces the complexity of writing big data programs. In addition, the execution efficiency of the generated code is acceptable.
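The idea of one platform-independent description feeding several code generators can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: `TransformSpec`, `FieldMapping`, `emit_spark`, and `emit_hadoop` are hypothetical names, the sketch does not reproduce Midcore's actual syntax or the QE2HS implementation, and it handles only a simple field-renaming transformation over comma-separated records. The emitters return source text as strings; nothing is executed on Hadoop or Spark.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldMapping:
    """Hypothetical rule: copy one source attribute to one target attribute."""
    source: str
    target: str


@dataclass
class TransformSpec:
    """Hypothetical platform-independent description of a transformation.

    This stands in for an intermediate description; it is NOT Midcore.
    """
    source_class: str
    target_class: str
    source_fields: List[str]      # column order of the source records
    mappings: List[FieldMapping]  # which source field feeds which target field


def emit_spark(spec: TransformSpec) -> str:
    """Render the spec as one line of PySpark DataFrame code (text only)."""
    cols = ", ".join(
        f'col("{m.source}").alias("{m.target}")' for m in spec.mappings
    )
    src = spec.source_class.lower()
    tgt = spec.target_class.lower()
    return f"{tgt}_df = {src}_df.select({cols})"


def emit_hadoop(spec: TransformSpec) -> str:
    """Render the spec as a Hadoop Mapper.map method in Java (text only)."""
    # Resolve each mapped source field to its position in the CSV record.
    index = {name: i for i, name in enumerate(spec.source_fields)}
    picks = ' + "," + '.join(f"in[{index[m.source]}]" for m in spec.mappings)
    return "\n".join([
        f"// {spec.source_class} -> {spec.target_class}",
        "public void map(LongWritable key, Text value, Context context)",
        "        throws IOException, InterruptedException {",
        '    String[] in = value.toString().split(",");',
        f"    context.write(NullWritable.get(), new Text({picks}));",
        "}",
    ])


# Illustrative spec: rename two attributes of a made-up Person record.
example = TransformSpec(
    source_class="Person",
    target_class="Customer",
    source_fields=["id", "name", "age"],
    mappings=[
        FieldMapping("name", "fullName"),
        FieldMapping("id", "customerId"),
    ],
)

if __name__ == "__main__":
    print(emit_spark(example))
    print(emit_hadoop(example))
```

Because both emitters read the same `TransformSpec`, supporting another platform in this sketch would mean adding one more emitter function without touching the description itself, which mirrors the extensibility claim in the abstract.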

Keywords/Search Tags:

Model Transformation, MapReduce, Hadoop, Spark, Code Generation, MDE

Related items

1	MapReduce Development Method For Data Transformation Based On Model Transformation
2	Research On The Implementation Of Bursty Events Detection Based On Spark
3	Research On Transformation Rule Modeling And Rule Code Generation
4	Research On Task Scheduling Algorithm Under MapReduce Framework
5	Research And Application Of ETL Code Generation Approach Based On Model Transformation
6	Research On The Performance And Optimization Of MapReduce Model In Hadoop Platform
7	The Mapreduce Model In The Hadoop Implementation Of Performance Analysis And Optimization Improvements
8	The Source Code Analysis And Performance Improvement Of MapReduce
9	Research On MapReduce Model For Fusion Architecture And Accelerated Strategy For Hadoop
10	Evolution Of Application Software Based On Hadoop Platform Method And Technology Research