
MapReduce Parallel Refactoring Method for Legacy Code

Posted on: 2022-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: R Feng
Full Text: PDF
GTID: 2518306731992639
Subject: Computer Science and Technology
Abstract/Summary:
As computer technology and user demands evolve, existing software systems eventually become legacy systems. Because these systems embody a great deal of domain knowledge and key resources, simply abandoning or redeveloping them wastes significant resources. Cloud computing offers parallelism, high reliability, and on-demand resource allocation, which attracts organizations to migrate legacy systems to cloud platforms in order to reuse them and improve business efficiency. MapReduce is a programming model for parallel processing of large-scale data in cloud computing, so automatically converting parallelizable legacy code into MapReduce programs is a worthwhile research problem. Several solutions for refactoring legacy code exist, but they mainly target code with simple business logic and suffer from several problems: they cannot guarantee the semantic equivalence of the code during refactoring, the refactored code has a low degree of parallelism, and the refactoring schemes are insufficiently automated. In response to these problems, this thesis proposes a parallel refactoring scheme for sequential code based on program intermediate code and the MapReduce programming model, which ensures the semantic equivalence of the code during refactoring while achieving the maximum degree of parallelism. The specific contents are as follows:

To ensure the semantic equivalence of the code during refactoring, this thesis performs the refactoring on intermediate code and proposes an intermediate code generation method based on GSA (Gated Single Assignment). In this method, the code to be refactored is first converted into GSA form by a program analysis framework; the GSA form is an equivalent representation of the source program and contains all of its semantic information. The GSA form is then converted into intermediate code through defined conversion rules. This method not only preserves the semantics of the source program during refactoring, but also accounts for the role of loops in parallel refactoring by preserving the loop structure. This allows the refactoring phase to take full advantage of the loop information and improve the parallelism of the code.

To maximize the parallelism of the code, this thesis defines a set of parallel refactoring rules based on the parallel operations in the MapReduce programming model. These rules implement the refactoring from the intermediate code to an executable program under the MapReduce model. During parallel refactoring, the corresponding rules are applied to different code patterns to achieve maximum code parallelism. MapReduce has a variety of efficient implementations; in this thesis, the generated MapReduce program is mapped onto the Spark platform through custom mapping rules and code templates, so that the program runs in a distributed manner.

To automate the refactoring scheme, a tool is designed and developed based on the scheme above, and its effectiveness is evaluated experimentally using the benchmark programs built by Stanford University. The experimental results show that the tool can effectively refactor serial Java code into parallel programs under the MapReduce model, and that the refactored code can improve the execution efficiency of the original business code.
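
To make the kind of transformation discussed in the abstract concrete, the following is a minimal sketch of a serial Java loop next to an equivalent map/reduce formulation. It is an illustration only, not output of the thesis's tool: the class and method names (SumOfSquares, sequential, mapReduce) are hypothetical, and Java parallel streams stand in for the Spark platform so that the example runs without a cluster.

```java
import java.util.Arrays;
import java.util.List;

public class SumOfSquares {

    // Sequential "legacy" form: a loop that accumulates into one variable.
    static int sequential(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v * v;
        }
        return total;
    }

    // MapReduce-style form (illustrative assumption, not the thesis's rules):
    // the map step squares each element independently, and the reduce step
    // combines the partial results with an associative operation (addition).
    static int mapReduce(List<Integer> values) {
        return values.parallelStream()
                .map(v -> v * v)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(sequential(values)); // prints 30
        System.out.println(mapReduce(values));  // prints 30
    }
}
```

The two methods return the same result because each loop iteration depends only on its own element, and addition is associative, so the partial sums can be combined in any grouping. A loop whose iterations depend on one another in a non-associative way would not admit this rewrite directly, which is why preserving semantic equivalence is a central concern in this setting.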
Keywords/Search Tags:legacy code, MapReduce model, code refactoring, cloud computing, parallel