Research On Legacy Code Refactoring Based On MapReduce Programming Model

Posted on: 2020-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: W M Wang
Full Text: PDF
GTID: 2428330596992650
Subject: Software engineering

Abstract/Summary:
With the advent of the big data era, the response speed of traditional legacy systems can no longer meet users' requirements. However, legacy systems carry a large amount of domain knowledge and critical resources, so simply discarding and redeveloping them would waste substantial resources. As a business computing model, cloud computing offers virtualization, elastic scaling, and on-demand service, which attracts many organizations to migrate legacy systems to cloud platforms in order to reuse those systems and improve big data processing performance. MapReduce is an effective programming model for processing big data in parallel in cloud computing, so automatically mapping parallelizable legacy code onto the MapReduce model is meaningful work. At present, there is little research on refactoring code from a programming language to MapReduce, and the existing refactoring approaches and tools are not mature enough. To refactor Java legacy code effectively during cloud migration, this thesis proposes an approach for refactoring specific sequential code into the MapReduce model, so that the refactored MapReduce code processes large data sets more efficiently.

The work of this thesis mainly includes the following parts. First, the data processing types of the loops that can be refactored are classified and determined. The legacy code involved in big data processing is divided into four types according to its business logic; a string matching algorithm is used to calculate the similarity between abstract syntax tree sequences, and the data processing type of a parallelizable loop is determined by the maximum similarity value obtained. Then, a corresponding refactoring algorithm is proposed for each type to guide the refactoring process. The refactoring procedure consists of two parts: determining the type of each statement in the original loop code by analyzing the abstract syntax tree of the parallelizable loop, and refactoring each statement into the corresponding part of the MapReduce code template according to the refactoring algorithm. Finally, a refactoring support tool is developed based on the proposed approach. The tool implements four functions: location of parallelizable loops, conversion of abstract syntax trees, determination of the data processing type of parallelizable code, and refactoring of legacy code.

The experimental results indicate that the proposed refactoring approach is effective: it correctly refactors the sequential code into MapReduce code, and the target code performs better than the sequential code when processing large data sets. The approach helps to reuse legacy system resources and to improve the processing efficiency of big data business.
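
As an illustration of the type-determination step described above, the following Java sketch scores a loop's abstract syntax tree node sequence against one template sequence per data processing type and selects the type with the maximum similarity. It is a minimal sketch under stated assumptions, not the implementation from the thesis: the abstract names neither the string matching algorithm nor the four types, so the longest-common-subsequence measure, the labels `TYPE_1` to `TYPE_4`, the node names, and the class `LoopTypeClassifier` are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: classifies a loop by comparing its AST node sequence
// with one template sequence per data processing type.
class LoopTypeClassifier {

    // Length of the longest common subsequence of two node sequences
    // (assumed stand-in for the unnamed string matching algorithm).
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    // Similarity in [0, 1]: LCS length divided by the longer sequence length.
    static double similarity(List<String> a, List<String> b) {
        int longer = Math.max(a.size(), b.size());
        return longer == 0 ? 1.0 : (double) lcsLength(a, b) / longer;
    }

    // Returns the template name with the maximum similarity to the loop
    // (the first one wins on ties), or null if there are no templates.
    static String classify(List<String> loop, Map<String, List<String>> templates) {
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, List<String>> entry : templates.entrySet()) {
            double score = similarity(loop, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Placeholder templates; the real four types and their AST sequences
    // are not given in the abstract.
    static Map<String, List<String>> exampleTemplates() {
        Map<String, List<String>> templates = new LinkedHashMap<>();
        templates.put("TYPE_1", Arrays.asList("ForEach", "If", "MethodCall"));
        templates.put("TYPE_2", Arrays.asList("ForEach", "Assign", "BinaryOp"));
        templates.put("TYPE_3", Arrays.asList("ForEach", "MethodCall", "MethodCall"));
        templates.put("TYPE_4", Arrays.asList("For", "ArrayAccess", "Assign"));
        return templates;
    }

    public static void main(String[] args) {
        List<String> loop = Arrays.asList("ForEach", "VarDecl", "Assign", "BinaryOp");
        System.out.println(classify(loop, exampleTemplates()));
    }
}
```

With these placeholder templates, the example loop shares three nodes in order with `TYPE_2` (similarity 0.75) and only one node with each of the others (similarity 0.25), so `TYPE_2` is selected.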
Keywords/Search Tags: legacy system, MapReduce, code refactoring, refactoring algorithm, abstract syntax tree