Research And Implementation Of Energy Efficiency Scheduling Based On DVFS In Spark On YARN

Posted on: 2020-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: E J Ma
GTID: 2428330590971755
Subject: Computer technology
Abstract/Summary:
The explosive growth of digital information has pushed the Internet into the era of Big Data. With the increasing demand for large-scale data computing, the scale of cloud computing clusters has expanded dramatically. As a result, the energy consumption of big data computing platforms has become increasingly prominent: it raises costs for cloud service providers and causes irreversible damage to the natural environment. The tension between reducing energy consumption and satisfying the user service level agreement (SLA) has become a pressing research problem. This thesis addresses it in three parts.

1. The thesis designs and implements an energy-saving scheduling system based on DVFS and builds a frequency-based CPU energy consumption model. The system extends the native Spark on YARN architecture with three modules: a state monitoring module that collects state information while an application runs, an energy consumption evaluation module that quantitatively analyzes the application's energy consumption, and a frequency adjustment module that uses DVFS to adjust the CPU frequency dynamically. This system provides the platform for the subsequent research.

2. The thesis designs a frequency-aware, YARN-layer energy-saving strategy based on DVFS. To account for the diversity of applications, three benchmark applications are selected and tested, and their computing and energy performance is analyzed at different frequencies in order to find, for each one, the frequency with the lowest energy consumption that still meets its SLA. An unknown target application is clustered with the benchmark applications using the K-Means algorithm to locate its most similar benchmark application, and the processor frequency is then set in advance via DVFS according to that benchmark. The aim is to save energy while preserving computational efficiency.

3. Finally, the thesis addresses the loss of efficiency in the YARN-layer energy-saving strategy that data skew causes when the data size is large. It proposes a Spark-layer scheduling algorithm and, with it, a dual-layer Frequency-Aware Energy-Saving Strategy based on DVFS (FAESS-DVFS2.0). Drawing on the characteristics of the Shuffle mechanism, the strategy uses DVFS to dynamically adjust the CPU frequency of the nodes involved at each point in the Stage life cycle, which reduces node idle time and energy consumption while tasks still complete on time. In addition, the DAG is used to calculate the weight of each stage, and stages with higher weights are assigned to compute nodes with better performance. This reduces the idle time of each node and further improves the energy-saving effect while still meeting the SLA.
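
The selection of a minimum-energy frequency under an SLA, as described in part 2, can be illustrated with a small sketch. It is not the thesis's model: the cubic power model, the constants `p_static_w` and `k`, the `Profile` fields, and the function names are all assumptions made for illustration, and a real system would use measured power and runtime data for each benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical benchmark profile measured at a reference frequency."""
    cpu_seconds_at_ref: float  # CPU-bound time at f_ref_ghz
    other_seconds: float       # time assumed not to scale with frequency (I/O, shuffle)
    f_ref_ghz: float


def runtime(p: Profile, f_ghz: float) -> float:
    # Assumption: CPU-bound time scales inversely with frequency.
    return p.cpu_seconds_at_ref * p.f_ref_ghz / f_ghz + p.other_seconds


def power(f_ghz: float, p_static_w: float = 20.0, k: float = 5.0) -> float:
    # Assumption: static power plus a dynamic term proportional to f^3,
    # a common simplification when voltage scales with frequency.
    return p_static_w + k * f_ghz ** 3


def energy(p: Profile, f_ghz: float) -> float:
    # Energy in joules = average power (W) * runtime (s).
    return power(f_ghz) * runtime(p, f_ghz)


def pick_frequency(p: Profile, freqs_ghz: list[float], sla_seconds: float) -> float:
    """Return the lowest-energy frequency whose predicted runtime meets the SLA."""
    feasible = [f for f in freqs_ghz if runtime(p, f) <= sla_seconds]
    if not feasible:
        # No frequency meets the SLA: fall back to the fastest one.
        return max(freqs_ghz)
    return min(feasible, key=lambda f: energy(p, f))


if __name__ == "__main__":
    profile = Profile(cpu_seconds_at_ref=100.0, other_seconds=20.0, f_ref_ghz=2.0)
    available = [1.0, 1.5, 2.0, 2.5]
    for sla in (300.0, 160.0, 50.0):
        print(sla, pick_frequency(profile, available, sla))
```

Under these assumed numbers, a looser SLA lets the scheduler choose a lower frequency, while an SLA that no frequency can meet falls back to the highest one. For an unknown application, the same lookup would be run against the profile of its nearest benchmark.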
Keywords/Search Tags:DVFS, Spark on YARN, Energy Saving Scheduling, Big Data