
Research On Task Scheduling Mechanism For Big Data Analysis

Posted on: 2019-01-15 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Q Wang | Full Text: PDF
GTID: 2428330611993634 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information and Internet technologies, the volume of data is growing explosively, so mining and analyzing data to extract more value is becoming increasingly necessary. This thesis focuses on the characteristics of big data mining and analysis tasks and explores a time consumption prediction model and a task scheduling mechanism for such tasks. The main work is as follows:

(1) A time consumption prediction model is established for big data analysis tasks. Because big data mining and analysis tasks are computationally expensive, determining the optimal resource configuration of a task by running every configuration one by one is extremely costly. This thesis proposes a regression-based time consumption prediction model that predicts the execution time of big data mining and analysis tasks under different resource configurations. First, a Spark cluster environment is built and multiple big data sets are run on it to measure actual execution behavior. Then, a data description set is designed to determine the valid data feature items. Finally, five different regression models are trained and used for prediction. The model that performs best in the experiments is selected as the prediction model, which provides the basis for the research on the task scheduling mechanism.

(2) A task scheduling mechanism is designed for big data analysis. This thesis surveys the existing task scheduling algorithms in depth. In contrast to traditional task scheduling, it divides the scheduling of big data mining and analysis tasks into three levels. It also proposes a task scheduling mechanism based on a greedy-genetic approach. With the goal of minimizing the total execution time of the tasks, five different task scheduling strategies are proposed for the three levels of task scheduling, and their performance is compared through simulation experiments.
Keywords/Search Tags:Data Analysis, Task Scheduling, Greedy Genetic, Regression Model
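
The abstract does not describe how the greedy and genetic parts are combined, so the following is only a minimal sketch of one common pattern, not the thesis's actual mechanism: a greedy longest-task-first assignment seeds a small genetic search that tries to reduce the total execution time (makespan). The function names, the worker model, the parameter values, and the sample predicted times are all assumptions made for illustration.

```python
import heapq
import random


def makespan(assignment, times, n_workers):
    """Total execution time: the load of the busiest worker."""
    loads = [0.0] * n_workers
    for task, worker in enumerate(assignment):
        loads[worker] += times[task]
    return max(loads)


def greedy_schedule(times, n_workers):
    """Greedy step: longest task first, each to the least-loaded worker."""
    heap = [(0.0, w) for w in range(n_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = [0] * len(times)
    for task in sorted(range(len(times)), key=lambda t: -times[t]):
        load, worker = heapq.heappop(heap)
        assignment[task] = worker
        heapq.heappush(heap, (load + times[task], worker))
    return assignment


def genetic_refine(times, n_workers, seed_assignment,
                   generations=200, pop_size=30, mutation_rate=0.1,
                   rng_seed=0):
    """Genetic step (hypothetical settings): evolve task-to-worker vectors.

    Assumes at least two tasks and pop_size >= 4. The best half of each
    generation survives unchanged, so the result is never worse than the
    greedy seed.
    """
    rng = random.Random(rng_seed)
    n = len(times)
    population = [list(seed_assignment)]
    while len(population) < pop_size:
        population.append([rng.randrange(n_workers) for _ in range(n)])

    def fitness(candidate):
        return makespan(candidate, times, n_workers)

    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            for i in range(n):  # random reassignment as mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_workers)
            children.append(child)
        population = survivors + children
    return min(population, key=fitness)


# Illustrative predicted execution times (made-up values, in seconds),
# standing in for the output of a time consumption prediction model.
predicted_times = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0]
greedy = greedy_schedule(predicted_times, n_workers=3)
refined = genetic_refine(predicted_times, 3, greedy)
print("greedy makespan:", makespan(greedy, predicted_times, 3))
print("refined makespan:", makespan(refined, predicted_times, 3))
```

Seeding the population with the greedy solution gives the genetic search a reasonable starting point, and keeping the best candidates in every generation means the refinement can only match or improve on it.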