
Research On Optimizing Computing Resources Scheduling Of Supercomputer

Posted on: 2020-10-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J H Feng
GTID: 1488306548492444
Subject: Computer Science and Technology
Abstract/Summary:
At present, supercomputers are widely used in fields such as petroleum exploration, aerospace, biomedicine, climate and meteorology, ocean simulation, advanced energy and materials, and basic science, and have become important tools for scientific and technological innovation. Supercomputers offer two resource utilization modes: free mode and charge mode. In free mode, users can use computing resources at no cost once their applications are approved. In charge mode, users sign business contracts and pay for the core hours they use. User job submission behavior, job running characteristics, and service fairness requirements in charge mode differ significantly from those in free mode. Existing resource scheduling research for free mode therefore cannot be directly adopted on charge-mode supercomputers, which significantly reduces their resource utilization efficiency. This has become a challenging problem that needs to be solved urgently.

The actual effect of a supercomputer resource scheduling strategy depends heavily on accurate analysis of user behavior and workload characteristics. Because research on supercomputer resource management under charge mode is lacking, we collect job logs from two typical charge-mode supercomputers, Tianhe-1A and Sugon 5000A. Using statistics, clustering analysis, and other analytical methods, we analyze in depth the cyclical patterns of user behavior and their relationship with job-related variables and service types. For the first time, we propose the concept of quota-constrained waiting time and study its correlation with job runtime, computation scale, and response time. The analysis reveals the typical user behavior and workload characteristics of charge-mode supercomputers and yields a series of conclusions that differ from previous related studies, including: (1) User behavior is not affected by job runtime and waiting time. (2) Thinking time is positively related to the
number of cores and the core time of the job. (3) User behavior is related to the type of service. (4) Quota-constrained waiting time accounts for the main component of waiting time. (5) Traditional resource scheduling strategies cannot effectively reduce job waiting time. These results provide a basis for optimizing supercomputer resource scheduling strategies under charge mode.

Under charge mode, supercomputers widely adopt quota constraint mechanisms to ensure service fairness. These mechanisms prevent users from consuming excessive amounts of computing resources. However, they also prevent idle resources from being allocated in time to quota-constrained jobs in the queue, which reduces resource utilization. To solve this problem, we propose a fairness-efficiency tradeoff strategy for quota-constrained supercomputers. The proposed condition-triggering mechanism ensures that jobs within their quotas have priority in resource use. At the same time, an adaptive upper limit on resource allocation and a long-term fairness resource allocation approach improve overall resource utilization while preserving service fairness. Simulations based on real job logs show that the proposed approach increases resource utilization from 85% to 96%, close to the 97% of Non-Quota, while the fairness index decreases only from Quota's 0.57 to 0.51, much higher than Non-Quota's 0.26.

To further improve resource utilization, supercomputers have widely adopted backfill scheduling strategies. These strategies rely on the expected runtime of each job and schedule short jobs to run first while long jobs are waiting for resources. However, under charge mode, application fields are diverse and there are many types of programs. Worse, security and privacy requirements make it impossible to obtain the specific running parameters of jobs. Therefore, it is more
difficult to predict the precise runtime of jobs. To this end, we propose a method for classifying and predicting job runtime based on runtime monitoring. By combining historical job logs with monitoring information from the current job run, we apply a gradient-boosted decision tree algorithm and a deep neural network for classification and regression prediction. Within 5 minutes after a new job starts to run, we can predict its runtime with a prediction accuracy of 84.4%. For jobs longer than 8 hours in particular, the F1 score exceeds 72%, which is higher than the 48% achieved by the Last2 similar-job predictor.
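
The backfill idea mentioned in the abstract, running short jobs while a long job waits for resources, can be illustrated with a minimal sketch. This is a generic EASY-style backfill written for illustration only, not the scheduler studied in the dissertation. The names `Job`, `backfill`, `est_runtime`, and the example workload are all hypothetical, and the sketch assumes a single pool of identical cores and trusts each job's runtime estimate.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int
    est_runtime: float  # estimated runtime in hours (hypothetical field)


def backfill(queue, free_cores, running, now=0.0):
    """Return the names of queued jobs to start at time `now`.

    queue:   waiting jobs in priority order (head first)
    running: list of (end_time, cores) pairs for jobs already running
    """
    started = []
    queue = list(queue)
    running = list(running)

    # Step 1: start jobs from the head of the queue while they fit.
    while queue and queue[0].cores <= free_cores:
        job = queue.pop(0)
        free_cores -= job.cores
        running.append((now + job.est_runtime, job.cores))
        started.append(job.name)
    if not queue:
        return started

    # Step 2: reserve a start time ("shadow time") for the blocked head job,
    # i.e. the earliest time at which enough cores will have been released.
    head = queue[0]
    avail = free_cores
    shadow_time = float("inf")
    for end_time, cores in sorted(running):
        avail += cores
        if avail >= head.cores:
            shadow_time = end_time
            break
    extra = avail - head.cores  # cores left over once the head job starts

    # Step 3: backfill later jobs only if they cannot delay the head job.
    for job in queue[1:]:
        if job.cores > free_cores:
            continue
        if now + job.est_runtime <= shadow_time:
            # Finishes before the reservation, so its cores are back in time.
            free_cores -= job.cores
            started.append(job.name)
        elif job.cores <= extra:
            # Runs past the reservation but only on cores the head won't need.
            free_cores -= job.cores
            extra -= job.cores
            started.append(job.name)
    return started


# Hypothetical workload: 10 cores in total, 6 busy until t=4, so 4 are free.
waiting = [Job("A", 8, 10.0), Job("B", 4, 3.0), Job("C", 2, 6.0)]
print(backfill(waiting, free_cores=4, running=[(4.0, 6)]))
```

In this example job A cannot start until the running job releases its cores, so the sketch lets the short job B use the idle cores in the meantime. The decision in step 3 depends entirely on `est_runtime`, which is why inaccurate runtime estimates weaken backfilling and why runtime prediction matters for it.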
Keywords/Search Tags: Supercomputer, User behavior, Quota constrained, Job time prediction, Resource scheduling