
Formal Modeling And Performance Analysis Of Spark System

Posted on: 2020-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Xie
Full Text: PDF
GTID: 2428330575493578
Subject: Computer science and technology
Abstract/Summary:
As a distributed framework for large-scale data processing, Spark has attracted a growing number of users thanks to its excellent performance and compatibility. Evaluating its performance helps maximize the utilization of system resources and reduce the system idle rate. Performance Evaluation Process Algebra (PEPA) is a stochastic process algebra. It constructs the system model algebraically and decomposes the system into interacting components, which clearly reflects the hierarchical structure of the system and reduces the difficulty of constructing and analyzing the model. Stochastic Petri Nets (SPN) combine an intuitive graphical description with a strict formal definition, and clearly reflect the changing state of the system. This thesis carries out the following work on the performance evaluation and analysis of the Spark system. First, a formal model of the Spark system is built. By analyzing the overall architecture and operating process of the Spark system, the system is abstracted into the interaction of its components, and a PEPA model of the system is obtained. The PEPA model is then converted to an SPN model according to the SPN conversion rules, and the relevant parameters of the two models are given separately. Second, the performance indicators of the system model are extracted. The fluid flow approximation method and the stochastic simulation method are used to extract the important performance indicators of the model: response time, utilization, and throughput. To simulate the process of executing a service in the system, this thesis proposes a service flow acquisition algorithm and a response time simulation algorithm. By tracking the sequence of transitions and recording the time of each transition, the response time is obtained. For the PEPA model, an approximate solution is adopted: the fluid flow approximation method, which approximates the state transition process by differential equations. Third, the performance of the Spark system is evaluated and analyzed along three dimensions: response time, utilization, and throughput. The evaluation analyzes how parameters such as the number of tasks, the activity transfer rates, and the number of components affect the system performance indicators.
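To make the fluid flow approximation mentioned in the abstract more concrete, here is a minimal Python sketch. It does not reproduce the thesis's Spark model, which the abstract does not specify. It assumes a hypothetical two-component PEPA-style model: tasks cycle between Waiting and Running, executors cycle between Idle and Busy, and both activities (`assign`, `run`) are shared, so each flow is bounded by the smaller of the two cooperating populations. The function name, parameter names, and rates are all illustrative assumptions, and the response time is derived with Little's law rather than with the thesis's simulation algorithm.

```python
def fluid_flow(n_tasks, n_executors, r_assign, r_run, t_end=20.0, dt=0.001):
    """Euler integration of the fluid ODEs for a toy task/executor model.

    Hypothetical model (not taken from the thesis):
      Task:     Waiting --assign--> Running --run--> Waiting
      Executor: Idle    --assign--> Busy    --run--> Idle
    Both activities are shared, so each flow is limited by the smaller
    of the two cooperating populations (PEPA's minimum semantics).
    """
    waiting, running = float(n_tasks), 0.0
    idle, busy = float(n_executors), 0.0

    for _ in range(int(t_end / dt)):
        # Flow of the shared 'assign' activity: needs a waiting task and an idle executor.
        assign_flow = r_assign * min(waiting, idle)
        # Flow of the shared 'run' activity: needs a running task and a busy executor.
        run_flow = r_run * min(running, busy)

        # Net movement of population during one Euler step.
        delta = (assign_flow - run_flow) * dt
        waiting -= delta
        running += delta
        idle -= delta
        busy += delta

    throughput = r_run * min(running, busy)    # completed tasks per time unit
    utilization = busy / n_executors           # fraction of busy executors
    response_time = n_tasks / throughput       # Little's law: mean time per task cycle
    return {
        "throughput": throughput,
        "utilization": utilization,
        "response_time": response_time,
    }


if __name__ == "__main__":
    # Illustrative parameters only: 100 tasks, 10 executors.
    print(fluid_flow(n_tasks=100, n_executors=10, r_assign=2.0, r_run=1.0))
```

With these illustrative parameters the populations settle where the assign flow equals the run flow, which gives an executor utilization of about 0.67. Varying `n_tasks`, `n_executors`, or the rates in such a sketch mirrors the kind of parameter study the abstract describes, although the actual thesis model is presumably richer.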
Keywords/Search Tags: Spark, PEPA, SPN, Fluid Flow Approximation, Performance Analysis