
Research On Performance Modeling Of Spark Computing Framework Based On GPU

Posted on: 2020-09-05
Degree: Master
Type: Thesis
Country: China
Candidate: M. L. Wu
Full Text: PDF
GTID: 2370330590984263
Subject: Engineering
Abstract/Summary:
The latest research in machine learning, deep learning, and data analysis poses new requirements and challenges for existing computing systems and architectures. To address these challenges, Spark, an open-source big data computing framework, has emerged. Spark extends the earlier MapReduce programming framework and, through in-memory computing, removes MapReduce's bottlenecks of poor fault tolerance and high I/O load. Spark optimizes the key problems of batch processing, interactive query, and stream computing in big data, and is fully compatible with Hadoop and its ecosystem. However, Spark is limited by CPU efficiency and memory capacity, and the gap between application requirements and system performance grows day by day. As Spark's computing platform, the CPU can no longer meet the needs of efficient computing. Compared with the CPU, the GPU has inherent advantages in high-performance parallel computing; by exploiting the parallel computing resources of the GPU, the task-processing efficiency of a Spark system can be greatly improved. Based on a thorough understanding of the advantages of the Spark computing framework, this thesis analyzes system performance and models the process of GPU parallel computing under the Spark framework.

At present, performance modeling of the GPU-based Spark computing framework is still at a primary stage. To mine and analyze the GPU-based Spark framework in depth, this thesis proposes a Spark+GPU system modeling method based on queuing theory. It selects the M/M/n/m mixed-system model with multiple service windows and derives the stationary distribution and the main performance indexes of the system, which can guide performance optimization of an online system. The main contributions of this thesis are as follows:

(1) For the first time, queuing theory is used as a modeling tool to construct a quantitative mathematical model of Spark+GPU. According to the characteristics of GPU multi-threaded parallel operation, the M/M/n/m queuing model with multiple service windows (a mixed system) is selected and its calculation method is given.

(2) The basic M/M/n/m mixed-system model does not consider the order of data arrival: it processes the data of all applications first-come, first-served, without distinction, ignoring the inherent priority characteristics of practical applications. To address this problem, a non-preemptive priority M/M/n/m model is proposed, which reduces the average queuing time of data in the system and improves operating efficiency. A comparison of test data from the two models shows that, as system traffic increases, the average waiting time of the priority queue is significantly less than that of the non-priority queue.

(3) For the M/M/n/m mixed-system model, under the GPU's regular memory bandwidth, an appropriate number of threads is obtained via Matlab simulation and data planning so as to minimize the average waiting time of data in the GPU and thereby maximize the utilization of GPU resources.
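To illustrate the kind of calculation the queuing model above supports, the following Python sketch computes the stationary distribution and main performance indexes of an M/M/n/m mixed system (n service windows, total capacity m), then picks the smallest number of service windows that meets a target mean waiting time, in the spirit of contribution (3). The rate values, function names, and the delay-target criterion are illustrative assumptions, not formulas or data taken from the thesis itself.

```python
from math import factorial

def mmnk_metrics(lam, mu, n_servers, capacity):
    """Stationary distribution and performance indexes of an M/M/n/m
    mixed queue: arrival rate lam, service rate mu per window,
    n_servers service windows, total system capacity (capacity >= n_servers)."""
    a = lam / mu  # offered load
    # Unnormalised stationary probabilities p_k (k customers in system)
    probs = []
    for k in range(capacity + 1):
        if k <= n_servers:
            probs.append(a**k / factorial(k))
        else:
            probs.append(a**k / (factorial(n_servers) * n_servers**(k - n_servers)))
    p0 = 1.0 / sum(probs)
    probs = [p0 * p for p in probs]
    p_block = probs[-1]                    # blocking (loss) probability
    lam_eff = lam * (1 - p_block)          # effective arrival rate
    L = sum(k * p for k, p in enumerate(probs))   # mean number in system
    Lq = sum((k - n_servers) * p
             for k, p in enumerate(probs) if k > n_servers)  # mean queue length
    W = L / lam_eff                        # mean sojourn time (Little's law)
    Wq = Lq / lam_eff                      # mean waiting time in queue
    return {"p0": probs[0], "p_block": p_block,
            "L": L, "Lq": Lq, "W": W, "Wq": Wq}

def min_servers_for_delay(lam, mu, capacity, target_wq):
    """Smallest number of service windows whose mean queueing delay does
    not exceed target_wq -- an illustrative analogue of choosing the GPU
    thread count that minimises average waiting time."""
    for n in range(1, capacity + 1):
        if mmnk_metrics(lam, mu, n, capacity)["Wq"] <= target_wq:
            return n
    return capacity
```

For example, with one window and capacity one (an M/M/1/1 loss system, lam = 2, mu = 3), the blocking probability comes out to lam/(lam + mu) = 0.4 and the sojourn time equals the service time 1/mu, matching the textbook result; raising the window count drives the queueing delay Wq toward zero, which is the trade-off the Matlab simulation in contribution (3) explores.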
Keywords/Search Tags:GPU, Spark, Queuing theory, M/M/n/m modeling