Research on an Extension Method for COStream Oriented to Spark

Posted on: 2022-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2518306572991429
Subject: Computer application technology
Abstract/Summary:
Optimizing the execution efficiency and resource utilization of cloud computing applications is an important research area. An application's configuration information is a key factor affecting the efficiency of the entire cloud operating system. Spark is a popular cloud computing framework. COStream, a dataflow programming language with source-to-source compilation, can fully exploit the parallelism in the target program. Because both COStream and Spark use directed acyclic graphs (DAGs) to record the flow of data and the operations on it, the best configuration information can be generated by using COStream to pre-analyze the internal parallelism of a Spark program. To generate the best configuration information for use as a coordination label in the cloud operating system, we propose a method that automatically generates Spark applications and their best configuration information through COStream pre-compilation. This is much more accurate than configuration specified manually by the programmer. To implement the pre-compilation, the COStream compilation framework is modified, and the syntax and structure of COStream are extended for the Spark platform so that it can generate application files for Spark Core and Spark ML. To generate accurate resource allocation information, the original partitioning algorithm of COStream is modified. We select the key resource configuration parameters of a Spark application and design calculation rules for them by analyzing the internal structure of the program and the correlations between the parameters. We use the Word Count, K-Means, and Logistic Regression algorithms for the experiments. The experimental results show that the extended COStream correctly generates the corresponding Spark target program and its configuration information. The resource configuration information generated by analyzing the internal structure of the program is the best resource configuration parameter; it can be used as a resource demand tag in the cloud operating system to optimize program execution efficiency and improve the utilization of the underlying resources.
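
The abstract does not state the calculation rules themselves, so the following Python sketch only illustrates the general idea of deriving Spark resource settings from the structure of an operator DAG. The function names, the "widest DAG level" heuristic, and the `cores_per_executor` and `memory_per_core_gb` defaults are assumptions made for illustration, not the method of the thesis. The configuration keys are standard Spark properties.

```python
from collections import defaultdict, deque


def level_widths(dag):
    """Count the operators at each topological level of a DAG.

    `dag` maps each operator name to a list of its successors.
    A node's level is the length of the longest path reaching it.
    """
    nodes = set(dag) | {v for succs in dag.values() for v in succs}
    indegree = {n: 0 for n in nodes}
    for succs in dag.values():
        for v in succs:
            indegree[v] += 1

    level = {n: 0 for n in nodes}
    ready = deque(n for n in nodes if indegree[n] == 0)
    while ready:  # Kahn's algorithm, tracking the longest-path level
        u = ready.popleft()
        for v in dag.get(u, ()):
            level[v] = max(level[v], level[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    widths = defaultdict(int)
    for n in nodes:
        widths[level[n]] += 1
    return [widths[i] for i in range(len(widths))]


def suggest_spark_conf(dag, cores_per_executor=2, memory_per_core_gb=2):
    """Map the widest DAG level to Spark settings (illustrative rule only)."""
    parallelism = max(level_widths(dag))
    executors = -(-parallelism // cores_per_executor)  # ceiling division
    return {
        "spark.executor.instances": str(executors),
        "spark.executor.cores": str(cores_per_executor),
        "spark.executor.memory": f"{cores_per_executor * memory_per_core_gb}g",
        "spark.default.parallelism": str(parallelism),
    }


# Hypothetical word-count-style DAG: one split fans out to three counters.
word_count_dag = {
    "source": ["split"],
    "split": ["count_a", "count_b", "count_c"],
    "count_a": ["merge"],
    "count_b": ["merge"],
    "count_c": ["merge"],
}
conf = suggest_spark_conf(word_count_dag)
```

The resulting dictionary could be passed to `spark-submit` as `--conf key=value` pairs. A real rule set would also need to account for data size and cluster limits, which this sketch ignores.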
Keywords/Search Tags: Spark framework, COStream streaming programming language, Language extension, Configuration optimization