Design And Implementation Of Tobacco Big Data Analysis System Based On Spark

Posted on: 2020-06-15  Degree: Master  Type: Thesis
Country: China  Candidate: X F Liu
GTID: 2428330572981088  Subject: Engineering
Abstract/Summary:
In recent years, with the rapid development of the Internet, big data and cloud computing have become leading topics in the IT industry and academia. In producing cigarettes, tobacco companies generate many kinds of data, including production process data, product quality data, equipment operation data, production environment data and other multi-dimensional data. To meet consumers' expectations for product quality, tobacco companies are paying closer attention to cigarette quality control. Faced with a large volume of historical tobacco production data, data processing systems based on relational databases have high construction costs and limited analytical capability when storing, processing and analyzing that data. Apache Hadoop and Spark are among the most popular big data processing frameworks, offering high performance, low cost and high scalability. How to use these frameworks effectively to manage tobacco data is therefore a major challenge for the tobacco industry. In this context, this paper designs and implements a Spark-based tobacco big data analysis system. First, the paper takes the requirements analysis of the tobacco big data analysis system as the starting point of system construction and focuses on the system's functional and non-functional requirements. It divides the system's functions into main modules, namely tobacco data source management, data query analysis and data analysis, and auxiliary modules, namely rights management and cluster monitoring, which safeguard the stability and security of the system. Second, based on a study of the open-source Spark framework and the analysis of system requirements, the paper designs the infrastructure of the big data analysis system. Using a layered model, it divides the system into a four-layer architecture consisting of a storage layer, a computing layer, a Web service layer and a user operation layer, which lays the foundation for implementing the tobacco big data analysis system. Finally, the paper verifies the correctness and usability of the design and implementation of the Spark-based tobacco big data analysis system through system deployment and functional testing, and shows that the system can meet the big data management and analysis needs of tobacco silk production.
Keywords/Search Tags: Spark, Tobacco silk production line, Data management, Offline batch processing, Real-time monitoring