
Big Data Consolidation And Analysis Platform For Smart Environment Protection

Posted on: 2018-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: R D Fan
Full Text: PDF
GTID: 2428330596990035
Subject: Software engineering
Abstract/Summary:
In recent years, the concept of "smart cities", which involves leveraging fast-developing information technology to improve the governance of cities, has attracted a lot of attention. The concept emphasizes sustainable development and improved quality of life, which makes "smart environment protection" a vital part of implementing "smart cities". "Smart environment protection" requires long-term, efficient, reliable and flexible consolidation and analysis of environmental data. However, these data are high-volume, rapidly growing and heterogeneous; they mix stream data with batch data and come from a variety of sources. These features pose major technical challenges for developing "smart environment protection" applications. To reduce the difficulty of developing such applications and to help implement "smart environment protection", this paper proposed a big data consolidation and analysis platform that supports the consolidation and analysis of environmental big data with long-term practicality.

This paper designed the platform. First, it designed the data consolidation platform. To deal with real-time environmental data, this paper developed a JSON-based standard intermediate format and designed the architecture of a data gateway layer that converts incoming data to that format. In addition, this paper designed a consolidation procedure covering data ingestion, data cleaning, data classification and data storage, based on Kafka and Spark Streaming. For data that are not real-time, this paper designed a NiFi-based consolidation procedure. Second, this paper designed the data analysis platform: a stream and batch data analysis platform based on HDFS, YARN, Spark and Spark Streaming, together with an architecture of temporary storage databases based on MongoDB and Redis. Third, this paper designed the architecture of a data presentation interface for data presentation applications, consisting of a data presentation API layer and the temporary storage databases. Finally, this paper designed the operations and support system based on Ambari and SaltStack.

This paper also implemented the platform. It first configured the operating environment, then implemented the key components, including the data gateways, a data cleaning and classification job based on Spark Streaming, and an integrated data presentation API server. The implementation shows that the design of the platform is feasible. To further evaluate the platform's abilities, this paper also designed and implemented a "smart environment protection" application case, "regional PM2.5 air pollution 24-hour forecast". This application uses both consolidated real-time environmental data and environmental data that are not real-time; it trains the prediction model with big data batch processing, generates the forecast with big data stream processing, and presents the forecast through visualization. A case study of this application shows that the platform is able to support "smart environment protection" applications in consolidating and analyzing environmental big data. This paper also analyzed the availability, scalability and operability of the platform design and benchmarked the platform's performance in the operating environment. The results show that the platform achieves its expected goals.
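The abstract does not spell out the JSON-based intermediate format or the cleaning and classification rules, so the following is only a minimal sketch of how a gateway conversion followed by a cleaning and classification step might look. Every field name (`source`, `metric`, `station_id` and so on), the function names, and the `env.` topic prefix are hypothetical and are not taken from the thesis; the sketch also uses plain Python rather than Kafka or Spark Streaming so that it stays self-contained.

```python
import json
import math


def to_intermediate(source, raw):
    """Gateway step: wrap a source-specific reading in a common JSON record.

    The field names are illustrative assumptions, not the thesis's schema.
    """
    record = {
        "source": source,
        "station_id": raw["station_id"],
        "metric": raw["metric"],
        "value": raw.get("value"),
        "unit": raw.get("unit"),
        "timestamp": raw["timestamp"],
    }
    return json.dumps(record, sort_keys=True)


def clean(message):
    """Cleaning step: return the parsed record, or None if the value is unusable."""
    record = json.loads(message)
    value = record.get("value")
    # Reject missing, boolean, or non-numeric values.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    # Reject NaN, which json.loads accepts by default.
    if isinstance(value, float) and math.isnan(value):
        return None
    return record


def classify(record):
    """Classification step: map a record to a hypothetical destination name."""
    return "env." + record["metric"].lower().replace(".", "_")


if __name__ == "__main__":
    reading = {
        "station_id": "S001",
        "metric": "PM2.5",
        "value": 35.0,
        "unit": "ug/m3",
        "timestamp": "2018-09-03T00:00:00Z",
    }
    message = to_intermediate("station_gateway", reading)
    record = clean(message)
    if record is not None:
        print(classify(record), record["value"])
```

In a deployment like the one the abstract describes, the output of the gateway step would presumably be published to Kafka, with the cleaning and classification logic applied inside a Spark Streaming job; the sketch only illustrates the per-record transformations.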
Keywords/Search Tags: smart environment protection, data consolidation, big data platform, Spark, Hadoop