
Design Of Building Energy Consumption Big Data Storage And Analysis Platform Based On Hadoop

Posted on: 2021-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Lian
Full Text: PDF
GTID: 2392330602982516
Subject: Engineering
Abstract/Summary:
With the acceleration of urbanization, building energy consumption is increasing rapidly, and research on building energy has become a focus of energy conservation and emission reduction in China. In recent years, the development of big data and internet technologies has provided data support for building energy conservation. These technologies can be used to collect, store, and analyze building energy consumption data, which reflects the running status of a building and reveals patterns in its energy consumption, so that building energy can be used efficiently. With the popularization of smart electricity meters and the development of energy-use information collection systems, the volume of building energy consumption data keeps growing. As a result, the performance of traditional relational databases in storing, querying, and processing this data has become a bottleneck, and such databases struggle to meet the new demands of the big data era. This thesis focuses on a building energy consumption big data storage and analysis platform based on Hadoop. The main contents are as follows:

(1) A three-layer architecture for the building energy consumption big data storage and analysis platform was designed. After studying the big data platform architectures currently in wide use, the platform was designed according to the Lambda architecture. In the batch layer, HDFS provides the underlying data storage service, and MapReduce and Spark provide offline computing services. Spark runs in Spark on YARN mode, so that YARN performs unified scheduling and computing resource management for the cluster's computing services; this addresses the limitation that Spark Standalone mode supports only a simple, fixed resource allocation policy. In the real-time processing layer, Spark Streaming and Kafka are integrated to support streaming applications such as energy consumption prediction and energy consumption alarms. In the service layer, HBase and Hive provide data query and analysis services. Hive is configured with two computing engines, Hive on MR and Hive on Spark, so that users can switch between them according to their computing needs.

(2) A method for adjusting the number of jobs based on the YARN resource scheduler was designed. The method dynamically adjusts the number of MR jobs that are in the running state on the cluster, which eliminates manual parameter tuning. Experiments showed that, compared with the default configuration, the method reduces MR job running time by about 53% with the Capacity Scheduler and by about 14% with the Fair Scheduler.

(3) A general RDD weight calculation model was proposed to represent the importance of each RDD. Based on it, an automated Spark checkpoint-setting method was designed so that developers no longer need to rely on experience to choose when to checkpoint and which data to checkpoint. Experiments showed that this method can improve the recovery efficiency of Spark applications.
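
The three-layer Lambda design described in (1) can be illustrated with a minimal, in-memory Python sketch. It is not the thesis's implementation: plain dictionaries stand in for HDFS/HBase and for the Spark Streaming state, and the names `Reading`, `build_batch_view`, `SpeedView`, and `query` are hypothetical, chosen only for illustration. The sketch shows the general idea that the service layer answers a query by merging a precomputed batch view with an incrementally updated real-time view.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One meter reading (hypothetical schema, not taken from the thesis)."""
    building_id: str
    hour: int      # hour index of the reading
    kwh: float     # energy consumed in that reading


def build_batch_view(readings):
    """Batch layer: recompute hourly totals from the full stored dataset."""
    view = defaultdict(float)
    for r in readings:
        view[(r.building_id, r.hour)] += r.kwh
    return dict(view)


class SpeedView:
    """Real-time layer: aggregate readings that the batch view has not absorbed yet."""

    def __init__(self):
        self.totals = defaultdict(float)

    def update(self, reading):
        self.totals[(reading.building_id, reading.hour)] += reading.kwh


def query(batch_view, speed_view, building_id, hour):
    """Service layer: merge the batch and real-time views for one key."""
    key = (building_id, hour)
    return batch_view.get(key, 0.0) + speed_view.totals.get(key, 0.0)


# Historical data goes through the batch layer; fresh data through the real-time layer.
batch = build_batch_view([Reading("b1", 0, 1.5), Reading("b1", 0, 2.0)])
speed = SpeedView()
speed.update(Reading("b1", 1, 0.5))
speed.update(Reading("b1", 1, 0.25))

print(query(batch, speed, "b1", 0))  # 3.5
print(query(batch, speed, "b1", 1))  # 0.75
```

In a real deployment the batch view would be rebuilt periodically and the real-time totals for the covered hours discarded afterwards; that hand-off is omitted here to keep the sketch short.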
Keywords/Search Tags: Big Data, Building Energy Consumption, Lambda Architecture, Hadoop, Spark