| With the rapid construction of the Global Energy Interconnection and the smart grid, a large number of Internet of Things data collection terminals will be connected to the grid. These terminals will produce large volumes of collected data, known as smart grid big data. To meet the demands of smart grid big data analysis, this paper studies the construction of stream processing and batch processing engines for the smart grid and, on this basis, designs an online smart grid big data analysis and decision system. Drawing on research at home and abroad, the paper analyzes the sources and classification of big data in the smart grid and the main requirements of large-scale smart grid data analysis. It also studies the theory of distributed computing, including the distributed computing framework MapReduce, the distributed file systems GFS and HDFS, the distributed application coordination services Chubby and ZooKeeper, and the distributed resource management frameworks YARN and Mesos. Three base models for distributed algorithms are examined: the MapReduce iterative analytic model, the BSP computation model, and the SSP computation model. The paper then studies the requirements of big data stream processing tasks in the smart grid and the application scenarios of three stream processing engines: Storm, Spark Streaming, and Samza. Based on the characteristics of these engines and of big data processing and analysis in the smart grid, Storm was chosen as the stream processing engine for the smart grid online analysis and decision system. A Storm-based VFDT algorithm was applied to analyze, in real time, the safety of power supply and utilization for important customers. This application demonstrated the effectiveness of Storm for real-time analysis of grid data, and a simulated stress test demonstrated the scalability of the Storm stream processing engine. After the study of stream processing, the paper studies the requirements of batch processing tasks in smart grid data analysis and proposes a big data batch processing solution based on Spark. The effectiveness and scalability of this solution were demonstrated by a case study that analyzes electrical load with a Spark-based random forest algorithm. Finally, building on the above research, the paper analyzes the requirements of the smart grid big data online analysis and decision system and designs an overall architecture and all of its modules, which can guide the subsequent software development work. |