Thermal Power Plant Energy Saving Analysis Based On Spark Big Data Platform

Posted on:2017-01-28

Degree:Master

Type:Thesis

Country:China

Candidate:X Zhang

GTID:2308330503457289

Subject:Control Science and Engineering

Abstract/Summary:

For a long time to come, China's coal-dominated energy structure will remain: coal will account for no less than 50% of total energy, and the share of thermal power is unlikely to fall below 50%. Therefore, faced with the most rigorous emission standards, power companies must take all available measures to save energy in the generation process. Power companies have already accumulated a wealth of historical data from the operation of boilers, turbines, and other equipment. This thesis aims to discover the energy-saving potential of a power plant using real-time and offline data from a DaTong State Grid Corporation plant, and builds models of several plant-related indicators on the Spark big data computing platform. The main work is as follows:

(1) Survey the current development stage of the plant and the problems it faces. Analyze the energy-saving potential of plant equipment and the process links where optimization is possible. Build a prediction model for nitrogen oxides from data covering a certain period; the model performs well.

(2) Study the RDD, the core abstraction of Spark, and call the Random Forest and Gradient Boosting Regression Tree algorithms in Spark MLlib from Scala to complete predictive modeling of nitrogen oxides. Submit the program to the YARN resource management system and save the results to HDFS. Compare the two methods from several perspectives.

(3) Analyze and compare the pros and cons of different computational frameworks, and select suitable hardware and software systems for this study. Likewise, compare the advantages and disadvantages of different storage systems and select the required one. Collect the machine learning algorithms that can run on the Spark platform and test them on the platform.

(4) Collect data from the plant's PI real-time database, then clean, align, and otherwise preprocess it to convert it into Spark's labeled-point (LabeledPoint) format. Build the Hadoop and Spark big data platforms.

(5) Analyze the modeling results and extract the tree variables near the root. Explore the relationship between the pollutants and the measuring points, analyze the main economic indicators of the power plant, and mine the operating parameters at different loads that keep coal consumption low.

The results show that the Random Forest algorithm on the Spark platform achieves good prediction results: with appropriately set model parameters, the mean squared error of prediction reaches 0.0478, and its time consumption is within an acceptable range. Random forest and gradient boosting decision tree algorithms were also applied to k-step prediction of nitrogen oxides and tested accordingly. The results indicate that, for this prediction problem, the random forest model has an advantage over the gradient boosting tree: it learns the characteristics of the data and models and predicts nitrogen oxides accurately.

Finally, an outlook is given on energy conservation and on the modeling and analysis of other related aspects of the plant, together with constructive suggestions for future work on this study.
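The abstract mentions k-step prediction and a mean squared error but does not say how the training samples were constructed. The following is a minimal sketch of one common sliding-window framing, written in plain Python rather than the Scala and Spark MLlib code used in the thesis. The function names, the `window` parameter, and the univariate input are assumptions made for illustration only, not the author's method.

```python
from typing import List, Sequence, Tuple


def make_k_step_samples(
    series: Sequence[float], window: int, k: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, label) pairs for k-step-ahead prediction.

    Hypothetical helper: each feature vector holds the `window` most recent
    readings up to time t, and the label is the reading at time t + k.
    """
    if window < 1 or k < 1:
        raise ValueError("window and k must be positive integers")
    features: List[List[float]] = []
    labels: List[float] = []
    # t is the index of the last reading included in the feature vector.
    for t in range(window - 1, len(series) - k):
        features.append(list(series[t - window + 1 : t + 1]))
        labels.append(series[t + k])
    return features, labels


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
```

With a series of six readings, `window=3`, and `k=2`, the helper yields two samples, each pairing three consecutive readings with the value two steps later. On Spark, each such pair would map naturally onto one labeled training record.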

Keywords/Search Tags:

Power Plant, Spark, Hadoop, Big Data Machine Learning, Distributed Algorithm, Decision Tree, Random Forest, Gradient Boosting Regression Tree

Related items

1	Research On Code Plagiarism Detection Model Based On Random Forest And Gradient Boosting Decision Tree
2	Research On Structure-Activity Relationship Of Semiconductor Materials Based On Machine Learning
3	Research On The Application Of Machine Learning In Commodity Recommendation Based On Spark Environment
4	Lending Risk Assessment Solution Based On Machine Learning Classification Algorithm
5	Research On D2D Power Control Algorithms Based On Data-Driven And Optimization-Driven
6	Research On Real Estate Price Prediction Based On Internet Search Data
7	Parallel Research And Application Of Machine Learning Algorithm Based On Cloud Platform
8	Research On Database Intrusion Detection Based On Random Forest
9	Research Of Machine Learning Algorithm For Broadcasting Spectrum Signal Processing
10	Forestnet: A Learning Architecture Combining Deep Networks And Decision Forest