Analysis And Processing Of Power System Log Data Based On Spark

Posted on:2018-07-30

Degree:Master

Type:Thesis

Country:China

Candidate:J L Tu

Full Text:PDF

GTID:2322330542951810

Subject:Engineering

Abstract/Summary:

With the continuous expansion of power dispatching automation systems, the volume of log data that must be processed in real time has increased dramatically. Limited by disk performance, log data cannot be processed in time, which causes delays and fails to meet real-time requirements. In-memory processing is therefore a good choice. In recent years, with the rapid development of cloud computing and big data technology, the use of big data technology to extract potentially useful information from massive amounts of real-time log data has received widespread attention, and Spark stands out among open source computing frameworks. As a tool built on top of Spark, Spark Streaming provides batch-based real-time processing, which can meet the dispatching automation system's demand for processing real-time data within a given period. However, as the data stream reception rate and other aspects of the operating environment change, a static batch interval and block interval lead to higher end-to-end latency and processing time, and can even cause system instability.

To address this problem, this thesis studies the impact of these intervals and analyzes the processing flow of Spark Streaming in depth. Considering the effect of bursts of log traffic on the dispatching automation system and the characteristics of log data, data filtering and format layout are used to reduce system processing time. To address the effect of the batch interval on the end-to-end latency of a single query task, a dynamic adjustment algorithm based on fixed-point iteration is proposed. It converges quickly to the optimal batch interval without requiring a step size to be chosen, so that the system's end-to-end latency reaches its best value. To address the effect of the block interval on resource utilization and on the processing time of multi-query tasks, a dynamic adjustment strategy based on a greedy algorithm is proposed. It finds the optimal block interval promptly and reduces the processing time of multi-query tasks. Together, these measures meet the demand for rapid analysis of real-time log data streams, so that the power dispatching automation system can be monitored, analyzed, and predicted comprehensively and from multiple angles.

Based on the existing Spark Streaming, this thesis develops the improved DASpark Streaming system to implement the above functions and builds an experimental platform. Its performance is compared with that of the existing Spark Streaming using real-time log test data from a prefecture-level dispatch center. The results show that the improved DASpark Streaming has clear advantages: it effectively reduces the system's end-to-end latency and processing time and improves resource utilization.
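
The abstract does not give the details of the fixed-point iteration used to adjust the batch interval, so the following Python sketch only illustrates the general idea and is not the thesis's algorithm. It assumes that a stable batch interval x satisfies processing_time(x) = rho * x for a target load ratio rho below 1, and it repeatedly applies x <- processing_time(x) / rho, which needs no step size. The names `fixed_point_batch_interval`, `toy_processing_time`, and `rho`, as well as the linear cost model, are hypothetical and chosen purely for illustration.

```python
from typing import Callable


def fixed_point_batch_interval(
    processing_time: Callable[[float], float],
    initial_interval: float,
    rho: float = 0.8,
    tolerance: float = 1e-3,
    max_iterations: int = 50,
) -> float:
    """Iterate x <- processing_time(x) / rho until successive values agree.

    Assumption: the desired interval is the fixed point where the processing
    time equals rho times the batch interval (rho < 1 leaves some headroom).
    """
    interval = initial_interval
    for _ in range(max_iterations):
        next_interval = processing_time(interval) / rho
        if abs(next_interval - interval) < tolerance:
            return next_interval
        interval = next_interval
    # Did not converge within max_iterations; return the latest estimate.
    return interval


def toy_processing_time(interval: float) -> float:
    """Hypothetical cost model: 0.2 s fixed overhead plus 0.5 s of work
    for every second of buffered data. Not measured from a real system."""
    return 0.2 + 0.5 * interval


estimate = fixed_point_batch_interval(toy_processing_time, initial_interval=2.0)
print(f"estimated batch interval: {estimate:.3f} s")
```

In this toy model the iteration converges because the slope of the processing-time curve (0.5) is smaller than rho (0.8). In a running system the processing time would come from observed batch statistics rather than a fixed formula, and convergence would depend on how the workload behaves.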

Keywords/Search Tags:

dispatching automation system, log data, Spark Streaming, interval, latency

Related items

1	Design And Implementation Of Stream Data Anomaly Detection Framework For Power Dispatching Automation System Based On Machine Learning
2	Research And Implementation Of AFC System's Real-time Data Distributed Processing
3	Research On Online Detection Of Power Abnormality Based On Spark Streaming
4	Research On Application Of Elasticsearch In Grid Dispatching Data Management
5	Design And Implementation Of Load Forecasting System For Power Dispatching Based On Spark
6	Research And Application Of User Behavior Anomaly Detection Algorithm In Smart Home Environment Based On Streaming Data
7	Research On Business Anomaly Detection Method Of Power Dispatching Automation System Based On Machine Learning
8	Research And Implementation Of Real-time Fake License Plate Detection System Based On Big Data Analysis
9	Dynamic Modeling And Optimization For Combustion System Of The Boiler In Power Plant Based On Spark
10	Design And Implement An Edge Streaming Data Processing Framework For Autonomous Driving