Lossless Compression And Cloud Warehouse Storage Research Of Measurement Large Sets Information

Posted on:2015-08-06

Degree:Master

Type:Thesis

Country:China

Candidate:L Guo

Full Text:PDF

GTID:2272330422484551

Subject:Traffic Information Engineering & Control

Abstract/Summary:

As networks expand and smart devices spread, the smart grid is developing toward the interaction of energy and information, massive information processing, and intelligent dispatching. Distribution network characteristics such as a growing number of measurement points and rapidly changing, highly fluctuating analog quantities are particularly prominent. Dispatching and monitoring systems must store this information continuously, which produces large data sets that both obstruct information transmission and slow the query processing of application information. Accessing these massive data sets directly tends to cause delays and slow access, and in serious cases delays or drops key information, which directly threatens operational safety and real-time control. Electric power data centers have not yet reached the level that smart grid computing requires in massive data storage, automated management, and high availability, so processing measurement data is one of the key problems in intelligent distribution network technology. Effective storage and compression technology for large data sets is needed.

This thesis studies real-time storage and compression of massive railway dispatching information flows. It uses the Hadoop cloud computing framework and the Hive data warehouse framework to solve the storage problem of electric power dispatching information flows and to ensure safe grid operation and a reliable power supply. To address large-data compression and storage in intelligent scheduling systems, a new distributed cluster lossless compression method based on a cloud framework is proposed, built on the Hadoop framework and the Map/Reduce distributed parallel programming model and combined with Hive.

First, using public information relationships, objects for dispatching-monitoring public information and key business data information flows are established, which solves the integration problem for massive information. Then lossless compression methods based on dictionary coding and statistical coding are classified and compared, and the scheduling host and monitoring server are deployed as cloud computing nodes in a cluster network configuration. The Deflate, Gzip, BZip2, and Lzo lossless compression codecs are integrated into the cluster nodes to establish an experimental environment for lossless compression of scheduling monitoring information.

Taking dispatching section measurement logs as an example, tests of the different compression formats on the same section log sets show that the BZip2 cluster compression ratio is higher than that of the other three. When the section log sets exceed three million, the compression ratio surpasses 81%; as the section log sets grow further and the Hive data warehouse framework is used, the BZip2 compression ratio surpasses 85%, which makes it better suited to compressing historical monitoring information flows. The Lzo cluster compression method is faster and better suited to real-time information processing in dynamic measurement and control. The results meet the 2 s dynamic refresh requirement of engineering applications for large data sets in railway networks.
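The codec comparison described in the abstract can be illustrated on a single machine. The following is a minimal sketch, not the thesis's Hadoop cluster experiment: it uses only Python's standard-library Deflate (zlib), Gzip, and BZip2 codecs (Lzo is omitted because it is not in the standard library), a synthetic log whose record format is hypothetical, and it assumes "compression ratio" means the fraction of space saved.

```python
import bz2
import gzip
import random
import time
import zlib


def make_section_log(n_records, seed=0):
    """Build a synthetic measurement log (hypothetical format, not the thesis data)."""
    rng = random.Random(seed)
    lines = []
    for i in range(n_records):
        point_id = rng.randrange(500)            # hypothetical measurement point
        value = 27.5 + rng.uniform(-0.5, 0.5)    # hypothetical analog quantity
        stamp = f"2015-01-01T00:{(i // 60) % 60:02d}:{i % 60:02d}"
        lines.append(f"{stamp},P{point_id:04d},{value:.3f},OK")
    return "\n".join(lines).encode("utf-8")


# Standard-library codecs only; Lzo would need a third-party package.
CODECS = {
    "deflate": (zlib.compress, zlib.decompress),
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
}


def compare_codecs(data):
    """Return space saving and compression time for each codec."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Lossless check: decompression must restore the input exactly.
        assert decompress(packed) == data
        results[name] = {
            "saving": 1 - len(packed) / len(data),
            "seconds": elapsed,
        }
    return results


if __name__ == "__main__":
    log = make_section_log(100_000)
    for name, r in compare_codecs(log).items():
        print(f"{name:8s} saving={r['saving']:.1%} time={r['seconds']:.3f}s")
```

The figures this prints depend on the synthetic data and the machine, so they are not comparable to the cluster results reported above; the sketch only shows how space saving and speed can be measured side by side for several lossless codecs.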

Keywords/Search Tags:

smart grid scheduling, large data sets, Hadoop cloud computing, Hive data warehouse, Map/Reduce, lossless compression, distributed cluster, public information

Related items

1	Research On Distributed Compression Storage And Fault Tolerance Of Monitoring Information Flow Of Power Supply System Of Electrified Railway Based On Hadoop
2	Key Technology Of QAR Data Organization And Analysis Based On Hadoop
3	Study Of Rapid Lossless Compression Technology Based On Spectral Data And Cloud Platform
4	Research On Compressed Access Of Railway Power Supply Monitoring Information Based On Big Data Components
5	Load Modeling Accurate Data Acquisition Based On Cloud Computing
6	A Study On The Task Scheduling Strategy Of Power Cloud Data Center
7	Research On Method Of Fast Distribution And Task Scheduling For Big Data Stream In Smart Grid
8	Research And Implementation On Electricity Line Loss Analysis Based On Hadoop Technology
9	Research On Data Processing And Analysis For Electrical Equipment Condition Monitoring Using Hadoop
10	Research On Application Of Cloud Computing In Power System