
Design And Implementation Of Big Data Foundation Platform For Hydropower Enterprises

Posted on: 2019-04-05  Degree: Master  Type: Thesis
Country: China  Candidate: J Feng  Full Text: PDF
GTID: 2359330569995899  Subject: Engineering
Abstract/Summary:
As informatization in hydropower enterprises has advanced, these enterprises have accumulated large amounts of structured and unstructured data, and massive volumes of real-time data could potentially be collected as well. Data is now an intangible enterprise asset, and the need for data-driven development is urgent. At present, hydropower enterprises mostly build their data centers on traditional architectures, which suffer from poor scalability, high construction and operating costs, support for only a single data type, and low data processing efficiency. Such architectures cannot keep up with the rapid growth in storage and processing demand for all types of data in the big data era, and cannot support the deep use of potential data assets that hydropower enterprises require.

To address these problems, this thesis surveys the information and data resources of hydropower enterprises, analyzes the requirements for a big data foundation platform, and designs and implements a big data foundation platform with a hybrid architecture. The platform consists of two layers: a data integration layer and a data storage layer.

The data integration layer addresses three types of data integration requirements: structured data from information systems and automation systems, unstructured data, and real-time data. These requirements are summarized along the dimensions of data scenarios, technical methods, data features, trigger mechanisms, and processing steps. The data integration layer automatically collects, sorts, cleans, converts, and stores data into the data storage layer through interface tables, interface data files, interface calls, and message queues.

The data storage layer comprises a data warehouse platform, a distributed data platform, and a streaming data platform. The data warehouse platform is built on Gbase 8T, a domestic database. The warehouse is partitioned into an Origin Data Model, a Foundation Data Model, an Aggregation Data Model, and a Mart Data Model. Data is integrated and summarized in the warehouse through ETL, so that structured data is classified and stored by business subject domain. The distributed data platform is based on Hadoop: the HDFS distributed file system provides file storage, and the HBase distributed column-oriented database serves as the database, meeting the demands of massive data storage and concurrent access. The distributed data platform is partitioned into an unstructured data area and a streaming data dump area. The unstructured data area stores unstructured data and associates it with structured data. The streaming data dump area provides persistent storage for massive real-time message data. The streaming data platform is built on "Kafka+Storm+Redis" and processes the various real-time messages from the data source layer, achieving efficient, reliable, real-time stream processing and storage.

The big data foundation platform provides centralized storage and integration of all types of river basin data, offers high data processing capability, eliminates the data islands among each enterprise's internal information systems, and lays a foundation for subsequent data mining and data-driven enterprise development.
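
The collect, sort, clean, convert, and store steps of the data integration layer described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the `Record` and `StorageLayer` types, the `integrate` function, the field names, and the rule that routes each record to one of the three storage platforms by data type are all hypothetical, and in-memory lists stand in for the data warehouse, the Hadoop-based platform, and the streaming platform.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Record:
    """Hypothetical record; the abstract does not define a schema."""
    source: str      # e.g. "interface_table", "interface_file", "message_queue"
    kind: str        # "structured", "unstructured", or "realtime"
    timestamp: int
    payload: Any


@dataclass
class StorageLayer:
    """In-memory stand-ins for the three storage platforms (assumption)."""
    warehouse: List[Record] = field(default_factory=list)    # data warehouse platform
    distributed: List[Record] = field(default_factory=list)  # distributed data platform
    streaming: List[Record] = field(default_factory=list)    # streaming data platform

    def store(self, rec: Record) -> None:
        # Route by data type; this routing rule is a simplification.
        targets = {
            "structured": self.warehouse,
            "unstructured": self.distributed,
            "realtime": self.streaming,
        }
        targets[rec.kind].append(rec)


def integrate(records: List[Record], storage: StorageLayer) -> StorageLayer:
    """Sort, clean, convert, and store already-collected records."""
    ordered = sorted(records, key=lambda r: r.timestamp)      # sort by time
    cleaned = [r for r in ordered if r.payload is not None]   # clean: drop empty payloads
    for r in cleaned:
        # Convert: normalize every payload to a trimmed string.
        converted = Record(r.source, r.kind, r.timestamp, str(r.payload).strip())
        storage.store(converted)
    return storage


if __name__ == "__main__":
    result = integrate(
        [
            Record("message_queue", "realtime", 2, " level=3.1 "),
            Record("interface_table", "structured", 1, None),
            Record("interface_file", "unstructured", 3, "report.pdf"),
        ],
        StorageLayer(),
    )
    print(len(result.warehouse), len(result.distributed), len(result.streaming))
```

In a real deployment each list would be replaced by a client for the corresponding platform, and real-time records would additionally be persisted to the streaming data dump area; the sketch only shows the order of the processing steps.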
Keywords/Search Tags: Big data, Distributed data platform, Streaming data platform, Hydropower enterprises