
Design and Implementation of a Hadoop-Based Offline Data Mining System for E-Commerce

Posted on: 2015-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: K S Lin
Full Text: PDF
GTID: 2428330491951286
Subject: Software engineering
Abstract/Summary:
With the explosive growth of data, data processing has become increasingly important. Following cloud computing, which led the previous wave of global technological change, the concept of big data continues to reach into every corner of social life. As big data practice has deepened, large-scale data processing technology built around the Hadoop platform has developed considerably, and the Hadoop ecosystem has grown increasingly rich, providing a wealth of tools for different scenarios and functional requirements.

As the e-commerce business continues to develop, its accumulated data has become increasingly important, especially as its scale keeps growing. How to process the order and warehouse data currently stored on the platform has become an important problem for the e-commerce industry. Major e-commerce companies are paying increasing attention to developing and using big data, treating data as a driving force and turning it into a new growth point and a core competitive strength.

This thesis studies the design and implementation of an offline data mining system for e-commerce. After an in-depth analysis of the Hadoop big data processing platform, it proposes a solution for an e-commerce offline data processing system based on Hadoop. The scheme uses HDFS as the mass data storage system and the Hadoop MapReduce computing framework as the computation model. With data storage and the computation model settled, the thesis focuses on the Hadoop data processing models and selects an appropriate one for the data processing operations of the e-commerce offline data mining system.

The thesis presents a detailed requirements analysis of the system and proposes a system architecture based on the Hadoop data processing platform. The system consists of seven main modules, including the data conversion storage module, the data ETL module, the data view display module, the data flow computing task management module, and the platform monitoring module. The thesis describes the detailed design and implementation of these seven main modules. Finally, on the basis of this data mining system, an ETL instance for warehouse and e-commerce order data and an order data mining instance were completed, and basic tests were carried out.
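
The abstract names MapReduce as the computation model for order and warehouse data. As a rough illustration only, the sketch below simulates the map, shuffle, and reduce phases in memory with plain Python. It is not the thesis's implementation and does not use Hadoop; the record layout and the names `ORDERS`, `map_order`, `shuffle`, `reduce_totals`, and `run_job` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (order_id, warehouse_id, amount).
# The real schema used in the thesis is not given in the abstract.
ORDERS = [
    ("o1", "wh_a", 120.0),
    ("o2", "wh_b", 80.0),
    ("o3", "wh_a", 30.0),
]


def map_order(record):
    """Map phase: emit one (warehouse_id, amount) pair per order."""
    _order_id, warehouse_id, amount = record
    yield warehouse_id, amount


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_totals(key, values):
    """Reduce phase: sum the order amounts for one warehouse."""
    return key, sum(values)


def run_job(records):
    """Run the three phases in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_order(record))
    return dict(reduce_totals(k, v) for k, v in shuffle(pairs).items())


totals = run_job(ORDERS)
```

On a real cluster the input would be read from HDFS and the framework would distribute the map and reduce tasks across nodes; here everything runs in one process purely to show the shape of the computation.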
Keywords/Search Tags: Hadoop, Big data, Hive, Data Mining, Data Warehouse, E-commerce data