Design And Implementation Of Data Cleaning Subsystem For Big Data Of Power Grid

Posted on: 2019-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Lei
Full Text: PDF
GTID: 2348330545455576
Subject: Computer Science and Technology
Abstract/Summary:
The smart grid takes the physical grid as its core and combines it with communication, computer, and information technology to form a new type of power network. As countries around the world accelerate investment in and construction of large-scale smart grids, most regions have set up a variety of information management systems, including distribution geographic information systems, user information collection systems, and online monitoring systems for distribution lines. These systems gather a wide range of data. In recent years, researchers have used data mining techniques to process vast amounts of data, discover the regularities embedded in historical data, and predict future events. Because power data is frequently incomplete and inconsistent, data mining cannot be applied to it directly, so the data quality and data cleaning of power data are receiving growing attention and in-depth research.

Against the background of big data and the smart grid, this thesis designs and implements a data cleaning subsystem for power grid big data that provides users with a set of data storage, data quality assessment, and data cleaning methods. Compared with traditional data preprocessing platforms, the subsystem is tailored to the characteristics of power big data: it offers a user-friendly interactive interface, processes grid data in various formats, and improves data quality through data cleaning operations. The subsystem provides not only a wide range of data cleaning techniques for improving the quality of power grid data but also a data quality evaluation model.

First, the thesis describes the background and significance of the system and studies the technologies related to its design and implementation. It then analyzes the system requirements and divides them into three main aspects: data access, analysis, and cleaning. Next, it investigates and proposes solutions to the key issues in implementing the system: a data cleaning method for power grid load data based on time series similarity, and data quality evaluation for power data. Based on the requirements and key issues, it designs the overall architecture and functional modules of the data cleaning subsystem, analyzes typical scenarios in the key modules, and illustrates the implementation of the key algorithms. Finally, the deployment and testing of the subsystem are described to verify the correctness and completeness of the system. The shortcomings of the work are also discussed, and directions for future research are outlined.
Keywords/Search Tags: big data of power grid, data cleaning, data quality