
Application Of Feature Engineering In Detecting Power Stealing Users

Posted on: 2021-01-13, Degree: Master, Type: Thesis
Country: China, Candidate: Y Su, Full Text: PDF
GTID: 2392330605974580, Subject: Applied statistics
Abstract/Summary:
In recent years, society's electricity consumption has grown rapidly, and the number of electricity theft cases has also risen year by year. Electricity theft not only causes large economic losses for the state and for power supply enterprises, but also undermines the safety and orderly operation of the power supply. As the power industry builds out smart grids, data mining has begun to be applied to electricity theft detection. Using users' daily power consumption and meter reading data from 2014 to 2016, this thesis proposes an effective method for detecting electricity theft users, with feature engineering as its main research content. In the data cleaning stage, missing values are filled by regression, while duplicate records and records with other types of missing values are removed, so that each user has at most one record per day. The line graphs of power consumption are then examined for typical differences between electricity theft users and normal users. Two features clearly distinguish the two groups: the periodicity of power consumption and the length of time for which low consumption is maintained. In the feature extraction stage, based on analysis of the data and a review of related papers, six types of features are extracted. These include simple cumulative features and statistical features; to exploit the periodicity of the data as fully as possible, the remaining types mainly describe the rate of change between adjacent values and correlation characteristics. In the feature selection stage, features are grouped by feature type and by time interval, feature subsets are formed from each grouping using an improved single-optimal-feature combination method and a backward search method, and the resulting subsets are compared with a wrapper method. In addition, a method is proposed that selects a feature subset according to the feature importances obtained when modeling with the full feature set. Finally, the selection schemes based on feature type and on feature importance are combined, each combined feature set is modeled separately, and five-fold cross-validation is performed. The best feature selection scheme is chosen by jointly considering the selection scheme, the mean AUC, and the computation speed. To further validate the LightGBM model, it is compared with the XGBoost, Random Forest, and Logistic Regression models, which are commonly used for anomaly detection, in terms of fitting performance and computation speed; the comparison demonstrates the feasibility of LightGBM for detecting electricity theft users.
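The cleaning rule and the feature families named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the "last record wins" deduplication rule, the low-consumption threshold, and the 7-day lag used as a periodicity proxy are all assumptions made for illustration.

```python
from statistics import mean


def dedupe_daily(records):
    """Keep at most one record per (user, day).

    Assumption: when duplicates exist, the last record seen wins.
    """
    latest = {}
    for user, day, kwh in records:
        latest[(user, day)] = kwh
    return latest


def longest_low_run(series, threshold):
    """Longest run of consecutive days with consumption below threshold."""
    best = run = 0
    for x in series:
        run = run + 1 if x < threshold else 0
        best = max(best, run)
    return best


def autocorrelation(series, lag=7):
    """Sample autocorrelation at the given lag (hypothetical periodicity proxy)."""
    n = len(series)
    mu = mean(series)
    var = sum((x - mu) ** 2 for x in series)
    if n <= lag or var == 0:
        return 0.0
    cov = sum((series[i] - mu) * (series[i + lag] - mu) for i in range(n - lag))
    return cov / var


def mean_abs_change_rate(series):
    """Mean absolute relative change between adjacent days.

    Pairs whose first value is zero are skipped to avoid division by zero.
    """
    rates = [abs(b - a) / a for a, b in zip(series, series[1:]) if a != 0]
    return mean(rates) if rates else 0.0


# Toy data: (user, day, kWh); day 1 appears twice for user "u1".
records = [("u1", 1, 5.0), ("u1", 1, 6.0), ("u1", 2, 0.2),
           ("u1", 3, 0.1), ("u1", 4, 7.0)]
daily = dedupe_daily(records)
series = [daily[("u1", d)] for d in sorted(d for (_, d) in daily)]
features = {
    "longest_low_run": longest_low_run(series, threshold=1.0),
    "mean_abs_change_rate": mean_abs_change_rate(series),
}
```

Features of this kind would then be passed to a classifier such as LightGBM and scored by mean AUC under five-fold cross-validation; that modeling step is omitted here.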
Keywords/Search Tags: Detection of electricity theft users, LightGBM, Feature engineering