| In the era of big data, the power system is developing towards intelligence and informatization. At the same time, the safe and reliable power supply of the smart grid faces new challenges. In particular, during power transmission and distribution, some enterprises and individual users steal electricity, which seriously damages the economic interests of the power grid company. Moreover, the losses caused by electricity theft make the actual operating load of the distribution network greater than the expected load, which poses a serious hidden danger to the supply security of the power system. Electricity theft methods are now more advanced, diversified and covert, and they increasingly outpace the relatively backward traditional techniques for detecting abnormal electricity consumption and the existing consumption management model. It is therefore imperative to study abnormal electricity consumption detection methods that meet current needs. The construction and development of the strong smart grid and the ubiquitous power Internet of Things provide new opportunities for data-driven electricity theft detection. In-depth analysis and mining of user load data can reveal the abnormal consumption patterns of electricity thieves, which helps improve the efficiency and accuracy of electricity theft detection. In this paper, ensemble learning models are used to detect anomalies in power user load data, and the following research work is carried out: (1) The general flow and steps of data processing are analyzed from the perspective of data mining, and algorithm modeling and model evaluation for abnormal electricity consumption detection are carried out. (2) Extensive preprocessing is applied to the raw load data, including outlier screening, missing value imputation and standardization. Appropriate preprocessing effectively improves data quality and reduces the impact of noise in the raw data set on the model. To address the problem that the class imbalance of electricity theft data biases the model towards the majority class, SMOTE oversampling is used to augment the original electricity theft samples, and PCA is used to reduce the feature attributes to a specified number of dimensions, which both prevents the "curse of dimensionality" from drowning out the informative attributes and shortens the running time of the algorithm. (3) AdaBoost and Bagging ensemble learning models for abnormal electricity consumption detection are designed to exploit the "wisdom of the crowd" and produce a better detection model. (4) A genetic algorithm is used to optimize the hyperparameters of the models to further improve detection performance. |
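
The oversampling step in item (2) can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote`, its parameters, and the toy `theft_profiles` data are hypothetical, and the sketch assumes numeric, already standardized feature vectors. Each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: each row stands for one standardized load profile
# labelled as electricity theft (the minority class).
theft_profiles = [
    [0.1, 0.2, 0.1],
    [0.2, 0.1, 0.0],
    [0.0, 0.3, 0.2],
    [0.1, 0.1, 0.3],
]
new_samples = smote(theft_profiles, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region already occupied by the theft class rather than duplicating existing rows. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be preferred over a hand-written one.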