Smart meters integrate functional modules such as energy measurement, data collection, and remote charge control. They are simple to operate and offer many functions, but as their functions grow richer and their structures more complex, their fault types have become more diverse. Accurately identifying the fault type of a smart meter helps operation and maintenance personnel formulate appropriate maintenance measures and shorten repair time, thereby improving the stability of the power collection system and reducing operation and maintenance costs. Classification based on machine learning is an effective way to solve the multi-classification problem of smart meter faults. However, existing smart meter fault sample data is imbalanced across fault types, so the task is essentially an imbalanced multi-classification problem. Solving imbalanced multi-classification problems is both a hot topic and a difficult challenge in machine learning research. In-depth study of this problem and its solutions can not only improve the accuracy of smart meter fault classification but also provide feasible ideas for application areas with similar data distributions, such as sentiment classification and disease classification; the work therefore has both theoretical significance and practical value. On this basis, this paper studies machine-learning-based multi-classification methods for smart meter faults. The main work of the paper is as follows.

Firstly, a classification method for smart meter faults based on the one-versus-all (OVA) framework is studied. According to the characteristics of the existing meter fault data, the original data is analyzed and processed by means of data cleaning and feature engineering. After preprocessing the target fault data, to address the uneven distribution of samples across fault types, a
differential partition sampling ensemble method in the OVA framework is proposed by combining binarization with imbalanced learning. After transforming the original multiclass data into multiple binary data sets, the algorithm takes the numbers of majority and minority samples in each binary training set as the upper and lower limits of an interval, respectively, and sets the sampling numbers by constructing an arithmetic sequence over that interval. To reduce overfitting and the loss of majority-sample information, in each iteration Safe-Random Undersampling (SRU) and BR-SMOTE are proposed to balance the numbers of positive and negative samples according to the sample distribution, and classification models are then built on the balanced data. Experiments are performed on public data sets and the actual meter fault data set. The results show that the proposed method can effectively solve the problem of multiclass imbalanced data classification.

Then, a classification method based on multiple classifier systems is studied. According to the different characteristics of the submodels, a dynamic fusion ensemble classification algorithm based on the trust score is proposed. The algorithm improves the diversity of the overall classification model while maintaining global accuracy. In the selection stage, the models in the classifier pool are first ranked by F1-score, which reflects individual capability, and then selected again by the double-fault (DF) measure, which reflects group capability. In the aggregation stage, according to the distribution of the samples of each class in the local region around a test sample, predictions with low reliability are filtered out dynamically to reduce the interference of poorly performing classifiers with the final prediction. Experimental analysis is performed on public data sets and the actual meter fault
data set. The results show that the proposed method can effectively improve the classification performance of multiple classifier systems.

Finally, a multi-classification model that combines imbalanced learning with the construction of multiple classifier systems is studied. To further improve the accuracy of fault classification, a dynamic balanced ensemble multi-classification model based on the OVA framework is proposed. The model combines binarization with differential partition sampling to balance the samples of each class and obtain multiple balanced training sets. After classification models are built on these balanced sets, for each binary classification problem the model uses the trust-score-based dynamic selection and fusion method to build a multiple classifier system that predicts test samples online. Experimental analysis is performed on public data sets and the meter fault data set. The results show that the proposed method can effectively improve the classification accuracy of the overall model.
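
As an illustration of the sampling schedule described for the differential partition sampling method, the following minimal sketch binarizes a toy label vector in the OVA manner and generates an arithmetic sequence of sampling numbers between the minority and majority counts of each binary problem. The function names `ova_binarize` and `sampling_sizes`, the parameter `n_steps`, and the toy labels are assumptions made for illustration. The sketch does not implement SRU or BR-SMOTE, whose details are not given here; it only shows how the per-iteration target sizes could be set.

```python
def ova_binarize(labels, positive_class):
    """Relabel a multiclass target: 1 for positive_class, 0 for every other class."""
    return [1 if label == positive_class else 0 for label in labels]


def sampling_sizes(n_minority, n_majority, n_steps):
    """Arithmetic sequence of n_steps sizes from n_minority up to n_majority.

    n_steps (how many balanced subsets to build) is an assumed parameter.
    """
    if n_steps < 2:
        return [n_minority]
    step = (n_majority - n_minority) / (n_steps - 1)
    return [round(n_minority + i * step) for i in range(n_steps)]


# Toy label vector with three fault types of very different frequency.
labels = ["A"] * 6 + ["B"] * 30 + ["C"] * 64

plan = {}
for fault_type in sorted(set(labels)):
    binary = ova_binarize(labels, fault_type)
    n_pos = sum(binary)
    n_neg = len(binary) - n_pos
    # Lower limit: minority count; upper limit: majority count.
    plan[fault_type] = sampling_sizes(min(n_pos, n_neg), max(n_pos, n_neg), n_steps=5)

for fault_type, sizes in plan.items():
    print(fault_type, sizes)
```

Under this reading, each size in a sequence would serve as the common target for one iteration: the majority class is undersampled down to it and the minority class is oversampled up to it, so that every iteration yields one balanced binary training set.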
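
The two-stage selection described for the multiple classifier system can be sketched in a similar way. The example below ranks a small pool of classifiers by F1-score and then re-selects among the top-ranked ones using the double-fault measure, taken here in its usual sense as the fraction of samples that two classifiers both misclassify. The pool, the cut-offs `top_k` and `keep`, and the rule of keeping the classifiers with the lowest mean pairwise double fault are assumptions for illustration, not the exact procedure of the paper; the trust-score-based aggregation stage is not covered.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1-score of the positive class, computed from TP, FP and FN counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def double_fault(y_true, pred_a, pred_b):
    """Fraction of samples misclassified by both classifiers (lower is more diverse)."""
    both_wrong = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b != t)
    return both_wrong / len(y_true)


def select_classifiers(y_true, preds, top_k, keep):
    """Stage 1: keep the top_k by F1. Stage 2: keep those with the lowest mean DF."""
    ranked = sorted(preds, key=lambda name: f1_score(y_true, preds[name]), reverse=True)
    survivors = ranked[:top_k]

    def mean_df(name):
        others = [o for o in survivors if o != name]
        if not others:
            return 0.0
        return sum(double_fault(y_true, preds[name], preds[o]) for o in others) / len(others)

    return ranked, sorted(survivors, key=mean_df)[:keep]


# Hypothetical validation labels and predictions of four pooled classifiers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
preds = {
    "A": [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],  # wrong on sample 0
    "B": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],  # wrong on samples 4, 5, 8
    "C": [0, 1, 1, 1, 0, 1, 0, 0, 1, 0],  # wrong on samples 0, 5
    "D": [0, 0, 0, 1, 1, 1, 0, 0, 1, 0],  # wrong on samples 0, 1, 2, 4, 5
}

ranked, selected = select_classifiers(y_true, preds, top_k=3, keep=2)
print("ranked by F1:", ranked)
print("selected after DF:", selected)
```

In this toy pool, classifier C has a higher F1-score than B but shares an error with each of the other two survivors, so the double-fault stage keeps A and B instead. This is the intended effect of combining an individual-capability criterion with a group-capability criterion.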