| With the deepening of China’s digital transformation strategy, the deep integration of the smart grid with new-generation information technologies, such as cloud computing, big data, and artificial intelligence, has endowed user electricity consumption data with new production value. However, while consumers’ electricity data create new value, they also exacerbate privacy, security, and compliance risks. These risks pose challenges to publishing, using, and sharing the data, limiting the broader application of artificial intelligence models in the smart grid. Traditional privacy-preserving data publishing techniques, such as K-anonymity, T-closeness, and L-diversity, are limited to specific application scenarios and data characteristics. They protect only against specific types of attacks, and the high-dimensional data they process have limited usability, which prevents adequate data circulation. In this context, the development of Synthetic Data (SD) generated by artificial intelligence models provides a new path to solving the problem of limited circulation of privacy-sensitive data. Using synthetic data in place of raw data when publishing, using, and sharing users’ electricity consumption data can facilitate the circulation of these data and promote the realization of their value. This paper aims to promote the value sharing and circulation of privacy-sensitive user electricity consumption data in the smart grid. It explores the use of Generative Adversarial Network (GAN) models to build data generation techniques that synthesize highly usable user electricity consumption data with low privacy risk. The main contributions of this work are as follows: (1) To address the difficulty of adapting the Generative Adversarial Network to the generation of high-dimensional power time-series data, this study proposes a data generation model based on an improved Generative Adversarial Network for electricity consumption
side scenarios, named WDCGAN (Wasserstein Deep Convolutional Generative Adversarial Network). Our work focuses on three challenges that Generative Adversarial Networks face in generating high-dimensional power time-series data: limited feature extraction capability for discrete time-series data, lack of controllability over the synthetic data, and an unstable training process. The model replaces the fully connected layers in the Generative Adversarial Network with multiple convolutional layers, adds auxiliary conditional information to the input layers of the generator and discriminator, and replaces the Jensen-Shannon divergence with the Wasserstein distance. The trained WDCGAN model can stably generate high-quality, high-dimensional, labelled power time-series data. The experimental results show that the synthetic data generated by WDCGAN have good usability and anonymity, and that the generation process offers good security. This confirms that replacing consumers’ electricity data with WDCGAN-generated synthetic data is an effective strategy for data circulation, use, and sharing. (2) To address the problem that the limited data of individual electricity suppliers make it difficult to train the WDCGAN model adequately, our work proposes a data generation model based on a deep combination of WDCGAN and federated learning, named FL-WDCGAN (Federated Learning WDCGAN). Exploiting the structural characteristics of the Generative Adversarial Network, the model builds a single generator network on the central server and a discriminator network locally at each edge supplier, so that the central generator learns the data features of the distributed suppliers in a federated and adversarial manner. Compared with the traditional federated GAN model, the FL-WDCGAN model has a more concise structure, more robust data privacy protection, and higher communication efficiency. The experimental results and the security analysis of FL-WDCGAN show that the synthetic data generated by FL-WDCGAN have
good usability and anonymity. Meanwhile, the generation process is privacy-preserving. FL-WDCGAN confirms that adequately training Generative Adversarial Network models while supplier data remain local is an effective strategy for improving the diversity and availability of synthetic data. (3) To address the leakage of model parameter privacy during the training and sharing of the FL-WDCGAN model, our work proposes a privacy-preserving scheme that combines FL-WDCGAN with Rényi Differential Privacy (RDP), named RDP-FL-WDCGAN. The scheme de-sensitizes only the gradient information of the central generator of the FL-WDCGAN model, which reduces the amount of noise injected into the model parameters while still protecting their privacy. RDP-FL-WDCGAN preserves the update direction of the original gradient, which helps improve the usability of the differentially private model. The experimental results show that the RDP-FL-WDCGAN scheme not only ensures the usability of the synthesized data but also significantly enhances the privacy and security protection of the model parameters. This encourages suppliers to participate in generative model training and enables synthetic data sharing. Through the above research, our work addresses the conflict between sharing smart grid users’ electricity consumption data and privacy leakage. Owing to the powerful capability of artificial intelligence models to learn from data, the synthetic data are highly similar to the original data and can replace them for sharing with the public. The three Generative Adversarial Network model-driven data generation schemes proposed in this paper for high-dimensional power time-series data sharing scenarios realize the value sharing and circulation of privacy-sensitive data. Our work provides new possibilities for sharing and
circulating privacy-sensitive data in the electricity sector and other industries. |
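
As a rough illustration of the gradient de-sensitization step described in contribution (3), the sketch below clips a gradient vector to a fixed L2 norm and adds Gaussian noise, then tracks the accumulated Rényi differential privacy cost. The abstract does not specify the noise mechanism, so the Gaussian mechanism, the function names `sanitize_gradient` and `rdp_epsilon`, and the parameters `clip_norm` and `noise_multiplier` are all assumptions made for illustration, not the authors' implementation. The accounting uses the standard result that a Gaussian mechanism with noise multiplier σ satisfies (α, α/(2σ²))-RDP per step, composed additively, and ignores any subsampling amplification.

```python
import math
import random


def sanitize_gradient(grad, clip_norm, noise_multiplier, rng):
    """Clip `grad` to L2 norm `clip_norm`, then add Gaussian noise.

    Hypothetical helper: the noise standard deviation is
    noise_multiplier * clip_norm, as in the usual Gaussian mechanism.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # Scaling (rather than truncating coordinates) keeps the gradient direction.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


def rdp_epsilon(alpha, noise_multiplier, steps):
    """RDP cost of order `alpha` after `steps` Gaussian-mechanism releases.

    Each step costs alpha / (2 * sigma^2); RDP composes by addition.
    Assumes alpha > 1 and no privacy amplification by subsampling.
    """
    return steps * alpha / (2.0 * noise_multiplier ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = sanitize_gradient([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1, rng=rng)
    print("sanitized generator gradient:", noisy)
    print("RDP epsilon at alpha=8 after 100 steps:", rdp_epsilon(8, 1.1, 100))
```

In this sketch only the generator-side gradient passes through `sanitize_gradient`, mirroring the abstract's statement that noise is applied to the central generator's gradient information alone.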