| The smart grid is the future direction of power system development. Because building loads account for a large share of power system demand and are highly dispatchable, building energy management is one of the main targets of demand-side energy management. This paper takes smart buildings as its research object, aiming to make them better suited to demand-side energy management and to reduce peak load and electricity cost by scheduling electrical loads without compromising occupant comfort. We focus on energy optimization for smart building users and study the load characteristics and energy optimization of electrical equipment in smart buildings. The system structure of a smart building with photovoltaic generation is described, and the fixed loads, interruptible loads, shiftable loads, and electric vehicles in smart buildings are modeled according to device characteristics and user habits. To apply reinforcement learning to energy optimization of smart building clusters, a Markov decision process for the cluster is established, comprising the observations of the smart building agents, their actions, the rewards obtained from performing those actions, and the state transition function governing the agents' interaction with the environment. Based on users' historical load information, an energy optimization method for smart building clusters based on the Deep Policy Gradient (DPG) algorithm is developed under a time-of-use tariff. The method combines deep neural networks with reinforcement learning, which can greatly improve the applicability and convergence speed of the algorithm model. A deep policy gradient framework for energy optimization in smart building clusters is established to solve the scheduling and control problem for electrical devices in the cluster and to find the optimal control strategy, which reduces users' energy costs and smooths consumption peaks without affecting users' energy demand or reducing their comfort. The model is optimized by incorporating the probability distribution of equipment operating states in an interactive mode of offline and online learning. A comparative analysis of the DQN and DPG algorithms shows that, for the smart building cluster energy optimization problem, DPG comes closer to the optimization objective than DQN and converges faster and more stably. |
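
To make the scheduling objective concrete, the following is a minimal sketch of how a time-of-use tariff rewards moving a shiftable load out of the peak period. It is not the DPG method described in the abstract: it uses a plain exhaustive search over start hours for a single device, and the tariff values, the device power, and the function names `run_cost` and `cheapest_start` are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def run_cost(tariff: Sequence[float], power_kw: float, start: int, duration: int) -> float:
    """Cost of running a constant-power load for `duration` consecutive hours."""
    return sum(tariff[h] * power_kw for h in range(start, start + duration))


def cheapest_start(
    tariff: Sequence[float],
    power_kw: float,
    duration: int,
    earliest: int,
    latest_end: int,
) -> int:
    """Return the start hour inside the user's allowed window with the lowest cost.

    The window [earliest, latest_end) stands in for the comfort constraint:
    the load must start no earlier than `earliest` and finish by `latest_end`.
    """
    candidates = range(earliest, latest_end - duration + 1)
    return min(candidates, key=lambda s: run_cost(tariff, power_kw, s, duration))


# Hypothetical hourly tariff (price per kWh): valley 00-08, flat 08-18, peak 18-22, flat 22-24.
TARIFF = [0.3] * 8 + [0.6] * 10 + [1.0] * 4 + [0.6] * 2

if __name__ == "__main__":
    # Assumed device: a 2 kW appliance that runs for 2 hours and must finish by midnight.
    best = cheapest_start(TARIFF, power_kw=2.0, duration=2, earliest=18, latest_end=24)
    print("start hour:", best)
    print("cost at chosen hour:", run_cost(TARIFF, 2.0, best, 2))
    print("cost if started at 18:00:", run_cost(TARIFF, 2.0, 18, 2))
```

A learned policy replaces this exhaustive search when many devices, several buildings, and uncertain user behavior make enumeration impractical, but the cost signal it optimizes has the same shape.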