| With the rapid development of microgrids, energy stations, and the energy Internet, and with the country’s vigorous promotion of renewable energy to reach carbon peaking and carbon neutrality, distributed renewable sources such as photovoltaic and wind power have been efficiently connected to the grid. Energy trading in the electricity market has gradually opened up, and microgrids increasingly require intelligent, efficient, safe, and economical energy trading and energy management. Because traditional methods struggle with the uncertainty of a dynamically changing environment, energy trading and management methods with adaptive learning ability and strong generalization ability are an important research topic in microgrid development. Deep reinforcement learning is an important branch of artificial intelligence. It uses deep learning to fit a nonlinear value mapping, so it can learn hidden features in high-dimensional information without an accurate model, and it uses reinforcement learning to optimize control decisions through the interaction between an agent and its environment. Because reinforcement learning builds on the Markov property, it is well suited to optimizing energy trading and management. The main work and contributions of this thesis are as follows: (1) Based on the composition and constraints of microgrids, a microgrid system model and an energy flow trading mechanism are designed within a reinforcement learning framework. The state of each microgrid and the energy flows in trading and management change continuously, yet most studies use a discrete formulation, which seriously limits the practicality of those methods. This thesis therefore designs continuous state and action spaces and adopts a two-market trading mode. Internal energy is traded through the regional hour-ahead market to improve the utilization of renewable generation. Meanwhile, futures contract trading and spot trading in the 
wholesale market guarantee the timely supply of energy and the effective use of surplus energy, so that supply and demand are balanced in each microgrid. (2) A value-decomposition-based deep deterministic policy gradient algorithm (V3DPG) is proposed for regional microgrid energy trading. An Actor-Critic framework is used to handle the continuous system state and action spaces. To account for the uncertainty and fluctuation of renewable generation and user demand, a recurrent neural network is introduced into the Critic network, and the Burn-In initialization method is used to recover the hidden state from the data, which yields an implicit prediction of unknown information. To ensure that each microgrid can make energy trading decisions from its own local observations, without global knowledge or an explicit model, value decomposition is introduced into the training process. The resulting trading strategy maximizes profit while preserving the privacy and independent decision-making ability of each regional microgrid. (3) For the energy management of regional microgrids, an energy storage system is added to the system, and the V3DPG algorithm proposed for the energy trading problem is improved. Traditional reinforcement-learning-based studies usually treat the storage system as a buffer that fills the gap between supply and demand after energy trading, and they ignore the interaction between energy trading and storage. In this thesis, the energy storage system is regarded as a virtual market with operating constraints that has the same status as the wholesale market. The state of charge and a virtual price are added to the state set of V3DPG, where they serve as an important reference for evaluating the policy and optimizing the microgrid’s decisions. Then, according to the energy source used for charging and the use of the discharged energy, a storage reward is designed to balance the influence of the virtual market and the wholesale market on the microgrid’s own economic benefits, so 
that the microgrid can effectively exploit the price spread between the two markets to trade efficiently. As a result, the utilization of renewable energy improves, the microgrid’s dependence on the wholesale market decreases, and optimal decisions for regional energy trading and management are obtained. |
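
The value-decomposition idea in contribution (2) can be illustrated with a minimal sketch. The abstract does not state which decomposition V3DPG uses, so the code below assumes the simplest additive form, in which the joint value is the sum of each microgrid's local critic output. The function names `joint_q` and `joint_td_error`, the shared `team_reward`, and the discount factor `gamma` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def joint_q(local_q_values: Sequence[float]) -> float:
    """Joint value under an ASSUMED additive decomposition.

    Q_tot = sum_i Q_i(o_i, a_i), where each Q_i depends only on
    microgrid i's local observation and action.
    """
    return float(sum(local_q_values))


def joint_td_error(
    local_q: Sequence[float],
    next_local_q: Sequence[float],
    team_reward: float,
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """One-step TD error computed on the joint value.

    During training the error is formed from the summed local values;
    at execution time each microgrid only needs its own critic/actor.
    """
    bootstrap = 0.0 if done else gamma * joint_q(next_local_q)
    return team_reward + bootstrap - joint_q(local_q)


if __name__ == "__main__":
    # Two microgrids with hypothetical local critic outputs.
    delta = joint_td_error([1.0, 2.0], [0.5, 0.5], team_reward=1.0, gamma=0.9)
    print(f"joint TD error: {delta:.3f}")
```

Under this additive assumption, the gradient of the joint value with respect to one microgrid's action equals the gradient of that microgrid's own local value, which is one way a policy can be improved from local observations alone.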