| Because traditional energy sources are non-renewable and energy demand keeps growing, the depletion of traditional energy has become a problem that must be faced. New energy sources are being explored and utilized, and microgrids composed of distributed generation such as wind power and photovoltaics have become a focus of attention. However, the output of wind, solar, and other renewable sources depends on the environment and is highly random, which poses a major challenge to the optimal energy management of microgrids. Artificial intelligence has developed rapidly in recent years, and as new intelligent algorithms continue to appear and improve, related techniques have begun to be applied in the power industry. In this paper, deep reinforcement learning is introduced into microgrid energy optimization to cope with this strong randomness. Taking the randomness of wind power, photovoltaic output, and load into account, a microgrid optimal scheduling model is established with the objective of minimizing the expected total economic operating cost, and the model is solved with the deep Q-network (DQN) and deep deterministic policy gradient (DDPG) algorithms, respectively. The DQN algorithm uses a neural network to approximate the state-action value function, outputs the Q value of each state-action pair, and selects the action with the optimal Q value as the optimal policy. The DDPG algorithm separates action from evaluation by setting up an action (actor) network and an evaluation (critic) network: the action network outputs the action, and the evaluation network evaluates that action according to the error and guides the output of the action network to gradually approach the optimal policy. To reduce the complexity of the neural network, the decision variables are solved in two steps: in the first step, the decision variables related to the future state are output by the neural network; in the second step, the remaining decision variables are solved. Monte Carlo sampling is used to generate multiple sets of training curves so that the neural network learns the fluctuations of wind power, photovoltaic output, and load, and training continues until the network converges. The trained network can then be used directly for online optimization of the microgrid, making decisions according to real-time state information. A numerical example verifies the effectiveness of the algorithm. For the energy optimization problem of a microgrid cluster system, a microgrid system architecture based on a multi-agent system is designed, and a microgrid cluster optimization model that accounts for uncertainty is established. The model is solved with the multi-agent deep deterministic policy gradient (MADDPG) algorithm, which adopts a framework of centralized training and decentralized execution: each action network makes decisions according to the environmental information of its own agent, while each evaluation network trains the action network according to the environmental and action information of all agents. A numerical example verifies the effectiveness of the algorithm. |
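
The Monte Carlo generation of training curves mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not specify the noise model, so independent Gaussian perturbations around a forecast profile are assumed here, and the function name `sample_scenarios`, the parameter `rel_std`, and the forecast values are all hypothetical.

```python
import random


def sample_scenarios(forecast, rel_std, n_scenarios, seed=0):
    """Draw Monte Carlo curves around a forecast profile.

    Assumed noise model (not taken from the paper): each point is perturbed
    by independent Gaussian noise whose standard deviation is rel_std times
    the forecast value, then clipped at zero because power cannot be negative.
    """
    rng = random.Random(seed)  # dedicated generator so runs are reproducible
    scenarios = []
    for _ in range(n_scenarios):
        curve = [max(0.0, rng.gauss(p, rel_std * p)) for p in forecast]
        scenarios.append(curve)
    return scenarios


# Hypothetical 6-step photovoltaic forecast in kW (illustrative values only).
pv_forecast = [0.0, 5.0, 12.0, 15.0, 9.0, 2.0]
pv_scenarios = sample_scenarios(pv_forecast, rel_std=0.1, n_scenarios=100)
```

In this sketch, one sampled curve would drive one training episode; wind power and load curves could be sampled the same way from their own forecasts.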