| To keep pace with rapid economic and social development and to address environmental degradation, the modern power grid, driven by innovation in renewable energy technology and by energy transition goals, has gradually evolved into a new form of energy interconnection built on the deep integration of information and physical systems. However, large-scale energy interaction and information fusion among multiple regions complicate the structure and operation of the grid, and the uncertain output that comes with a high penetration of new energy makes stable, coordinated operation more difficult, aggravates active power fluctuations, and puts greater pressure on frequency control and on the balance of tie-line exchange power. Because the dynamic balance between load and the active power of generating units directly reflects whether the system frequency is stable, automatic generation control (AGC), which strategically adjusts and optimizes active power to offset load disturbances, is an effective means of keeping system frequency and power dynamically stable. Under the conditions described above, however, traditional AGC strategies are no longer adequate. It is therefore particularly important to improve the optimal control strategy of the AGC system and its ability to cope with uncertainty. This paper studies cooperative control strategies under a multi-region distributed AGC framework. Exploiting the strengths of deep reinforcement learning in uncertain decision-making problems, the AGC optimal control problem is recast as an uncertain dynamic decision-making problem within the reinforcement learning framework, comprising elements such as the observation state, the scheduling action, and the reward function. The deep reinforcement learning 
AGC optimization strategies, MIAC and CLAC, respectively address the overestimation of target values and the lack of completeness and diversity in the exploration space during strategy optimization. They improve the dynamic control performance of the system in a grid environment with uncertain new energy output and strongly random load fluctuations, thereby mitigating the series of unstable system states caused by strong random disturbances. The main contents of this paper are as follows: (1) The paper briefly describes the practical significance of research on AGC cooperative control and related optimization strategies in a complex, strongly random, interconnected large-grid environment, and reviews the progress made on the problem by domestic and foreign scholars. (2) The principle of AGC and its basic model are described, and the AGC control system is then designed in terms of optimization objectives, model transformation rules based on the reinforcement learning framework, distributed control mode, regional energy response, and unit structure. (3) Reinforcement learning and deep reinforcement learning methods are briefly introduced, and the key technologies and difficult problems relevant to this study are discussed in detail. (4) A deep reinforcement learning algorithm based on the actor-critic structure and multi-network incentives, namely MIAC, is then proposed as the control strategy for automatic generation control. With the optimal objective decision in the control process in mind, the incentive heuristic updating mechanism of the actor-critic (AC) strategy improves the quality of strategy mining and the efficiency of experience exploration. At the same time, an updating method that relatively minimizes the Q-value function is adopted to reduce the optimization bias, steer the strategy objective toward a balance between exploration and exploitation, and thereby obtain the optimal collaborative 
control of AGC. Simulations on an improved IEEE standard two-area power system model and an integrated energy system model show that the proposed MIAC strategy has good dynamic control performance and transfer generalization ability, adapts quickly and optimizes stably in a complex grid environment with strong disturbances, and effectively handles random disturbances in an integrated energy system. (5) Finally, the complex energy conversion modes of the integrated energy system and the supply-demand imbalance under renewable energy penetration continually impose strong random disturbances on the grid, degrading the performance of integrated AGC control. To find an optimal cooperative AGC control method, this paper builds on the MIAC strategy to develop a deep reinforcement learning strategy, CLAC, which targets different exploration horizons, shares advantageous experience among multiple groups of learners, and continuously coordinates the learners' key behavior strategies, so as to improve the system state and achieve optimal coordinated operation. Simulation results show that CLAC covers a more comprehensive exploration process and increases the diversity of behavior strategies. Compared with other algorithms, CLAC has better convergence characteristics and learning performance; it quickly obtains the optimal solution for multi-region AGC cooperation and significantly improves the dynamic control performance of the system under strong disturbances. (6) The paper concludes by summarizing the work, discussing the limitations of the research, and looking ahead to new developments and challenges in the field of automatic generation control. |
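
Item (4) of the abstract mentions an update that keeps the Q-value estimate relatively small in order to curb overestimation of the target value. The abstract does not give the exact rule, so the following is only a minimal sketch of one common way to do this: taking the smaller of two critic estimates when forming the temporal-difference target (clipped double-Q learning). The function name, its parameters, and the two-critic setup are illustrative assumptions, not the MIAC update itself.

```python
def clipped_double_q_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """Return a TD target built from the smaller of two critic estimates.

    Hypothetical helper for illustration only; the abstract does not
    specify how MIAC minimizes the Q-value.
    reward  : immediate reward for the transition
    q1_next : first critic's estimate of Q(s', a')
    q2_next : second critic's estimate of Q(s', a')
    gamma   : discount factor
    done    : True if s' is terminal (no bootstrapping)
    """
    if done:
        return reward
    # Using the minimum makes the bootstrap term pessimistic, which
    # counteracts the upward bias of a single learned critic.
    return reward + gamma * min(q1_next, q2_next)


if __name__ == "__main__":
    # Two critics disagree about the next state's value.
    target = clipped_double_q_target(1.0, q1_next=2.0, q2_next=3.0, gamma=0.5)
    single_critic_target = 1.0 + 0.5 * 3.0  # what the optimistic critic alone would give
    print(target, single_critic_target)
```

With the values above, the clipped target is lower than the target obtained from the more optimistic critic alone, which is the effect the overestimation fix is meant to have.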