| The power grid is a crucial channel for energy transmission, a platform for matching supply and demand, and critical support for the sustainable development of electric energy. It plays an irreplaceable, fundamental role in the modern energy supply system and bears on the country's energy security. The continuing expansion of the country's power grid, the rising share of clean energy, and the growing number of ultra-high-voltage transmission lines have made the grid steadily more complex. On the one hand, the security and stability characteristics and mechanisms of the power system have become increasingly complex, and daily safe operation faces ever more severe challenges; on the other hand, this complexity places higher demands on the automation level of grid operation and dispatch management. Therefore, studying grid tie-line power adjustment methods based on new methods and theories is of great significance for reducing the workload of grid dispatchers, improving the automation level of grid dispatch management, and further promoting the construction of smart grids. In a novel approach, this paper introduces deep reinforcement learning from the perspective of data science, uses deep neural networks to mine the internal characteristics of power system operation data, and fits an end-to-end direct control strategy that maps environmental information to tie-line power adjustment actions. The proposed method overcomes the inherent limitations of traditional methods. The main research results are as follows: 1. An automatic adjustment method for power flow calculation convergence based on an improved deep Q-network algorithm is proposed. The improved deep Q-network algorithm introduces a new method for calculating the target value, adds a prioritized experience replay mechanism, and adopts a dueling neural network structure, which overcomes shortcomings of traditional algorithms such as
overestimation of Q-values, low data utilization, and unstable model training. In addition, a dedicated reinforcement learning environment is modeled for the convergence adjustment of power flow calculation under different load levels of the power system, and the overall training procedure for the deep neural network model is constructed. The final model can automatically produce a generator start-up and shutdown adjustment scheme that makes the power flow calculation converge under different load levels. This scheme ensures that the active power output of the balancing (slack) generator stays within its rated range and that the system network loss rate remains low. 2. An automatic adjustment method for grid tie-line power based on the proximal policy optimization algorithm is proposed. The proximal policy optimization algorithm is easy to deploy and has good versatility and effectiveness. The method first models the environment of the grid tie-line power adjustment problem as a Markov decision process, focusing on the design of the environmental state representation, the form of the reward function, and the way the strategy is executed. In addition, because the adjustment space of generator active power output is large, generator selection and power compensation methods are designed. The final model can adjust the grid tie-line power continuously, flexibly, and automatically when given only the target tie-line and the target power value. While ensuring adjustment accuracy, it simplifies the adjustment process, avoids tedious manual calculation, and reduces the dependence of the tie-line power adjustment task on practitioners' experience. |
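
The algorithmic components named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas used in the thesis, so the code below assumes the standard textbook forms: a Double-DQN-style target for the "new target value calculation", proportional prioritization for experience replay, mean-subtracted aggregation for the dueling structure, and the clipped surrogate term for proximal policy optimization. All function names and default parameters are hypothetical and chosen only for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation (assumed form): Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, gamma: float, done: bool,
                      next_q_online: List[float],
                      next_q_target: List[float]) -> float:
    """Double-DQN-style target (assumed form).

    The online network selects the next action and the target network
    evaluates it, which reduces Q-value overestimation.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return reward + gamma * next_q_target[best_action]


def replay_probabilities(td_errors: List[float], alpha: float = 0.6,
                         eps: float = 1e-6) -> List[float]:
    """Proportional prioritized replay (assumed form): P(i) ~ (|delta_i| + eps)^alpha."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def ppo_clipped_term(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """PPO clipped surrogate for one sample: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Hypothetical numbers, not results from the thesis.
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, 0.5, False, [0.1, 0.5], [2.0, 4.0]))
    print(replay_probabilities([0.1, 1.0, 2.0]))
    print(ppo_clipped_term(1.5, 2.0))
```

In a setting like the one the abstract describes, the discrete actions of the Q-network would correspond to generator start-up and shutdown choices, while the clipped term would be averaged over sampled tie-line adjustment steps. That mapping is an assumption here, not a description of the thesis's implementation.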