| Coordinated secondary voltage control (CSVC) is a multi-objective control problem, but it is difficult to obtain the optimal control with the traditional quadratic programming method based on a weighted sum of the objective functions. This paper therefore improves the traditional CSVC model and solves it from the perspective of multi-objective optimization. The objectives are to minimize the voltage deviation of the dominant node and the variance of the generator reactive power output proportions within a partition, while coordinating the actions of substation capacitors/reactors and automatic voltage regulators (AVRs); the result is the multi-objective coordinated secondary voltage control (MOCSVC) model. Based on the control features of MOCSVC and the requirements of online optimization, this paper presents a new solution method called state sensitivity based reduced reinforcement learning (SSRRL). To accelerate the propagation of the reward value, SSRRL proposes a new definition of the state function and, through a global search before the main loop, locates the initial point and autonomously compresses the state space, greatly improving search efficiency. Moreover, during the main-loop search SSRRL uses an adaptive criterion, based on state sensitivity, for dividing the learning phases, which balances search against the use of learned experience. It also adopts an action selection mechanism that extends the variable selection range of a single action to all control variables, so that the search covers as much of the state space as possible within a limited number of cycles.
In addition, to reflect the current preference information of the system, this paper introduces the concept of a real-time weight coefficient and uses it to select the optimal control from the Pareto front (PF). A case study based on time-domain simulation of actual operating data from a large provincial power grid validates the superiority of SSRRL and the real-time weight coefficient in four respects: PF quality, optimization time, convergence rate, and control effect. The results show that SSRRL meets the online optimization requirements for convergence rate and computation time while obtaining a superior Pareto front. Moreover, MOCSVC with the real-time weight coefficient achieves an excellent control effect and reflects the current system preferences well. |
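
The two objectives and the weight-based selection from the Pareto front can be illustrated with a small sketch. It is not the paper's implementation: the function names, the per-unit inputs, and in particular the selection rule (a weighted sum over min-max normalized objectives) are assumptions made for illustration, since the abstract does not state how the real-time weight coefficient is applied.

```python
from statistics import pvariance


def objectives(v_node, v_ref, q_out, q_max):
    """Return (voltage deviation, variance of reactive power output proportions).

    Hypothetical inputs: v_node and v_ref are the measured and target voltage
    of the dominant node; q_out and q_max are per-generator reactive power
    output and capacity within one partition.
    """
    deviation = abs(v_node - v_ref)
    proportions = [q / qm for q, qm in zip(q_out, q_max)]
    return deviation, pvariance(proportions)


def pareto_front(points):
    """Keep the points not dominated by any other (all objectives minimized)."""
    front = []
    for p in points:
        dominated = any(
            all(qi <= pi for qi, pi in zip(q, p))
            and any(qi < pi for qi, pi in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


def select_by_weight(front, weights):
    """Pick the front point with the smallest weighted sum of normalized objectives.

    Assumed rule: each objective is min-max normalized over the front so the
    weights act on comparable scales.
    """
    lows = [min(col) for col in zip(*front)]
    highs = [max(col) for col in zip(*front)]

    def score(p):
        return sum(
            w * ((x - lo) / (hi - lo) if hi > lo else 0.0)
            for w, x, lo, hi in zip(weights, p, lows, highs)
        )

    return min(front, key=score)


if __name__ == "__main__":
    candidates = [(0.01, 0.04), (0.02, 0.01), (0.03, 0.03), (0.005, 0.09)]
    front = pareto_front(candidates)
    # A weight favoring voltage deviation vs. one favoring reactive power balance.
    print(select_by_weight(front, (0.9, 0.1)))
    print(select_by_weight(front, (0.1, 0.9)))
```

Under this assumed rule, shifting the weights toward the first objective selects a front point with lower voltage deviation, and shifting them toward the second selects one with a more even reactive power distribution among generators.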