| Reinforcement learning is a learning method that emerged from the pursuit of machines with advanced artificial intelligence. Its central idea is how a decision-making agent should act according to its environment so as to maximize the reward it obtains, which directly reflects human behavioral psychology, that is, learning continuously through stimulation. Reinforcement learning has therefore been widely used in many control and decision-making problems. Deep reinforcement learning builds on reinforcement learning by introducing deep neural networks for decision-making. The DeepMind team first proposed the DQN (Deep Q-Network) algorithm, which introduces a deep neural network into Q-learning to approximate the decision network, adopts an experience replay mechanism for training, and was successfully applied to large-scale, high-dimensional games with an astronomical number of states. However, because DQN adopts an off-policy control method, in which the value function update and the action selection do not follow the same policy, and because the value function is updated by temporal-difference learning, the algorithm suffers from problems such as slow convergence and low learning efficiency. Solar energy is a clean energy source and, with strong national support, is used for photovoltaic power generation. However, simply expanding the scale does not necessarily achieve the goal of energy saving and emission reduction. It is also necessary to ensure the high-quality, high-efficiency, and healthy operation of photovoltaic power stations, to monitor their operation comprehensively, and to diagnose faults in a timely manner. These measures are of great significance to the high-quality operation of photovoltaic power plants. Therefore, this thesis addresses some of the existing problems of the DQN algorithm and applies the improved algorithm to the fault diagnosis of photovoltaic power plants. The specific work of this thesis is as follows: (1) On the basis of
studying reinforcement learning and deep reinforcement learning algorithms, the thesis focuses on analyzing DQN, a value-function-based deep reinforcement learning algorithm. Because DQN uses an experience replay mechanism during training, the algorithm cannot learn completely during the learning process, which makes convergence difficult; an optimization of the experience replay logic is therefore proposed. (2) SARSA learning uses a single policy both to update the value function and to select new actions, instead of assuming the most favorable action at each step in order to find the best future reward. Compared with Q-learning, SARSA learning is more conservative and more cautious. Therefore, the SARSA algorithm is introduced into the DQN algorithm and combined with experience replay: each sample is assigned a priority that determines its sampling probability, so that the sampling probability remains monotonic with respect to priority. (3) The principles of photovoltaic power plant systems are studied and common photovoltaic faults are analyzed. To address missing photovoltaic monitoring data, the smoothness of the photovoltaic data itself is exploited and a compressed sensing algorithm is used to fill in the missing data. (4) An analysis of the photovoltaic fault diagnosis process and of the idea behind deep reinforcement learning shows that the two are similar. Starting from the elements of reinforcement learning, such as environment, state, policy, and reward, a DQN-based photovoltaic fault diagnosis model is designed, with the focus on the design of the reward function. (5) Fault diagnosis experiments are carried out with the DQN-based photovoltaic fault diagnosis model. The experimental results show that the deep reinforcement learning method, which applies to a wide range of fault diagnosis problems, can effectively solve the problem of photovoltaic fault diagnosis. |
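
The contrast in item (2) between Q-learning and SARSA, and the idea of a priority-based sampling probability, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the discount factor `gamma`, the exponent `alpha`, and the proportional form of the sampling rule are all assumptions made for illustration.

```python
import numpy as np

def q_learning_target(reward, next_q_values, gamma=0.9, done=False):
    """Off-policy TD target: bootstrap from the greedy (max-value) next action."""
    if done:
        return float(reward)
    return float(reward + gamma * np.max(next_q_values))

def sarsa_target(reward, next_q_values, next_action, gamma=0.9, done=False):
    """On-policy TD target: bootstrap from the action the policy actually took."""
    if done:
        return float(reward)
    return float(reward + gamma * next_q_values[next_action])

def sampling_probabilities(priorities, alpha=0.6):
    """Assumed proportional rule: p_i = priority_i**alpha / sum_k priority_k**alpha.

    For positive priorities and alpha > 0, a higher priority always yields a
    higher sampling probability, so the mapping is monotonic.
    """
    scaled = np.asarray(priorities, dtype=float) ** alpha
    return scaled / scaled.sum()

# Hypothetical values, for illustration only.
next_q = np.array([1.0, 3.0, 2.0])   # Q(s', a) for three actions
q_target = q_learning_target(1.0, next_q, gamma=0.5)            # uses max Q(s', a)
s_target = sarsa_target(1.0, next_q, next_action=2, gamma=0.5)  # uses Q(s', a')
probs = sampling_probabilities([1.0, 2.0, 4.0])
```

When the action actually taken in the next state is not the greedy one, the SARSA target is lower than the Q-learning target, which is one way to read the "more cautious" behavior described above.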