It is important for many fields to recognize the image and make the decision to output the corresponding action or decision,especially in the automobile unmanned vehicle,medical robot and so on.Deep learning is a technique for unsupervised image recognition without additional manual tagging in the middle.Reinforcement learning is a good strategy for learning continuous decision problems by optimizing accumulated future reward signals.The combination of the two makes the deep reinforcement learning to realize a new algorithm to recognize image to action selection.It can directly according to the input image to realize control function,it is a kind of similar to human learning methods of artificial intelligence method,its characteristic is to human beings by the sensory information such as vision,then through deep neural network output corresponding action directly.Deep reinforcement learning has the potential to enable robots to achieve truly autonomous learning skills.Deep reinforcement learning has achieved remarkable results in theory and application,which is of great significance to the development of artificial intelligence.Based on the theory of depth of reinforcement learning related to identify the image in video games,and according to the different image information output corresponding action strategy,for example,in the game,down,left,right,attack,etc.The specific work contents of this paper include:(1)this paper used in reinforcement learning is Q-learning algorithms,but sometimes learn algorithm does not conform to the actual function,the action of high value because it contains a high tend to estimate value function maximization step.In previous studies,overestimation was not sufficient for effective and flexible function approximation and noise.Studies show that overestimation occurs when the action value prediction is inaccurate,which can have a negative effect on the stability of training especially in practice.This article USES the Double step Q-learning,it can be applied to arbitrary function approximation,including Deep neural Network application in Double step Q-learning to form Double step DQN(Double Deep Q-learning Network)method to solve the problem of too high estimate.(2)In addition,the training of deep learning requires a large number of sample data,and the sample used in the data set will have a high correlation problem.In this paper,by adding a blend of different model of neural network is called a fusion model of neural network structure,the diversity of the neural network structure is different from sample data,and experience in playback mechanism in the process of sampling to reduce the correlation of the sample.The simulation results show that the double deep reinforcement learning algorithm not only produces more accurate estimation,but also improves the training stability.Moreover,the control strategy was successfully learned,and the scores in several video games were much higher than the original depth reinforcement learning.This suggests that the original DQN overestimated really learned isn't the best strategy,To reduce the excessive estimation is beneficial,at the same time,through the way of model integration to further improve the score in the depth of intensive study in video games. |