In recent years, with the help of deep neural networks, traditional machine learning has broken through many bottlenecks in computer vision, natural language processing, and other fields, and has achieved remarkable results. Similarly, traditional reinforcement learning methods can cope with high-dimensional state and action spaces once deep neural networks are embedded. For example, the classic Deep Q-Network (DQN) algorithm can learn policies directly from game images and has achieved unprecedented results in many Atari games. Reinforcement learning that uses deep neural networks to approximate value functions or policies, that is, deep reinforcement learning, automatically learns the mapping between input states and output actions by minimizing a target loss. Although deep reinforcement learning models perform outstandingly on some specific tasks, their learning process and working mechanism are not transparent, so people cannot perceive the basis or reliability of their decisions. This lack of interpretability makes the model's decisions questionable and greatly limits the application scenarios of deep reinforcement learning. In addition, decisions that lack interpretability may cause harm in real tasks; for example, an automatic force-deployment model that lacks interpretability may generate an incorrect battle plan for our side and pose a serious threat to the lives and safety of our soldiers. It is therefore necessary to improve the interpretability and transparency of such models.

To this end, this paper studies the interpretability of deep reinforcement learning, attempting to visualize the basis of the neural network's decisions and to use that visualization as the most intuitive explanation of the agent's behavior. The most commonly used visualization method for neural networks is the saliency map. According to the level of model access available to the user, this paper proposes two saliency map generation algorithms as ways to explain the agent's decisions, together with an interpretation enhancement module that strengthens the effect of the first two methods, and designs experiments to demonstrate their usability. The main work of this paper is as follows:

1. The perturbation-based visualization of deep neural networks and the perturbation-based explanation of agent decisions are studied. Building on previous methods, a new perturbation-based saliency map generation algorithm is proposed. When facing a trained model whose structure and internal parameters are entirely unknown, the algorithm makes full use of transformations of the input to explore the decision basis of the black-box model: Gaussian blur is applied to different parts of the model input, and the difference between the action values of the original state and the perturbed state is measured. Starting from two dimensions, this difference is used to design a saliency map generation algorithm that computes the influence of each part of the input on the decision result, yielding the basis of the agent's behavior (a minimal sketch of the perturbation procedure is given after this summary). Comparative experiments demonstrate the importance of filtering in the new method and its improvement over previous work. At the same time, using this method to visualize the model's strategies at different training stages provides an additional reference for choosing the number of training rounds.
2. The gradient-based visualization of deep neural networks is studied, and a gradient-based saliency map generation algorithm is proposed to explain the agent's decisions. When facing a trained model whose structure and internal parameters are known, the algorithm starts from the last layer of the model and computes the gradients of the feature maps to obtain the weight of each feature map with respect to the saliency map. The weights with a positive influence are used to weight the features captured in the feature maps, forming a positive explanation of the current decision; the weights with a negative influence on the other actions are used to weight those features, forming a reverse explanation of the current decision. Together, the two generate a saliency map for the decision, yielding the basis of the agent's behavior (a sketch of this gradient-based procedure is given after this summary). Experiments are designed to prove the effectiveness of the method through strategy improvement of the visualized agent; the method is also compared with the perturbation-based approach, and the advantages and disadvantages of the two are summarized.

3. Deep reinforcement learning with attention is studied, and an attention-based interpretation enhancement module for deep reinforcement learning is proposed. This module can be flexibly added to different deep reinforcement learning models and trained together with them without affecting the efficiency of the original model; it makes it easier for the model to learn useful features and reduces the interference of useless ones. Used in conjunction with the methods proposed in the previous two points, it enhances the interpretation effect (a sketch of such a module is given after this summary). Experiments are designed to compare against models without the attention mechanism to verify the module's gain in interpretability, and against other models that introduce attention to verify its superiority.
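A minimal sketch of the perturbation-based idea from point 1, assuming black-box access to the agent only through a `q_values(state)` function that returns the action-value vector for a single grayscale frame; the function name, blur strength, and stride are illustrative assumptions rather than the exact settings used in the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def perturbation_saliency(state, q_values, sigma=3.0, stride=4):
    """state: (H, W) grayscale frame; q_values: callable returning a 1-D action-value array."""
    h, w = state.shape
    blurred = gaussian_filter(state, sigma=sigma)   # globally blurred copy of the frame
    base_q = q_values(state)
    saliency = np.zeros((h, w))
    for i in range(0, h, stride):
        for j in range(0, w, stride):
            # Blend the blurred copy into a local Gaussian window around (i, j),
            # i.e. remove information only from that region of the input.
            mask = np.zeros((h, w))
            mask[i, j] = 1.0
            mask = gaussian_filter(mask, sigma=sigma)
            mask /= mask.max()
            perturbed = state * (1.0 - mask) + blurred * mask
            # Saliency score = how much the action values change when this
            # region is made uninformative (squared distance of the Q vectors).
            saliency[i, j] = 0.5 * np.sum((base_q - q_values(perturbed)) ** 2)
    # Smooth the sparse grid of scores into a dense map over the frame.
    return gaussian_filter(saliency, sigma=stride)
```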
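A Grad-CAM-style sketch of the gradient-based saliency map from point 2, assuming white-box access to a PyTorch value network whose last convolutional module is passed in as `conv_layer`; the particular way the positive and reverse explanations are combined here is an illustrative assumption, not necessarily the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def gradient_saliency(model, state, conv_layer):
    """state: (1, C, H, W) input tensor; conv_layer: the model's last conv module."""
    store = {}

    def save_activation(module, inputs, output):
        store["fmap"] = output

    def save_gradient(module, grad_in, grad_out):
        store["grad"] = grad_out[0]

    h1 = conv_layer.register_forward_hook(save_activation)
    h2 = conv_layer.register_full_backward_hook(save_gradient)

    q = model(state)                  # action values, shape (1, num_actions)
    action = int(q.argmax(dim=1))

    # Positive explanation: gradient of the chosen action's value w.r.t. the
    # feature maps, averaged per channel and clipped to positive weights.
    model.zero_grad()
    q[0, action].backward(retain_graph=True)
    pos_w = store.pop("grad").mean(dim=(2, 3), keepdim=True).clamp(min=0)

    # Reverse explanation: gradients of the competing actions, keeping only the
    # negative per-channel weights (features that argue against those actions).
    model.zero_grad()
    (q.sum() - q[0, action]).backward()
    neg_w = store.pop("grad").mean(dim=(2, 3), keepdim=True).clamp(max=0)

    fmap = store["fmap"]
    cam = F.relu((pos_w * fmap).sum(dim=1) - (neg_w * fmap).sum(dim=1))
    h1.remove()
    h2.remove()
    # Upsample the coarse map to the input resolution so it can overlay the frame.
    return F.interpolate(cam.unsqueeze(1), size=state.shape[-2:], mode="bilinear")
```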
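A minimal sketch in the spirit of the interpretation enhancement module from point 3: a small spatial-attention block that can be inserted after the convolutional stack of an existing deep reinforcement learning network and trained jointly with it, with the learned mask doubling as an explanation. The single 1x1-convolution gating design and the residual combination are illustrative assumptions, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Re-weights feature maps spatially; the learned mask highlights the
    regions the agent relies on and can be visualized as an explanation."""

    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        logits = self.score(x).view(b, 1, h * w)
        mask = torch.softmax(logits, dim=-1).view(b, 1, h, w)
        # Residual gating keeps the original features, so adding the module
        # does not hinder training of the base model.
        return x * (1.0 + mask), mask

# Hypothetical usage inside an existing DQN's forward pass:
#   feats = self.conv_stack(state)
#   feats, attn_mask = self.attention(feats)   # attn_mask is the explanation
#   q_values = self.head(feats.flatten(1))
```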