
Study On Emergency Escape Route Planning Based On Reinforcement Learning

Posted on: 2021-10-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2518306524969619
Subject: Software engineering
Abstract/Summary:
Path planning is an important research area in artificial intelligence and has been widely applied in national defense, the military, transportation, and robot navigation. Many results have emerged in this field, but most existing studies rely on manually constructed environments and manually supplied environmental data to train the model. Reinforcement learning is a machine learning method that does not require training data to be provided manually. With the development of deep learning in recent years, deep reinforcement learning, which combines deep learning and reinforcement learning, has been greatly developed and applied. The emergence of AlphaGo and AlphaZero demonstrates its broad application prospects. In this paper, deep reinforcement learning is applied to emergency escape path planning.

Firstly, this paper proposes a distributed priority experience substitution (DPES) strategy for deep reinforcement learning. The strategy trains with centralized learning and distributed execution, which both speeds up training and makes the samples in the memory bank more representative. The loss that a sample produces in the model is used as that sample's weight, and the memory bank is organized as a min-heap (small-root heap) ordered by these weights. As the agent acts in the environment, each new sample replaces the sample at the top of the heap; the heap then re-adjusts itself so that its top always holds the sample with the highest replacement priority, namely the one of lowest value to the model. In this way the samples kept in memory remain of high value to the model. The strategy then adopts a priority sampling method based on heap ordinal numbers to address the problem that training is easily influenced by individual abnormal samples.

Secondly, a DDPG path planning algorithm combined with LSTM is proposed. The algorithm takes environment images as input so as to retain the original feature information of the environment as far as possible, reduces their dimensionality with a pre-trained image encoder, and passes the encoded result on to the subsequent stages. Combining the deep reinforcement learning framework DDPG with an LSTM network lets the algorithm handle dynamic changes in the environment: consecutive image frames form the state of the reinforcement learning problem and are fed to the algorithm as time-series data, so the model obtains forward-looking sequence information and can choose actions more efficiently. The DDPG framework serves as the strategy for environment reward and action selection, realizing dynamic path planning based on environment prediction.

Finally, this paper uses the Unity 3D engine to build a simulation platform for emergency evacuation path planning and controls the entities in the environment through an object-oriented design. The DPES strategy and the LSTM-DDPG algorithm proposed in this paper were tested on the simulation platform, which verified the usability of the platform and further demonstrated the ability of the proposed strategy and method to solve the emergency escape path planning problem.
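The heap-organized memory bank described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the class name `HeapReplayMemory`, its methods, and the tie-breaking counter are hypothetical, and the abstract does not give the exact sampling distribution, so drawing each sample with probability proportional to its 1-based heap position is an assumption made purely for illustration. The distributed collection of samples is not modelled here.

```python
import heapq
import itertools
import random


class HeapReplayMemory:
    """Bounded replay memory kept as a min-heap keyed by sample weight.

    Hypothetical illustration: the weight stands for the loss a sample
    produced in the model, as described in the abstract.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self._heap = []                    # entries: (weight, tie_breaker, transition)
        self._counter = itertools.count()  # tie-breaker so transitions are never compared
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._heap)

    def push(self, weight, transition):
        """Store a transition; return the evicted transition, or None."""
        entry = (weight, next(self._counter), transition)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        # Memory is full: the new sample replaces the heap top, i.e. the
        # stored sample with the lowest weight, and the heap re-adjusts.
        evicted = heapq.heapreplace(self._heap, entry)
        return evicted[2]

    def min_weight(self):
        """Weight of the sample currently at the top of the heap."""
        return self._heap[0][0]

    def sample(self, batch_size):
        """Draw transitions by heap position rather than by raw weight.

        ASSUMPTION: probability proportional to the 1-based heap index.
        Using positions instead of loss values means one abnormally large
        loss cannot dominate the batch.
        """
        positions = list(range(1, len(self._heap) + 1))
        picks = self._rng.choices(range(len(self._heap)),
                                  weights=positions, k=batch_size)
        return [self._heap[i][2] for i in picks]


if __name__ == "__main__":
    memory = HeapReplayMemory(capacity=3, seed=0)
    memory.push(0.5, "a")
    memory.push(0.2, "b")
    memory.push(0.9, "c")
    print(memory.push(0.7, "d"))  # evicts "b", the lowest-weight sample
    print(memory.sample(4))
```

Because `heapq.heapreplace` pops the smallest entry and pushes the new one in a single step, the eviction costs O(log n), and the top of the heap is again the lowest-weight sample afterwards.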
Keywords/Search Tags: Reinforcement Learning, Deep Reinforcement Learning, LSTM, DDPG, Path Planning