Research On Virtual Crowd Path Planning Based On Deep Reinforcement Learning

Posted on: 2021-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: J F Zhai
Full Text: PDF
GTID: 2518306476957889
Subject: Instrument Science and Technology
Abstract/Summary:
In recent years, the maturing of virtual reality technology and steady gains in computer performance have laid a solid foundation for research on and application of virtual crowd simulation. Virtual crowd simulation mainly comprises real-time rendering, motion control, and behavior control. Path planning is one of the key technologies in motion control, reflects a basic human behavioral capability, and has become a research hotspot in crowd simulation. However, most existing virtual crowd path planning methods assume a known environment and therefore cannot learn autonomously or adapt to uncertain environments. Even when reinforcement learning is adopted, it easily runs into the curse of dimensionality. The rise of deep reinforcement learning brings new opportunities for path planning research. Research on virtual crowd path planning in unknown environments based on deep reinforcement learning therefore has both theoretical significance and engineering value. The main contents of this thesis are as follows:

(1) The reward function used during policy learning is designed, and a dynamic collision avoidance algorithm based on the velocity obstacle (VO) idea is integrated. The collision cone concept is extended to a collision probability, which is used to predict collisions with dynamic obstacles so that the virtual human can avoid them as early as possible. The reward function also accounts for the relative priority of local collision avoidance and global planning, and penalizes taking too long to find the target, which better matches the needs of path planning and improves the efficiency of policy training.

(2) A deep reinforcement learning policy network based on the proximal policy optimization (PPO) algorithm is designed. A long short-term memory (LSTM) network is introduced in the middle layers of the network. The input layer (the state) includes environment information, the virtual human's motion information, and its goal. Environment information is represented with a ray-casting method. The output layer produces continuous actions, which is consistent with the character of human motion. The policy is shared across the virtual crowd, so a single policy serves the pathfinding needs of all virtual humans.

(3) A two-stage crowd policy learning method is studied. In the first stage, PPO is used to train the virtual humans, and those that acquire path planning ability are selected as leaders. In the second stage, combined with the reciprocal velocity obstacle (RVO) algorithm, the leaders guide a number of virtual humans toward their targets. In this way, path planning for virtual human groups is realized.

(4) A virtual crowd path planning simulation platform based on deep reinforcement learning is constructed. The platform is built on Unity, and the PPO algorithm is implemented through the external Python API. Experiments with the path planning method designed in this thesis realize virtual crowd path planning in an unknown environment. The simulation results show that the method is effective and has certain advantages over existing methods.
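As a rough illustration of the reward design described in item (1), the following Python sketch shows one way a collision cone test could be softened into a collision probability and combined with a progress term and a per-step time penalty. It is not taken from the thesis: the function names, the logistic softening, the weights, and the `softness` parameter are all assumptions chosen for illustration, and the abstract does not state the formula that was actually used.

```python
import math


def _cone_margin(rel_pos, rel_vel, combined_radius):
    """Angular margin (radians) by which rel_vel lies inside the collision cone.

    rel_pos: obstacle position minus agent position, as (x, y).
    rel_vel: agent velocity minus obstacle velocity, as (x, y).
    Returns math.inf if the discs already overlap, and -math.inf if there
    is no relative motion. A positive value means rel_vel is inside the cone.
    """
    dist = math.hypot(rel_pos[0], rel_pos[1])
    if dist <= combined_radius:
        return math.inf
    speed = math.hypot(rel_vel[0], rel_vel[1])
    if speed == 0.0:
        return -math.inf
    # Half-angle of the cone subtended by the obstacle disc.
    half_angle = math.asin(combined_radius / dist)
    cos_a = (rel_pos[0] * rel_vel[0] + rel_pos[1] * rel_vel[1]) / (dist * speed)
    angle = math.acos(max(-1.0, min(1.0, cos_a)))
    return half_angle - angle


def in_collision_cone(rel_pos, rel_vel, combined_radius):
    """Classic hard test: is the relative velocity inside the collision cone?"""
    return _cone_margin(rel_pos, rel_vel, combined_radius) > 0.0


def collision_probability(rel_pos, rel_vel, combined_radius, softness=0.1):
    """Hypothetical soft version: a logistic function of the angular margin."""
    margin = _cone_margin(rel_pos, rel_vel, combined_radius)
    if margin == math.inf:
        return 1.0
    if margin == -math.inf:
        return 0.0
    return 1.0 / (1.0 + math.exp(-margin / softness))


def step_reward(progress, collision_prob, reached_goal,
                w_goal=1.0, w_avoid=2.0, time_penalty=0.01, goal_bonus=10.0):
    """Hypothetical per-step reward.

    progress: reduction in distance to the goal during this step.
    w_avoid > w_goal gives local avoidance priority over global progress
    (an assumed reading of the priority mentioned in the abstract).
    The constant time_penalty discourages taking too long to reach the target.
    """
    reward = w_goal * progress - w_avoid * collision_prob - time_penalty
    if reached_goal:
        reward += goal_bonus
    return reward
```

For example, with an obstacle 10 units straight ahead and a combined radius of 1, a head-on relative velocity `(1, 0)` falls inside the cone and yields a probability above 0.5, while a perpendicular relative velocity `(0, 1)` falls outside it and yields a probability near zero. A smooth probability gives the policy a graded signal before a collision becomes unavoidable, which is one plausible reason to prefer it over the hard cone test.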
Keywords/Search Tags: path planning, deep reinforcement learning, dynamic collision avoidance, PPO, policy network