| With improvements in unmanned aerial vehicle (UAV) hardware performance, UAV applications are steadily increasing. Compared with ground communication equipment, UAVs offer flexible deployment and stable, reliable channels, which has made them a research hotspot in the communications field. To improve the performance of UAV communication systems, various trajectory planning methods have been proposed. Traditional optimization-based trajectory planning methods are not suitable for unknown and changing communication environments, whereas reinforcement learning (RL) does not require global information and can adapt quickly to the environment. Researchers have therefore begun to explore RL-based UAV trajectory planning. In trajectory learning, the trajectory learned by an RL agent depends on the environment it interacts with. However, most existing studies consider only static ground users and line-of-sight wireless channels when modeling the environment. Such models differ greatly from the actual communication environment of a UAV, which makes the research results difficult to apply to real scenarios. To address this issue, and to serve researchers studying the UAV trajectory learning problem, this thesis further studies the distribution, movement, and communication processes of ground users in different terrains and provides a realistic interaction environment for agents. The innovations and contributions of this thesis are summarized as follows: 1. An environment simulation and display platform for UAV agent trajectory learning is designed, comprising an environment simulation module, a communication module, a parametric terrain generation module, and a 3D visualization module. The environment simulation module divides the suburban environment into multiple local environments and designs separate ground user distribution and movement rules for each; the communication module accounts for the influence of the different local environments on the communication channel; the parametric terrain generation module defines rules for generating random maps according to the distribution rules of the local environments; and the 3D visualization module provides different viewing perspectives based on the display function. 2. The platform is built with Python and Unity, and its functions are verified. Python implements the environment simulation, communication, and parametric terrain generation modules; Unity implements the 3D visualization module, with the 3D models adjusted to improve the visual experience. When RL methods are applied to a classic communication UAV trajectory planning problem, the agent obtains effective UAV trajectories, demonstrating the usability of the platform. The interaction environment used in existing research is also compared with the interaction environment that accounts for local environments: the former yields trajectories with multiple collisions, whereas the latter yields collision-free trajectories, which verifies the validity of the platform. |
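
The abstract states that the communication module accounts for the influence of local environments on the channel, but it does not say which channel model is used. As a rough illustration only, the sketch below assumes a commonly used air-to-ground model in which the line-of-sight probability depends on the elevation angle and on environment-specific parameters. The local environment names (`open_field`, `village`, `forest`), all parameter values, and all function names are hypothetical placeholders, not taken from the thesis.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalEnvironment:
    """Channel parameters for one local environment (all values hypothetical)."""
    name: str
    a: float            # shape parameter of the LoS probability curve
    b: float            # slope parameter of the LoS probability curve
    eta_los_db: float   # excess loss on top of free space for LoS links
    eta_nlos_db: float  # excess loss on top of free space for NLoS links


# Placeholder values chosen for illustration; they are NOT from the thesis.
LOCAL_ENVS = {
    "open_field": LocalEnvironment("open_field", 4.0, 0.45, 0.1, 20.0),
    "village": LocalEnvironment("village", 9.0, 0.20, 1.0, 22.0),
    "forest": LocalEnvironment("forest", 12.0, 0.12, 1.5, 25.0),
}


def los_probability(env: LocalEnvironment, elevation_deg: float) -> float:
    """Sigmoid-shaped LoS probability as a function of elevation angle."""
    return 1.0 / (1.0 + env.a * math.exp(-env.b * (elevation_deg - env.a)))


def free_space_path_loss_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB for a link of the given length."""
    c = 299_792_458.0  # speed of light in m/s
    return 20.0 * math.log10(4.0 * math.pi * freq_hz * distance_m / c)


def mean_path_loss_db(env, uav_xyz, user_xy, freq_hz=2.0e9):
    """Expected path loss between a UAV and a ground user at height 0."""
    horizontal = math.hypot(uav_xyz[0] - user_xy[0], uav_xyz[1] - user_xy[1])
    height = uav_xyz[2]
    distance = math.hypot(horizontal, height)
    elevation_deg = math.degrees(math.atan2(height, horizontal))
    p_los = los_probability(env, elevation_deg)
    excess = p_los * env.eta_los_db + (1.0 - p_los) * env.eta_nlos_db
    return free_space_path_loss_db(distance, freq_hz) + excess


if __name__ == "__main__":
    uav = (0.0, 0.0, 100.0)   # UAV position in metres
    user = (100.0, 0.0)       # ground user position in metres
    for env in LOCAL_ENVS.values():
        print(env.name, round(mean_path_loss_db(env, uav, user), 2))
```

Under these assumptions, a user in a more cluttered local environment sees a higher expected path loss for the same geometry. An RL agent trained against such a model is therefore rewarded for approaching or climbing above obstructed users instead of treating every link as pure line of sight.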