| A Collaborative Virtual Environment (CVE) is the product of combining virtual reality technology with network technology. It connects independent virtual reality systems distributed across different geographic locations through a network, allowing multiple agents to interact in a shared three-dimensional environment and accomplish tasks collaboratively. CVE systems have been widely applied in areas such as scientific visualization, co-design, and war simulation. In a virtual environment, an agent can easily get lost, especially in a large-scale, complex one, and it is not easy for agents to adjust their direction by themselves to reach a navigation goal. An important goal in the field of artificial intelligence is to design an agent that can complete tasks independently over a long period, which closely resembles the problem of agent self-navigation control in a collaborative virtual environment. Reinforcement learning theory, an important branch of intelligent systems research, developed from control theory, statistics, psychology, and related disciplines such as cognitive science; it has a long history and has been widely studied. This paper presents an in-depth study of CVEs and collaborative navigation. Based on the characteristics of CVE navigation, an improved collaborative navigation model is proposed that builds on the single-user navigation model. Given the similarity between the CVE collaborative navigation model and the reinforcement learning model, and after analyzing the basis of agent navigation control, the author applies reinforcement learning to navigation control in collaborative virtual environments, focusing on the acquisition of navigation knowledge with the Q-learning algorithm. To improve the algorithm's effectiveness in collaborative navigation control, this paper proposes a shortest-path-based Q-learning algorithm that builds the absolute distance between a mobile agent in the virtual environment and its target into the state function of reinforcement learning. By comparing the later state with the former one, a shortest path can usually be found. In addition, the learning results can be shared among multiple agents, which strengthens their perception of environmental information, helps them learn correct decisions more quickly, and enables effective path finding and navigation control. |
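
The distance-based idea described above can be illustrated with a minimal sketch. It is not the paper's implementation, which the abstract does not specify. The grid world, the Manhattan distance, the reward values, the learning parameters, and all function names (`distance`, `shaped_reward`, `q_update`, `train`, `greedy_path`) are assumptions made for illustration. Here the comparison of the former and later distance is used as a reward signal, and all agents write to one shared Q-table.

```python
import random
from collections import defaultdict

# Hypothetical action set: unit moves on a 2-D grid (an assumption, not from the paper).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def distance(pos, goal):
    """Absolute (Manhattan) distance between an agent and its target."""
    return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])


def step(pos, action, size):
    """Apply a move and clamp the agent to the grid boundaries."""
    x = min(max(pos[0] + action[0], 0), size - 1)
    y = min(max(pos[1] + action[1], 0), size - 1)
    return (x, y)


def shaped_reward(prev, nxt, goal):
    """Compare the former and later distance; reward values are assumed."""
    if nxt == goal:
        return 10.0
    return 1.0 if distance(nxt, goal) < distance(prev, goal) else -1.0


def q_update(q, state, a, reward, nxt, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[(nxt, b)] for b in range(len(ACTIONS)))
    q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])


def train(size=5, goal=(4, 4), agents=2, episodes=300, epsilon=0.2, seed=0):
    """Train several agents that all read and write one shared Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)  # shared knowledge: (position, action index) -> value
    for _ in range(episodes):
        positions = [(rng.randrange(size), rng.randrange(size))
                     for _ in range(agents)]
        for _ in range(4 * size):  # step budget per episode
            for i, pos in enumerate(positions):
                if pos == goal:
                    continue  # this agent has already arrived
                if rng.random() < epsilon:  # explore
                    a = rng.randrange(len(ACTIONS))
                else:  # exploit the shared table
                    a = max(range(len(ACTIONS)), key=lambda b: q[(pos, b)])
                nxt = step(pos, ACTIONS[a], size)
                q_update(q, pos, a, shaped_reward(pos, nxt, goal), nxt)
                positions[i] = nxt
    return q


def greedy_path(q, start, goal, size, max_steps=50):
    """Follow the highest-valued action from start, up to max_steps moves."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        pos = path[-1]
        a = max(range(len(ACTIONS)), key=lambda b: q[(pos, b)])
        path.append(step(pos, ACTIONS[a], size))
    return path


if __name__ == "__main__":
    table = train()
    print(greedy_path(table, (0, 0), (4, 4), 5))
```

Because every agent updates the same table, experience gathered by one agent is immediately available to the others, which is one simple way to model the shared learning results mentioned above. The sketch makes no convergence guarantee; on an open grid a greedy rollout after training would be expected to approach the target in close to the Manhattan distance.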