| The Internet of Vehicles (IoV) is a significant research direction for solving future traffic problems and realizing autonomous driving. Cellular Vehicle-to-Everything (C-V2X) technology supports communication among vehicles, infrastructure, pedestrians, and networks with the help of existing cellular facilities, and can meet service requirements for high transmission rate, wide coverage, and low latency. However, traditional resource allocation schemes are no longer suitable for the dynamic IoV, and more efficient resource allocation algorithms are needed. In this paper, reinforcement learning is applied to the IoV, and a reinforcement-learning-based resource allocation algorithm is designed to help vehicles select resources more intelligently. First, this paper introduces the scenario classification and key technologies of C-V2X, the wireless resources used for V2X communication, the state of research on allocation algorithms, and the C-V2X system-level simulation platform. Second, a resource allocation algorithm based on deep reinforcement learning is designed for unicast communication. Resource allocation is modeled as a reinforcement learning problem: the agent takes the perceived link channel information and interference as the state, selects the wireless channel and transmit power as the action, and uses the system capacity as the evaluation index. The Double Deep Q-Network (DDQN) algorithm is used; it is combined with deep learning to build a neural network that learns the mapping from state to optimal action, so that the agent can choose wireless resources independently according to the environment state. Finally, this paper studies resource allocation in broadcast communication scenarios. Unlike unicast communication, V2X broadcasting requires higher transmission reliability. Therefore, according to the characteristics of broadcast communication, the algorithm assigns different weight factors to the environmental information of multiple links and takes the joint optimization of V2I system capacity and V2V broadcast reliability as the reward. The agent continually trains the neural network by interacting with the environment, so that it can make its own decisions and choose the optimal channel and power resources. The simulation results demonstrate that, compared with the baseline and sensing-based schemes, the proposed algorithm significantly improves the capacity of the V2V and V2I systems in unicast communication. The proposed algorithm can also jointly optimize V2I system capacity and V2V communication reliability in broadcast communication. Using the trained neural network, the vehicle can choose resources autonomously, which saves decision time. The algorithm adapts to a variety of scenarios and environmental parameters and has performance advantages. |
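
The abstract does not state the DDQN update rule, so the following is only a minimal sketch of the standard Double DQN target, in which the online network selects the next action and the target network evaluates it. The function names, the flat action encoding (channel index times number of power levels plus power index), the discount factor, and the toy Q-values are all illustrative assumptions, not details taken from the paper.

```python
def decode_action(action_index, n_power_levels):
    """Map a flat action index to a (channel, power_level) pair.

    Assumed encoding (hypothetical): index = channel * n_power_levels + power.
    """
    return divmod(action_index, n_power_levels)


def double_dqn_target(reward, q_online_next, q_target_next, done, gamma=0.99):
    """Standard Double DQN target for a single transition.

    q_online_next: Q-values of the online network for the next state.
    q_target_next: Q-values of the target network for the next state.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network picks the action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network scores it, which reduces overestimation.
    return reward + gamma * q_target_next[best_action]


# Toy example: 2 channels x 2 power levels = 4 joint actions (made-up values).
q_online_next = [0.2, 0.9, 0.1, 0.4]
q_target_next = [0.5, 0.6, 0.8, 0.2]
target = double_dqn_target(1.0, q_online_next, q_target_next, done=False, gamma=0.9)
channel, power = decode_action(1, n_power_levels=2)
```

In this toy case the online network prefers action 1 (channel 0, power level 1), so the target uses the target network's value for that action rather than the target network's own maximum. In the setting the abstract describes, the reward would be the system capacity for unicast, or the weighted combination of V2I capacity and V2V reliability for broadcast.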