In fifth-generation (5G) mobile communication systems, the growing number of resource-constrained machine-type communication devices (MTCDs) poses unique technical challenges to cellular networks. To meet the strict latency and reliability requirements of MTCDs in scenarios such as telemedicine and autonomous driving, more advanced random access technologies are urgently needed. At the same time, because spectrum resources are scarce and the energy available to MTCDs is limited, Beyond 5G (B5G) heterogeneous networks must achieve high spectrum efficiency and energy efficiency. To address insufficient MTC access capability and the resource allocation problem, this thesis draws on reinforcement learning to propose a learning-based access strategy for MTCDs with low delay tolerance and a resource allocation scheme for heterogeneous networks. The work of this thesis is as follows:

(1) To alleviate the preamble collisions caused by limited preamble resources in the random access process, an access scheme based on multi-agent reinforcement learning is proposed. A two-step random access process is introduced, in which the base station can obtain the number of access attempts and the collision probability of the current access slot. Unlike the traditional static backoff mechanism, the scheme first uses the optimal number of accesses and the obtained collision probability to analyze the usage of each preamble. A reinforcement learning framework covering access control and preamble resource allocation is then designed to analyze the current environment before access, make backoff decisions, and select an appropriate preamble for random access.

(2) To address the black-box nature of deep neural networks (DNNs), a heterogeneous network resource allocation scheme based on model-driven reinforcement learning is proposed. First, the spectrum efficiency function is taken as the optimization objective, with power as the constraint. The alternating direction method of multipliers (ADMM) is then used to obtain the optimal solution iteratively, and the iterative process is combined with a DNN. Finally, a model-based reinforcement learning framework is designed to obtain the optimal resource allocation strategy for the current state.

(3) To guarantee the quality of service of downlink users and improve the spectrum efficiency and energy efficiency of heterogeneous networks, a joint spectrum and power allocation algorithm based on multi-agent reinforcement learning is proposed. First, the resource allocation optimization function is formulated with spectrum utilization and energy efficiency as the optimization goals and user quality of service as the constraint. Second, the multi-agent state space, reward, and action space are defined. Next, with only a small communication overhead, users obtain the state space information as one-dimensional state data, which reduces the amount of input data to the network. Users rely on their own channel state information (CSI), rather than global CSI, to obtain spectrum and power allocation strategies. Finally, users find the optimal resource allocation strategy by training a DNN.
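
As a rough illustration of the access number and collision probability mentioned in contribution (1), the following Python sketch assumes that each of n devices independently picks one of K preambles uniformly at random in an access slot. This contention model, the function names, and the value K = 54 are assumptions made here for illustration; they are not taken from the thesis, whose own model may differ.

```python
def collision_probability(n_devices: int, n_preambles: int) -> float:
    """Probability that a given device shares its preamble with at least
    one other device, assuming uniform independent preamble selection
    (an assumed model, not the thesis's)."""
    if n_devices <= 1:
        return 0.0
    # The other n-1 devices must all avoid the preamble this device chose.
    return 1.0 - (1.0 - 1.0 / n_preambles) ** (n_devices - 1)


def expected_successes(n_devices: int, n_preambles: int) -> float:
    """Expected number of devices whose preamble is picked by nobody else."""
    return n_devices * (1.0 - collision_probability(n_devices, n_preambles))


def best_access_count(n_preambles: int, max_devices: int) -> int:
    """Number of simultaneously contending devices (1..max_devices) that
    maximizes the expected number of collision-free accesses."""
    return max(
        range(1, max_devices + 1),
        key=lambda n: expected_successes(n, n_preambles),
    )


if __name__ == "__main__":
    K = 54  # hypothetical number of contention preambles
    for n in (10, 54, 200):
        print(n, collision_probability(n, K), expected_successes(n, K))
    print("best access count:", best_access_count(K, 200))
```

Under this assumed model, the expected number of successful accesses peaks when the number of contending devices is roughly equal to the number of preambles, which is the kind of target that an access-control (backoff) policy could steer toward.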