With the development of the Internet of Things (IoT) and the spread of fifth-generation (5G) mobile networks, the number of massive machine-type communication devices (MTCDs) is growing rapidly. MTCD traffic typically consists of small, infrequent data transmissions. In this mode, the network often cannot serve a large number of small transmissions at the same time: when many MTCDs access the network simultaneously, they occupy substantial network resources, causing resource shortages and congestion. This in turn leads to increased network delay, data loss and retransmission, and even network failures, so appropriate measures must be taken to alleviate the congestion. To address the congestion that arises during random access in machine-type communication (MTC) systems, this thesis proposes reinforcement-learning-based access strategies for different kinds of MTCDs. The main contributions are as follows:

(1) A Dueling Double Deep Q-Network (D3QN) algorithm based on reinforcement learning is designed to relieve congestion during the random access of massive MTCDs. In multi-base-station scenarios, a device can send an access request to any base station in its area. The proposed algorithm builds on the two-step random access procedure, so that the base station can obtain the number of access attempts and the collision probability in the current access slot. The device adjusts its reinforcement-learning reward according to the number of colliding preambles broadcast by the base station, so that each MTC device can find a lightly loaded base station for access and reduce potential preamble collisions. Building on the Deep Q-Network (DQN), D3QN adds the Double and Dueling improvements and samples training data with prioritized experience replay, which makes the algorithm converge faster and more stably.

(2) To handle the congestion that occurs when different kinds of massive MTCDs perform random access in a multi-cell scenario, a queued preamble allocation strategy based on reinforcement learning is designed. In this scheme, devices are grouped and prioritized according to their delay tolerance. In each access slot, the preambles of a virtual preamble pool are assigned to an ordered access queue. Through interaction with the environment, the agent selects the best base station and preamble queue for each device, and devices queue up for access in turn, which reduces random-access collisions and improves the access success rate.

(3) Building on the queued access scheme, the learning model is further trained with a federated learning approach that takes user security and privacy requirements into account. By averaging the neural-network gradients, the neural networks of all agents are optimized synchronously. Each agent can thus learn from the experience of the other agents, improve its local model, and cooperate with the others to complete the task. Trained via federated learning, the proposed method ensures both the security of the access process and priority access for devices with low delay tolerance.
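The two ingredients named in contribution (1) can be sketched in a few lines. This is an illustrative sketch, not the thesis's implementation: the dueling combination computes Q(s, a) = V(s) + A(s, a) − mean(A), and the Double DQN target lets the online network choose the next action while the target network evaluates it. All function names and numeric values below are assumptions for illustration.

```python
# Hedged sketch of the Dueling and Double mechanisms described in (1).
# Plain Python lists stand in for network outputs; in practice these
# would be tensors produced by the online and target Q-networks.

def dueling_q(value, advantages):
    """Dueling combination: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online net picks the next action,
    the target net evaluates it, which reduces Q-value overestimation."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]
```

For example, `dueling_q(1.0, [0.5, -0.5])` yields `[1.5, 0.5]`, and with reward 1.0, discount 0.9, online estimates `[0.2, 0.8]`, and target estimates `[0.4, 0.6]`, the Double DQN target is 1 + 0.9 · 0.6 = 1.54.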
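The queued allocation in contribution (2) can be illustrated with a minimal sketch: devices are ordered by delay tolerance (lower tolerance first), preambles from the virtual pool are handed out in that order, and devices that miss out wait for the next slot. The device representation and pool size here are assumptions; in the thesis the agent additionally chooses the base station and queue via reinforcement learning.

```python
# Illustrative single-slot preamble allocation for the scheme in (2).
# devices: list of (device_id, delay_tolerance) pairs.
# num_preambles: size of the virtual preamble pool for this slot.

def allocate_preambles(devices, num_preambles):
    """Order devices by delay tolerance (ascending = highest priority
    first), assign preamble indices in that order, and defer the rest."""
    queue = sorted(devices, key=lambda d: d[1])
    assignments = {dev_id: p for p, (dev_id, _) in enumerate(queue[:num_preambles])}
    deferred = [dev_id for dev_id, _ in queue[num_preambles:]]
    return assignments, deferred
```

Because each admitted device receives a distinct preamble index, devices granted access in the same slot cannot collide; contention is deferred rather than resolved by random retries.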
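The gradient averaging in contribution (3) follows the familiar federated-averaging pattern: each agent reports a local gradient, the server averages them element-wise, and every agent applies the same averaged step, so all local models stay synchronized without sharing raw data. The following is a minimal sketch under those assumptions, with flat lists standing in for network parameters.

```python
# Minimal sketch of synchronized gradient averaging for the federated
# training described in (3). Lists of floats stand in for flattened
# neural-network gradients and weights.

def average_gradients(gradients):
    """Element-wise mean over the per-agent gradient vectors."""
    n = len(gradients)
    return [sum(g[i] for g in gradients) / n for i in range(len(gradients[0]))]

def apply_step(weights, avg_grad, lr=0.1):
    """One gradient-descent step with the shared averaged gradient;
    applying it at every agent keeps all local models identical."""
    return [w - lr * g for w, g in zip(weights, avg_grad)]
```

Since only gradients (not device data or access histories) leave each agent, this is how the scheme reconciles cooperative learning with the security and privacy requirements mentioned above.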