| In recent years, 5G network scenarios have become increasingly complex, and the resource requirements of different network use cases change constantly in both time and space, which poses considerable challenges for network resource management. How to use limited resources effectively, so as to increase revenue while meeting users' basic needs, has become a key issue for the stakeholders of network resources. The emergence of network function virtualization and network slicing simplifies resource management and has given rise to resource management at the slice level. Centralized resource management based on deep reinforcement learning (DRL) can allocate slice resources dynamically. In multi-operator scenarios, however, operators protect their data because of factors such as business competition, so data cannot be shared directly among operators to support the training and optimization of the DRL model. To solve this problem and extend the approach from centralized to distributed, we design a distributed slice resource allocation scheme. Using federated learning, we build a distributed collaborative sharing framework and design a model-adaptive algorithm combined with knowledge distillation to reduce the impact of heterogeneous scenarios on traditional federated learning. In simulation, the federated reinforcement learning results show that federated learning helps the model converge faster in homogeneous scenarios, and the federated distillation results show that the federated distillation algorithm alleviates the impact of data heterogeneity on the convergence and performance of the traditional federated averaging model. In addition, this thesis further studies the slice resource allocation process between mobile virtual network operators (MVNOs) and
users, and implements resource allocation between MVNOs and users in the form of transactions. Specifically, this thesis uses a game-theoretic approach as the basic framework to construct a transaction process based on a market agent: multiple MVNOs first publish their resource prices, users then give purchase strategies based on the price of each MVNO, and the market agent determines the winner of the current round of transactions. Then, a reinforcement-learning-based dynamic pricing strategy for MVNOs is proposed to adapt to continuous changes in the market. Finally, in comparative experiments on MVNO bidding algorithms, the proposed bidding strategy based on deep deterministic policy gradient (DDPG) outperforms random bidding and other classical reinforcement learning algorithms in terms of cumulative profit and order-grabbing ability. |
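
The "traditional federated averaging model" mentioned in the abstract can be illustrated with a minimal sketch of the aggregation step. The abstract does not give the aggregation rule, so this assumes the standard scheme in which a server averages client parameters weighted by each client's local sample count. The function name `federated_average`, the dictionary layout, and the MVNO variable names are hypothetical and chosen only for illustration. Only model parameters are exchanged, never the operators' raw data.

```python
from typing import Dict, List

# Hypothetical representation: each client (e.g. one operator) holds a dict
# mapping a parameter name to a flat list of floats.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Sample-size-weighted average of client parameters (FedAvg-style).

    Assumption: all clients share the same parameter names and shapes,
    i.e. the homogeneous setting described in the abstract.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name, reference in client_params[0].items():
        averaged[name] = [
            # Weight every client's value by its share of the total samples.
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical operators share only their local model parameters.
mvno_a = {"w": [1.0, 2.0]}
mvno_b = {"w": [3.0, 6.0]}
global_model = federated_average([mvno_a, mvno_b], client_sizes=[1, 3])
print(global_model)  # {'w': [2.5, 5.0]}
```

Under data heterogeneity, such a plain weighted average can be pulled toward clients whose local data differ strongly from the rest. This is the effect that the federated distillation algorithm in the abstract is reported to alleviate.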