| Driven by the rapid growth in demand for mobile applications and IoT services such as autonomous driving, smart cities, and virtual reality, wireless application traffic has increased significantly. However, present-day cellular networks cannot provide the high-speed data rates and low service latency that many of these applications need, which hampers the growth of new services. To address these issues, new technologies such as network slicing and heterogeneous networks have emerged. End-to-end network slicing spans the radio access network, core network, and transport network, satisfying the customized needs of vertical industries. Network slicing also introduces many communication workflow design issues, especially when users move, because the changes in the communication environment can be extremely complex. Failure to maintain seamless handoff can greatly degrade the user’s service experience and system resource utilization. Traditional handoff mechanisms are no longer sufficient to meet the requirements of network slicing in radio access networks. Furthermore, end-to-end network slicing requires network functions customized to business needs, involving communication, computing, and storage resources across wired and wireless networks. Optimizing the utilization of these diverse network resources to meet user demands has become a major challenge in implementing network slicing. Reinforcement learning interacts with the environment in real time and adapts rapidly to network heterogeneity and dynamics, and it plays an increasingly important role in decision-making and resource optimization. The main contributions of this article are as follows: (1) For user association and slicing resource allocation in heterogeneous networks, this research develops the HetNet-MAPPO algorithm based on multi-agent reinforcement learning. The algorithm treats users as agents that flexibly choose base stations and network slices based on the real-time base station and network slice resources and the signal-to-interference-plus-noise ratio of each base station, balancing user service benefits against handoff costs while achieving optimal network efficiency. (2) To ensure end-to-end communication quality in network slicing, this research proposes a cross-domain orchestration framework based on SLA-guaranteed delay indicators, together with two latency equalization policies that orchestrators apply to divide the delay budgets of lower domains. By combining DQN and pointer networks, and through in-depth analysis of subslice creation in the radio access network and core network, the DDQN-PER and PN-SFC algorithms are designed. Simulation results show that the latency equalization policies can effectively guarantee users’ QoS and improve network capacity. |