| As wireless communication networks develop, spectrum resources are becoming scarce, terminal users demand high data rates and low latency, and heterogeneous networks suffer from overloaded macro base stations and low spectral efficiency. Device-to-device (D2D) technology is considered a promising next-generation communication technology for addressing these problems: devices communicate directly without forwarding through the base station, which relieves the base station load, improves spectral efficiency, and reduces communication delay. Non-orthogonal multiple access (NOMA) is likewise considered a key technology for easing the shortage of communication resources and improving the quality of service for massive numbers of terminals in sixth-generation (6G) networks. Conventional D2D communication reuses time-frequency resources, which introduces co-channel interference. NOMA adds the power domain as a new multiplexing dimension and improves system performance, but it also causes interference to cellular users and among the D2D users themselves. Resource allocation strategies, which have a decisive impact on system performance, must therefore be designed carefully. The goal of this research is to jointly optimize sub-channel and power allocation in NOMA-enhanced D2D communication networks to improve system throughput. First, NOMA is introduced into the D2D communication network and a NOMA-D2D system model is established. To handle the uneven distribution of users and inter-user interference in a single cell, superposition coding is used at the transmitter and successive interference cancellation (SIC) is performed at the receiver; taking user fairness and the constraints on each system parameter into account, joint sub-channel and power allocation is formulated as a resource allocation problem that maximizes the system sum rate. Second, because the resulting mixed-integer nonlinear programming (MINLP) problem is difficult to solve directly, the problem is simplified by decomposition and an optimal branch-and-bound allocation method based on a heuristic algorithm is proposed. A plain heuristic search explores candidates exhaustively without a strategy, converges slowly, and performs poorly. The proposed search therefore starts from the least-cost node, associates branch nodes with the constraints and nonlinear subproblems, and prunes infeasible and non-optimal nodes as it proceeds. This further narrows the search scope, and backtracking global checks keep the search from becoming trapped in a local optimum, so that the solution maximizing the system sum rate is found. Simulation experiments show that the proposed resource allocation algorithm for the NOMA-D2D system has significant advantages in improving the system sum rate. Finally, to further speed up convergence of the optimal branch-and-bound allocation algorithm, imitation learning is added to learn the optimal pruning strategy. So that the algorithm generalizes well and improves system throughput, the data structure and system parameters are taken into account when setting the features and labels for supervised learning, and data aggregation, which saves space and improves generalization, is used together with an online learning method to achieve optimal joint sub-channel and power allocation. Experimental results show that the proposed imitation-learning-based accelerated optimal branch-and-bound resource allocation algorithm generalizes well and maximizes the system sum rate. |
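
The best-first search with pruning summarized above can be illustrated with a minimal sketch. It is not the algorithm proposed in this work. It assumes a toy setting in which each D2D pair takes exactly one exclusive sub-channel and the achievable rate of every pair on every sub-channel is already known from a hypothetical table `rates`; power allocation, NOMA interference, and the imitation-learned pruning policy are all omitted. The names `upper_bound` and `branch_and_bound` are illustrative.

```python
import heapq


def upper_bound(rates, assigned, value):
    """Optimistic bound: every unassigned pair takes its best still-free channel."""
    used = set(assigned)
    bound = value
    for pair in range(len(assigned), len(rates)):
        free = [r for ch, r in enumerate(rates[pair]) if ch not in used]
        if not free:
            return float("-inf")  # no channel left: this node is infeasible
        bound += max(free)
    return bound


def branch_and_bound(rates):
    """Best-first branch and bound over sub-channel assignments (toy model)."""
    n_pairs, n_channels = len(rates), len(rates[0])
    best_value, best_assign = float("-inf"), None
    # Heap entries: (negated bound, value so far, partial assignment).
    heap = [(-upper_bound(rates, (), 0.0), 0.0, ())]
    while heap:
        neg_bound, value, assigned = heapq.heappop(heap)
        if -neg_bound <= best_value:
            continue  # prune: this node cannot beat the incumbent
        if len(assigned) == n_pairs:
            best_value, best_assign = value, assigned  # new incumbent
            continue
        pair = len(assigned)  # next pair to place
        for ch in range(n_channels):
            if ch in assigned:
                continue  # channel already taken by an earlier pair
            child = assigned + (ch,)
            child_value = value + rates[pair][ch]
            bound = upper_bound(rates, child, child_value)
            if bound > best_value:  # keep only nodes that may still improve
                heapq.heappush(heap, (-bound, child_value, child))
    return best_value, best_assign


# Hypothetical rate table: rows are D2D pairs, columns are sub-channels.
rates = [[3.0, 1.0, 2.0],
         [2.5, 2.0, 0.5],
         [1.0, 0.2, 1.8]]
best_value, best_assign = branch_and_bound(rates)
```

Because the bound never underestimates what a partial assignment can still achieve, discarding nodes whose bound does not exceed the incumbent cannot remove the optimum. A learned pruning policy, as described in the abstract, would replace or supplement that test with a classifier's decision.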